- Sandhill Report
The Everything Stack After Agents
Hypothesizing. To-Do Lists. Agents. Applications. (Bonus: Visibility)
Excited to take time to write more thoughtfully here. Comments/Ideas/Critiques appreciated if you have them. Still running live events for interesting topics/guests: Tune in tomorrow at 2pm EST for the Venture Research Roundtable
As AI infrastructure improves, AI Agents are set to change the software application ecosystem dramatically. In the future, Agents will use applications more than humans.
An AI agent is software that autonomously performs tasks by understanding goals, making decisions, and taking actions. Unlike traditional software that follows fixed rules, agents use AI (typically language models) to interpret instructions, adapt to new situations, and execute complex workflows.
Agents combine three key capabilities:
Understanding - interpreting natural language instructions and context
Decision-making - determining the right steps to achieve a goal
Execution - actually performing actions across different tools and systems
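The three capabilities above form a loop. This toy Python sketch stands in for the real thing, where a language model would sit behind each step; every name here is invented for illustration, not any framework's API:

```python
# A toy sketch of the three capabilities: understanding (parse the goal),
# decision-making (pick steps), execution (call a tool). Real agents put
# an LLM behind each part; here simple stubs keep the loop runnable.

def run_agent(goal, tools):
    """Turn a goal string into an ordered list of tool outputs."""
    # Understanding: a real agent would use an LLM; here we match keywords.
    steps = [word for word in goal.split() if word in tools]
    results = []
    for step in steps:                 # Decision-making: fixed order in this sketch
        results.append(tools[step]())  # Execution: invoke the chosen tool
    return results

# Hypothetical tools an agent might call.
tools = {
    "fetch": lambda: "data fetched",
    "summarize": lambda: "summary written",
}

print(run_agent("fetch then summarize the report", tools))
# ['data fetched', 'summary written']
```

The point of the sketch is the shape, not the stubs: the agent decides which tools to invoke from a natural-language goal rather than following a fixed call sequence.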
To date, applications have been built around humans, who want pretty buttons and are easily overloaded with information. Agents don't care what the buttons look like and can handle arbitrarily dense information. All applications live on a spectrum from low information density to high information density:
Apple’s minimalist “low information density” consumer products
Bloomberg’s “high information density” focused enterprise products
The rise of Agents will push application design to the extremes.
Human-oriented applications will be extremely simple. Agent-oriented applications will be extremely dense. The further Agents progress, the less middle ground there will be.
Agents will push most applications to the extreme of information density. APIs are that extreme.
Applications that humans interact with will push further towards simplicity and minimalism. Simple chat interfaces and search bars are that extreme.
Agents' ability to navigate multiple applications to perform a task will mean fewer applications that humans interact with overall.
Consolidation of applications for humans.
Fragmentation of applications for agents.
Small number of apps that only surface the most relevant information dominate.
Agents use a wide variety of apps that they interact with via info-dense APIs.
The End State has 4 distinct areas: Hypothesizing, To-do Lists, Agents, and Applications.
Hypothesizing is the very human act of coming up with net-new approaches to something, and it is under-appreciated as a mysterious, almost magical event. Here are some passages that articulate this phenomenon:
The real purpose of scientific method is to make sure Nature hasn’t misled you into thinking you know something you don’t actually know. Hypothesis—testing of data—conclusion. But it rests on a faith that the hypothesis is worth testing. That comes from somewhere else. It’s that Quality, the leading edge of reality, that makes you create the hypothesis in the first place.
The supreme task of the physicist is to arrive at those universal elementary laws from which the cosmos can be built up by pure deduction. There is no logical path to these laws; only intuition, resting on sympathetic understanding of experience, can reach them.
Faith. Leading edge of reality. Pure deduction. Only intuition. No logical path. Sympathetic understanding.
This is human stuff. This will stay human stuff.
You can argue for brute-force hypothesis testing. You can argue "what if AGI blank" angles.
Even in an all-knowing AGI future, humans will want to decide on the direction of things. Deciding which hypotheses to test is at the heart of that. Even if AGI could decide, the most we’d probably want their involvement to amount to is to give them a seat at the hypothesis creation table.
You have friends that are smarter than you. Do you follow their advice to the letter or simply take it as one of many inputs to your decision making?
After hypothesizing, humans need to communicate what they want to try with Agents to get it done. Communicating what needs to get done isn’t actually very easy. Being specific. Outlining the right process. Cutting unnecessary requirements. Reevaluating goals based on new data from first actions.
Communication with Agents boils down to a list of action items otherwise known as a To-Do List.
To-do lists are a broader category of software than you might initially think. There's your Apple Notes list with the little checkbox next to it. There are also billions of dollars' worth of software market cap built on helping people create lists of action items. Any software that organizes actions is a form of to-do list. Some are more complicated than others, but it's all the same base category. This is a place where AI can help even more than with Hypothesizing, but it's a collaboration with AI, not an automated process, for the same reasons I stated in the Hypothesizing section.
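One way to picture a to-do list as the shared surface between humans and Agents is a simple structure like this. The fields and flow are a hypothetical sketch, not any product's schema:

```python
# Sketch of a to-do list as the handoff point between humans and agents:
# each item is an action with an assignee and a status an agent can claim
# and update. All field names here are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class TodoItem:
    action: str                  # what needs to get done, stated specifically
    assignee: str = "agent"      # "human" or "agent"
    status: str = "pending"      # pending -> done
    notes: list = field(default_factory=list)  # results reported back

todo = [
    TodoItem("Pull last quarter's churn numbers"),
    TodoItem("Draft three retention hypotheses", assignee="human"),
]

# An agent claims and completes its items; the human reviews the notes,
# adjusts the hypothesis, and adds new items.
for item in todo:
    if item.assignee == "agent" and item.status == "pending":
        item.status = "done"
        item.notes.append("output attached")
```

The interesting design question for this layer is exactly what the essay describes: the structure coordinates agent labor while keeping the hypothesis-driven items in human hands.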
Examples from Claude of valuable To-Do Lists
Once action items are created, in a future where Agents continue to progress, that’s the end of the human part. Agents will perform tasks using a variety of applications and then return the output to humans who will then review, decide on new hypotheses, new to-dos, etc. This already happens today in human organizations:
I have 60 direct reports. They're on staff because they're world class at what they do - they do it much better than I do. I have no trouble interacting with them. I have no trouble prompt engineering them. I have no trouble programming them. I think that's what people are going to learn - they're all going to be CEOs.
Agents will live between To-Do Lists and all other Applications where human labor used to be:
This is the world we live in today
Agents are Labor. Humans are Managers.
This is the world after Agents
There will be a lot of roundtripping between the different zones the same way there is iteration within any organization. Managers decide tasks based on hypothesis. Give tasks to labor. Labor works and reports back. Managers adjust hypothesis. Labor works and reports back. Managers adjust hypothesis. Labor delivers final output to managers. Managers decide new tasks. Repeat.
Again, this is not dissimilar from how organizations operate today as they set new goals and refine processes as new data emerges
There are new opportunities in each area. Some seem more attractive than others.
Hypothesizing
LLMs like Claude, which I used to help me write this post, are good assistants here. This seems to be a continuation of the famous "bicycle for the mind" quote that outlined Steve Jobs's vision for personal computers generally. As Jobs would agree, the computer (even with Claude) does not replace the brain; it merely accelerates it. Despite years of software development in the area, most productive hypothesizing is still driven by very human things like in-person interactions, long walks, and the right amount of caffeine.
Due to the massive amount of capital required to build these LLMs and their generalist nature, this area is mostly spoken for and has clear leaders/winners. Not much is changing here in terms of overall process. The same people decide what to do - they just decide with help from a combination of both humans and AI. Maybe you need fewer people, but you still need people.
To-Do Lists
As mentioned previously, this area has been built out aggressively in the old paradigm of organizing human labor. With Agents, this area seems primed for new solutions that combine human inputs (hypotheses) with the coordination of agents rather than humans. Incumbents built around a completely different focus (human labor) do not seem to have a considerable advantage outside of resources and distribution. The core technology needs to be reimagined.
To-do lists will likely mirror search engines in consumer markets and enterprise software in business markets. For consumers, just as Google dominates search, a single platform will likely emerge as the primary interface for delegating tasks to AI. In business markets, we'll see both broad platforms that coordinate across functions (like Salesforce) and specialized vertical solutions (like Veeva for life sciences). While vertical solutions will thrive in specific industries, the platforms that can coordinate across functions will capture the most value - following the pattern we saw with SaaS, where horizontal players like Salesforce and Atlassian grew larger than vertical specialists.
This area offers classic venture dynamics. Upfront investment to dominate a category. Most fail. Winners enjoy stickiness and decades of high-margin profitability.
One Consumer Agent Coordinator to rule them all?
Agents
While many startups are building AI agents, it's unlikely we'll see a single dominant player in this space. The value will accrue to the layer that coordinates these agents (the To-Do List layer) rather than to individual agents themselves, similar to how managers capture more value than individual workers. The best agents will be specialized - like skilled workers - but need coordination from above to be truly effective.
The distinction between agents and their coordination layer (To-Do Lists) may initially be blurred within industry-specific tools. Think of how Salesforce started with sales but expanded to coordinate across departments. Similarly, while agents might start specialized within verticals like finance or development, they'll need the ability to work across functions - just like how human teams collaborate across departments without always going through management.
Vertical agents with custom coordination tooling built on top can stretch up into the To-Do List zone, but are still best coordinated from a more generalized To-Do List that works across functions.
To further explore why Agents retain little long-term value, we can look harder at the idea of Agents replacing Humans as Labor.
High-skill human labor maintains value through inherent scarcity - there are only so many expert developers or specialized surgeons. But Agents, regardless of capability, are infinitely reproducible digital assets.
Agent creators may attempt artificial scarcity through exclusive API access or specialized training data, similar to OpenAI's GPT-4 vs GPT-3.5 pricing model. However, this approach mirrors cloud computing's evolution, where early providers tried maintaining high margins through differentiation. Like compute costs' inevitable 'race to zero', agent capabilities will become commoditized as core technologies standardize and proliferate.
Agents run on language models, which run on compute.
In a world of infinite intelligence, any single unit of it becomes worthless.
This doesn’t mean that Agents won’t be extremely valuable within organizations. That value will simply not translate outside the organization. To make another human analogy:
An ML engineer who builds a recommendation algorithm that generates $100M in revenue for Netflix isn't worth $100M to Disney. Their value comes from applying skills within Netflix's specific data, infrastructure and business context. Similarly, agents trained on an organization's workflows create value that's tied to that context and can't be easily transferred or monetized externally.
Applications
I have 8 separate applications running on my desktop right now. One of those applications is Chrome which is visibly running 9 different web applications. In the future, the number of applications a human interacts with directly will decrease and be pushed under the Agent layer. In the extreme, a human may only interact directly with two applications:
Hypothesizing App
To-Do List App
All other applications and the actions taken with them will live below the agent layer. This means that every application that lives below that agent layer will either be used by “computer control” agents or be reimagined to serve its new primary user: Agents.
Application competition today is shaped by human constraints. Apps compete on UX, habit formation, and system-of-record lock-in. But agents don't have these constraints - they can handle unlimited information density and aren't subject to muscle memory or workflow habits.
When agents become primary users, applications compete purely on effectiveness. This creates a fundamentally different competitive landscape. Instead of sticky platforms, we'll see a fragmented market of specialized tools optimized for specific tasks. Think of the difference between the applications you use in your day-to-day (built for human information processing) and an API (built for machine consumption).
This fragmentation happens because:
Agent-facing apps are easier to replicate (no complex UI needed)
Switching costs disappear (agents can instantly adapt to new interfaces)
Lock-in through stored data weakens (agents can easily port data between systems)
The result looks more like today's API ecosystem than the consumer app market - many players competing on pure utility rather than user experience.
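To make the human-facing/agent-facing contrast concrete, here is a toy sketch of one capability packaged both ways. Every name, field, and record here is invented for illustration:

```python
# The same capability, packaged for each user. The human-facing view trims
# and formats for readability; the agent-facing view returns every field at
# maximum density, like an API. All names and data are made up.

INVENTORY = [
    {"sku": "A1", "name": "Widget", "stock": 42, "cost": 1.25, "vendor": "Acme"},
    {"sku": "B2", "name": "Gadget", "stock": 0, "cost": 9.90, "vendor": "Bolt"},
]

def human_view(page_size=1):
    """Low density: a short, readable page of only the most relevant fields."""
    return [f"{row['name']}: {row['stock']} in stock" for row in INVENTORY[:page_size]]

def agent_view():
    """High density: full records, machine-readable, no pagination."""
    return INVENTORY

print(human_view())
# ['Widget: 42 in stock']
```

Everything that makes `human_view` pleasant (curation, formatting, pagination) is overhead to an agent, which is why competition on the agent side collapses to pure utility.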
For incumbents, this creates a classic innovator's dilemma. Their existing applications are optimized for human users - clean interfaces, manageable information density, familiar workflows. But agent-facing versions of these same tools would look radically different: pure functionality, maximum data throughput, API-first design.
Companies like Stripe show what adaptation looks like. While maintaining their human interface, they've built parallel infrastructure specifically for AI consumption. Others will need to follow, effectively maintaining two products: their human-facing application and an agent-oriented version.
The risk isn't just competition - it's irrelevance. As agents handle more tasks, applications that don't adapt will see declining usage regardless of their current market position. Their moats of user habit, interface design, and workflow integration mean little to an agent. Just as mobile killed companies that failed to adapt from web, the shift to agent-centric applications will likely determine the next generation of software winners and losers.
If you’re building something new, this feels like a tough place to hang out.
Bonus: Visibility Through The Stack
As I get to the end of this and think about the world I described above, it's clear that there's a large category missing: visibility into agent behavior.
Just as managers need to understand what their teams are doing, humans will need visibility into agent actions and decision-making. The stack I've outlined will include some native visibility features - To-Do Lists will track workflow progress, agent platforms will have basic monitoring, and applications will show agent usage patterns.
However, like today's tech stack, dedicated visibility tools will emerge. Drawing parallels to existing tools helps illustrate what's needed:
Cross-stack observability platforms (like Datadog for infrastructure) will track agent resource usage, model calls, and coordination
Security and compliance tools (like Sentry for errors) will monitor agent decisions and detect anomalies
Performance optimization tools (like Amplitude for analytics) will measure agent KPIs and identify improvement opportunities
The depth needed for agent visibility exceeds current tools that were built around humans. We're not just tracking what happened, but how and why agents made specific decisions. This creates opportunities for both integrated features within our stack layers and standalone products that specialize in agent observability.
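A hypothetical sketch of the kind of decision trace such tools would consume, capturing the "how and why" alongside the "what" (the event schema here is invented, not any observability product's format):

```python
# Sketch of an agent decision trace: each event records not just the action
# taken but the options considered and the stated reason, which is the extra
# depth agent visibility needs. The schema is invented for illustration.
import json
import time

trace = []

def record(step, options, chosen, reason):
    """Append one decision event to the trace."""
    trace.append({
        "ts": time.time(),
        "step": step,
        "options_considered": options,
        "chosen": chosen,
        "reason": reason,
    })

record("pick_data_source", ["warehouse", "live_api"], "warehouse",
       "freshness under 24h not required")
record("pick_model", ["small", "large"], "small",
       "task is extraction, not reasoning")

# A visibility tool can now answer "how and why", not just "what happened".
print(json.dumps(trace, indent=2))
```

A cross-stack tool would aggregate traces like this across agents, flag anomalous reasons, and roll decisions up into the KPIs mentioned above.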
As agent adoption grows, visibility tools will become critical for maintaining control and trust in agent-driven workflows. This is a key point for Human-AI interaction and we will see classic venture dynamics similar to the To-Do List area.
Thank you to Manil and Pranav from Foundry (YC F24) for catalyzing some of my thoughts on this visibility section.
In Conclusion
First of all - this has been a great exercise for me as I think through what the future looks like with current tech trends. I’d recommend it. I don’t hold these views as strongly as they’re written above, but it’s a good starting point. I do have stronger opinions now than I did when I started writing this post.
I’ll be interested to see how quickly we move in the direction I outlined. I’d be surprised if it happened in less than a decade considering the habit changes that are required. Even if the technology were available today, it takes a long time for humans to switch to anything new. Not startup/tech-crazed humans (like you), who are a very small subset. Humans generally. Maybe AI makes that faster too, but probably not that much faster.
This next decade seems like a great time to make money as an early adopter of AI that delivers products/services still priced assuming human labor inputs. It will just take a while for buyers to adjust their perceived value of deliverables. Information-driven projects that used to take 10 hours can now take 2 hours, and you can charge the same.
On the venture investing side - I mentioned it above, but I see the To-Do List layer as the most interesting place to hang out. I think there’s room for a new Google-type outcome for someone who wins the “Agent Action Bar” for consumers. I think there’s an AI-first Salesforce outcome. I think Vertical Agents can be the new Vertical SaaS as long as they push hard into the To-Do List layer and don’t get caught building Agents. I also think that Agent Visibility will be a big new thing.
A key theme across all those categories is that each is an important Human-AI interaction point. The more “pure human” it gets, or the more “pure AI” it gets, the less room I see for monopoly margins.
Will have to check back in a decade.
Also, this got long and I appreciate you reading this far :)
Talk soon,
Adam