Open source agents need memory, protocols, and security to work

Sequoia's Konstantin Buhler identifies three blockers holding agents back: agents' memory of themselves, agent-to-agent communication protocols akin to TCP/IP, and security at a ratio of 1,000 security agents per cognitive agent.

Agents must remember themselves, not just users, to maintain consistency

Konstantin Buhler from Sequoia identifies agent memory as the first unsolved problem preventing real agents from working—not just memory of users but memory of themselves, because

"if every time you met me I was a different personality, it would be pretty weird."

The challenge extends beyond simple key-value caching or session persistence to fundamental questions of identity: agents need consistency the way humans have personality, which requires both fine-tuning for who they are and real-time memory updates, with fast working memory playing the role of RAM and durable storage the role of a hard drive (a minimal sketch of that split appears below). Nvidia's Kari Briski adds that enterprises will need to fine-tune or build their own AI to create agents with persistent identities through reinforcement learning, making them "become who they are rather than having the same memory as trained in."

The second blocker is communication protocols: agents need their own TCP/IP moment, which "wasn't the finish line but the starting gun" for the internet, and that means a shared technical and business language for agent-to-agent communication.

The third is security, where Jensen Huang envisions inverting the physical world's ratios: instead of one security guard protecting thousands, we'll have "1000 security agents around one cognitive intelligence agent," because digital space has different economics than physical space.
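To make the RAM-versus-hard-drive split concrete, here is a minimal Python sketch of an agent that keeps a slow-changing identity on disk and fast working memory in process. Every class, field, and file name is hypothetical and only illustrates the idea, not any framework mentioned in the conversation.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class AgentIdentity:
    """Slow-changing self-knowledge the agent carries across sessions (the 'hard drive')."""
    name: str
    persona: str                                  # behavioral style, e.g. set via fine-tuning
    long_term_facts: dict = field(default_factory=dict)


class SelfRememberingAgent:
    """Persistent identity on disk; session details live in fast working memory (the 'RAM')."""

    def __init__(self, identity_path: Path):
        self.identity_path = identity_path
        self.working_memory: dict = {}            # per-session, cheap to read and write
        self.identity = self._load_identity()     # reloaded every session, so the personality stays stable

    def _load_identity(self) -> AgentIdentity:
        if self.identity_path.exists():
            return AgentIdentity(**json.loads(self.identity_path.read_text()))
        return AgentIdentity(name="agent-0", persona="concise, cautious assistant")

    def observe(self, key: str, value: str) -> None:
        """Record a fact during the session without touching persistent identity."""
        self.working_memory[key] = value

    def consolidate(self) -> None:
        """Promote durable facts about the agent itself from working memory into the identity store."""
        self.identity.long_term_facts.update(self.working_memory)
        self.identity_path.write_text(json.dumps(self.identity.__dict__))
        self.working_memory.clear()


# Usage: the same personality shows up every session because identity is reloaded, not re-derived.
agent = SelfRememberingAgent(Path("agent_identity.json"))
agent.observe("preferred_tone", "formal")
agent.consolidate()
```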

Verification speed determines which AI use cases succeed in production

The reason coding agents dominate while 95% of pilots fail comes down to verification speed. Code either compiles or it doesn't, creating instant feedback loops that enable rapid iteration despite AI's stochastic nature, while domains like surgery carry "extremely high" verification requirements that slow progress. Consider the verification hierarchy Buhler and Briski outline, with a sketch of the compile-and-retry loop after the list:

  • Code and math: Binary yes/no verification enables fastest progress

  • Medical scribing: Expert verification required but manageable in workflow

  • Specialized domains: Expensive experts needed, creating bottlenecks

  • Surgery decisions: Consequences too high for current AI verification
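The binary verifier is what makes the loop cheap. Below is a minimal sketch, assuming a hypothetical generate callable standing in for a model call, with Python's own compile step as the yes/no check:

```python
from typing import Callable, Optional


def compiles(source: str) -> bool:
    """Binary verifier: does the candidate Python code even parse? Instant yes/no feedback."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def iterate(generate: Callable[[str, Optional[str]], str],
            task: str,
            max_attempts: int = 5) -> Optional[str]:
    """Stochastic generator plus cheap binary verifier: the fast feedback loop coding agents enjoy."""
    feedback = None
    for _ in range(max_attempts):
        candidate = generate(task, feedback)         # e.g. an LLM call (hypothetical placeholder)
        if compiles(candidate):                      # verification is instant and unambiguous
            return candidate
        feedback = "last attempt failed to compile"  # the failure signal feeds the next attempt
    return None  # domains with slow or expert-only verification cannot iterate this cheaply
```

Swap compiles() for a unit-test run and the loop still works; swap it for expert surgical review and the iteration speed collapses, which is the hierarchy above expressed in code.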

Briski says enterprises are succeeding with "targeted use cases or specialization" rather than general AI, noting they want "the most accurate model on the tiniest cost of ownership" with their data staying private. The emerging solution is synthetic data generation plus specialized "gyms" (reinforcement learning environments) for niche domains, where a small amount of high-quality seed data is used to build verifiers without requiring expensive human experts. XPO demonstrated this by having its AI compete on HackerOne's real-world penetration testing leaderboard, becoming the "number one hacker in the world" within months, proving that real-world verification matters more than academic benchmarks.
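To make the "gym" idea tangible, here is a toy sketch of an environment whose reward comes entirely from a programmatic verifier seeded with a few expert-written examples. The reset/step interface loosely mirrors common reinforcement learning environments, and every name and example in it is invented for illustration.

```python
import random


class NicheDomainGym:
    """Toy RL-style environment: a verifier built from a few expert seed examples scores each attempt."""

    def __init__(self, seed_examples: list[tuple[str, set[str]]]):
        # Each seed pairs a task prompt with the key facts an expert says a good answer must mention.
        self.seed_examples = seed_examples
        self.current = None

    def reset(self) -> str:
        """Sample a task; a real gym would also generate synthetic variants from the same seeds."""
        self.current = random.choice(self.seed_examples)
        return self.current[0]

    def step(self, answer: str) -> float:
        """Reward = fraction of required facts the answer covers. No human expert in the loop."""
        _, required = self.current
        hits = sum(1 for fact in required if fact.lower() in answer.lower())
        return hits / len(required)


# Usage with a single seed example; scaling this is where synthetic data generation comes in.
gym = NicheDomainGym([
    ("Flag the main credit risk in this loan summary", {"debt-to-income", "missed payments"}),
])
prompt = gym.reset()
reward = gym.step("High debt-to-income ratio and two missed payments in the last year.")  # -> 1.0
```

The keyword check is obviously a stand-in; the point is that once a verifier exists, the environment can score unlimited synthetic attempts without paying for expert review each time.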

Investment shifts from models to agents as specialization beats generalization

Buhler observes capital moving "from model space to up the stack to the agent space" because "at the end of the day it's all about people," with companies like ROX discovering that keeping humans in the loop increased email response rates 3x compared to full automation. Reflection AI's $2 billion raise with Nvidia participation signals open source's importance: enterprises need to "control the weights" and build their own solutions rather than rely on closed APIs, which makes them competitive with startups.

The future isn't one model ruling all but "systems of models" working together, with specialization as the superpower: "There might be one model to start training, but specialized models end up doing the thing better." Buhler predicts world models as the next frontier, "extremely data intensive" systems processing "the whole space around us streaming in at all times" and forming the base layer for robotics. Will enterprises adopt the startup mindset of "gradient descent algorithms" for rapid iteration, or remain trapped by legacy systems while specialized AI races ahead?