
Andrej Karpathy, co-founder of OpenAI and former head of AI at Tesla, recently sat down at Sequoia Capital's AI Ascent 2026 to discuss how programming has fundamentally changed. His message was clear: we are no longer just vibe coding. We are entering the era of agentic engineering, a serious discipline that demands new skills, new thinking, and a completely different relationship with code.
What makes this conversation so compelling is that Karpathy himself admits he has never felt more behind as a programmer. If someone at his level feels that way, it tells you something important about the pace of change.
Karpathy described a clear inflection point in December 2025. He had been using agentic coding tools for a while. They were helpful but imperfect. Then something changed. The latest models started producing chunks of code that just worked. He kept asking for more. It kept coming out fine. He could not remember the last time he corrected the output.
That realization sent him down a rabbit hole. His side projects folder exploded. He was vibe coding constantly. But the key insight was not just that the tools got better; the entire paradigm shifted. Anyone whose mental model of AI is still the ChatGPT-adjacent chatbot of 2024 needs to look again, because things have changed fundamentally.
This is not a minor upgrade. Karpathy stressed that if you have not revisited these agentic tools since late 2025, you are working with outdated assumptions about what is possible.
Karpathy coined the term vibe coding to describe a style of programming where you trust the AI to write code based on high-level instructions. You describe what you want. The model produces it. You do not manually review every line. You go with the vibe.
This approach raised the floor for everyone. People who had never written a line of code could suddenly build functional applications. Non-technical founders could prototype ideas. Students could ship projects in hours instead of weeks.
But vibe coding has limits. It works great for personal projects, prototypes, and experiments. It does not automatically produce secure, maintainable, production-grade software. That gap is exactly where agentic engineering comes in.
Karpathy draws a clear line between the two. Vibe coding raises the floor. Agentic engineering preserves the quality bar. You are still responsible for your software. You cannot introduce vulnerabilities because you were vibing. You cannot ship bloated, insecure code just because an AI wrote it fast.
Agentic engineering is about coordinating powerful but imperfect agents to go faster without sacrificing quality. These agents are spiky entities. They are stochastic, sometimes unreliable, but extremely capable when directed well.
The people who master this discipline will see speed improvements far beyond the old 10x engineer benchmark. Karpathy believes the ceiling is much higher than 10x for those who invest in their setup, understand the tools deeply, and maintain strong engineering judgment.
Karpathy has long talked about the progression from Software 1.0 (explicit code) to Software 2.0 (learned weights via neural networks). Now he describes Software 3.0, where programming means prompting: the context window is your lever, and the LLM acts as an interpreter for digital information.
Karpathy gave a striking example. When OpenClaw launched, the installation process was not a traditional bash script. Instead, you copy-paste a block of text into your agent. The agent reads your environment, makes intelligent decisions, debugs issues in a loop, and installs everything.
This is fundamentally different from writing a shell script that tries to account for every possible platform. The agent brings its own intelligence. It adapts. It problem-solves. The "program" is just the prompt you give it.
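The run-and-debug loop such an agent performs can be sketched in a few lines. This is a toy illustration, not OpenClaw's actual installer: the `propose_fix` stub stands in for the model, and every name here is hypothetical.

```python
import subprocess

def propose_fix(command, error):
    """Stub standing in for the LLM: map a failure to a revised command.
    A real agent would prompt the model with the error output instead."""
    if "pip: not found" in error:
        return command.replace("pip", "pip3")
    return None  # no idea how to fix it

def agentic_install(command, max_attempts=3):
    """Run a command; on failure, ask the 'model' for a fix and retry.
    The intelligence lives in propose_fix, not in a platform-specific script."""
    for _ in range(max_attempts):
        result = subprocess.run(command, shell=True,
                                capture_output=True, text=True)
        if result.returncode == 0:
            return True
        command = propose_fix(command, result.stderr)
        if command is None:
            return False
    return False
```

Compare this with a traditional install script, which must enumerate every platform quirk up front; here the loop simply observes the error and adapts.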
Another example was his Menu Gen project. He built an app that takes a photo of a restaurant menu, OCRs the items, generates images of each dish, and re-renders the menu with pictures. It worked. Then he saw someone do the same thing by simply giving the photo to Gemini with a one-line prompt. The entire app was unnecessary. The neural network did all the work.
The Menu Gen story illustrates a critical point. Many developers are still thinking about AI as a way to speed up existing workflows. Build the same apps, just faster. But Software 3.0 is not about speed. It is about entirely new capabilities.
Karpathy's LLM knowledge bases project is a perfect example. You feed documents to an LLM and it creates a wiki, a structured knowledge base. This is not something that could have existed before. No traditional code could take unstructured documents and recompile them into an organized, queryable knowledge base. These are new things that were not possible before.
The opportunity is not just doing old things faster. It is building things that could not exist in the previous paradigm. That is where the real excitement lies.
Karpathy painted a picture of a future where neural networks become the host process and traditional CPUs become co-processors. Intelligence compute from neural networks will dominate the total spend of compute cycles. Tool use, the deterministic tasks we rely on today, becomes a historical appendage.
He imagines devices that take raw video and audio, process them through neural nets, and render unique UIs in real time using diffusion models. No traditional app logic in between. Just neural networks doing the heavy lifting.
This sounds foreign. But Karpathy believes we will get there piece by piece. The progression is already underway.
One of the most interesting ideas Karpathy has explored recently is the concept of jagged intelligence. LLMs are not uniformly capable. They excel in some domains and fail embarrassingly in others. Understanding why is essential for anyone building with these tools.
Karpathy's favorite example: state-of-the-art models can refactor a 100,000-line codebase or find zero-day vulnerabilities. Yet the same model will tell you to walk 50 meters to a car wash instead of drive, because it does not understand you need to bring your car there.
This jaggedness comes from how models are trained. Frontier labs use giant reinforcement learning environments. Models get verification rewards for tasks that can be checked, like math and code. They peak in those verifiable domains. They stagnate in areas outside the RL circuits.
There is also a data distribution factor. Chess improved dramatically from GPT-3.5 to GPT-4 not because of general capability gains, but because someone at OpenAI added a huge amount of chess data to the pre-training set. The capability peaked because the data was there, not because the model got smarter overall.
Karpathy's framework is straightforward. Traditional computers automate what you can specify in code. LLMs automate what you can verify. If you can create RL environments with clear verification rewards, you can pull the lever and get strong performance.
This has massive implications. Domains that seem safe from automation might actually be highly verifiable. Even writing can be evaluated by a council of LLM judges. The question is not whether something can be automated, but how easy or hard it is to verify the output.
Karpathy hinted that there are very valuable RL environments that labs have not yet focused on. Founders who identify these gaps and build verification systems around them could create significant value. If you are in a verifiable setting where you can create RL environments, you can do your own fine-tuning and benefit enormously.
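Even the council-of-judges idea reduces to a simple aggregation. A minimal sketch, with cheap heuristics standing in for the LLM judges (all names invented for illustration):

```python
def council_verdict(text: str, judges) -> bool:
    """Score fuzzy output by majority vote of a judge panel. Here the
    judges are trivial heuristics; in practice each would be an LLM call
    with its own rubric."""
    votes = [judge(text) for judge in judges]
    return sum(votes) > len(votes) / 2

# Toy judges standing in for LLM evaluators with different rubrics.
def long_enough(text):  return len(text.split()) >= 5
def not_shouting(text): return not text.isupper()
def on_topic(text):     return "model" in text.lower()
```

Swapping the heuristics for model calls turns an unverifiable domain (prose quality) into an approximately verifiable one.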
Karpathy wrote about this distinction to help people build better mental models. LLMs are not animals. Yelling at them does not make them work harder. They do not have intrinsic motivation, curiosity, or empowerment. They are statistical simulation circuits shaped by pre-training data and RL.
He calls them ghosts: jagged, summoned entities that require a new kind of taste and judgment to direct. The substrate is statistics. RL bolts on top and extends the capabilities in uneven ways. Understanding this helps you set realistic expectations and avoid anthropomorphizing the tools you work with.
The practical takeaway: be suspicious. Explore the boundaries. Figure out which circuits you are in. If you are inside the RL distribution, things fly. If you are outside it, you will struggle. Knowing the difference is a core skill for agentic engineers.
As agents take on more of the actual coding work, the human role shifts. But it does not disappear. Karpathy is clear about what remains valuable and what you can safely hand off.
Karpathy shared a tweet that stuck with him: "You can outsource your thinking but you can't outsource your understanding." This captures the current moment perfectly.
You can hand off API details. You do not need to remember whether it is keepdims or keepdim, or whether the parameter is called dim or axis. Agents have excellent recall for these details. But you still need to understand that tensors have underlying storage, that views share memory, and that unnecessary copies hurt performance.
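The distinction shows up in a few lines of NumPy, where the spelling happens to be keepdims and axis (PyTorch spells them keepdim and dim). The spelling is trivia an agent can recall for you; the view-versus-copy behavior is the understanding you cannot outsource.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)

# Reducing with keepdims=True preserves rank, which eases broadcasting later.
col_sums = a.sum(axis=0, keepdims=True)   # shape (1, 3), not (3,)

# Basic slicing returns a view: it shares the underlying storage with `a`.
view = a[0]
view[0] = 99.0                # mutates `a` too -- no copy was made
assert np.shares_memory(a, view)

# An explicit copy breaks the link, at the cost of extra memory traffic.
dup = a[0].copy()
dup[1] = -1.0                 # `a` is unaffected
```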
You are in charge of the spec, the design, the aesthetics, and the judgment calls. Agents fill in the blanks. You direct the work. This is not optional. Karpathy found that his agents made bizarre decisions, like matching Stripe email addresses to Google email addresses instead of using persistent user IDs. Without human oversight, these bugs ship to production.
Most companies have not updated their hiring processes. They still give coding puzzles. Karpathy argues this is the old paradigm. Instead, you should give candidates a big project. Build a Twitter clone. Make it secure. Deploy it. Then unleash 10 AI agents to try to break it.
The skills you are testing for are different now. Can this person coordinate agents effectively? Can they maintain quality at scale? Can they design systems that are robust even when built with AI assistance? These are the questions that matter.
The demand for people who can work effectively with AI agents is growing faster than the supply. Companies need professionals who understand both the capabilities and the limitations of these tools. If you want to be the person building and directing AI automation rather than being replaced by it, now is the time to invest in these skills.
The Complete RPA Bootcamp is designed for exactly this moment. You go from beginner to pro with Robotic Process Automation, agentic automation, and enterprise orchestration. It is a path to a future-proof career where you are the one building the automation, not watching it take your job.
For the full conversation between Andrej Karpathy and Stephanie Zhan, including his thoughts on agent-native infrastructure and the future of education, watch the embedded video below from the Sequoia Capital YouTube channel. It is one of the most insightful discussions on where software development is heading and what it means for your career.