
How to make Claude Code your AI engineering team with GStack

June 7, 2026

Written by Claude AI

Key insights:

GStack turns Claude Code into a full engineering team with built-in skills for office hours, design review, code review, QA, and browser testing, replacing heavy frameworks with thin scaffolding and focused prompts.
Adversarial review catches design flaws before coding starts, pressure-testing ideas with hard questions about user demand, failure modes, and competitive wedges that filter weak concepts early.
Parallel agents are the real speed unlock. Running 10 to 15 Claude Code sessions at once through Conductor shifts your role from typing code to orchestrating a software factory.

Why AI coding feels completely different now

You are living through a shift in how software gets built. The era of typing every line of code by hand is fading fast. In its place, an agent era is emerging, where you direct AI engineers like a team lead instead of grinding through syntax.

Garry Tan, the president and CEO of Y Combinator, recently shared how he has coded more in the past two months than he did in all of 2013. That is a wild claim from someone who was employee number 10 at Palantir and co-founded Posterous. The reason is simple. The models are finally smart enough, and the right scaffolding lets them do real work.

This is the moment to learn how to build with agents. If you want a structured path from beginner to pro, the Complete RPA Bootcamp teaches you Robotic Process Automation, Agentic Automation, Coded Automation, and Computer-Use Agents so you can become the one building AI instead of being replaced by it.

What is GStack and why does it matter?

GStack is an open-source toolkit built by Garry Tan that turns Claude Code into an AI engineering team. It launched three weeks ago and already has more GitHub stars than Ruby on Rails.

The idea is a thin harness with fat skills. Instead of building heavy frameworks around the model, you give it sharp, focused skills. Office hours, design review, code review, QA, and browser testing all come built in.

You can use GStack with Claude Code, Codex, or Cursor. It is free and open source.

Why does the model wander without structure?

Out of the box, a coding model guesses. It does not know your codebase well, so it produces plausible code that silently breaks. The bottleneck is not intelligence. The bottleneck is direction.

Humans solve real engineering problems with roles, process, and review. Why would AI agents be any different? GStack encodes that team structure into prompts and skills the model can call on.

Once you set the model up correctly, it can do extraordinary work. The trick is making the scaffolding trivially thin so the model has room to think.

How is this different from just prompting Claude?

If you just type your idea into Claude, it will do exactly what you asked. It will not pressure-test the idea. It will not ask who the user is. It will not consider the business model or the failure modes.

GStack flips that dynamic. The office hours skill forces six questions that reframe your product before you write a line of code. The design skills generate multiple visual options. The review skills hunt for bugs at staff engineer level.

You go from a half-baked idea to a working app in a single session.

Building an app live with GStack office hours

Garry walked through a real example in the video. He wanted an app to pull all his 1099 tax forms out of Gmail and from bank portals. Tax day pain, real and personal.

Watch how the idea evolved through office hours, adversarial review, design shotgun, and parallel agents. It went from a checklist tool to a much bigger business.

What question kills most startup ideas?

The first thing office hours asks for is the strongest evidence that someone actually wants this. That single question filters out most weak ideas before any code gets written.

Garry answered honestly. He has more than five bank accounts. His accountant nags him every year. The pain is real, but the consequence is friction, not penalties.

The model pushed back. TurboTax and H&R Block already import 1099s. Plaid connects to banks. Why are those not solving the problem? Good question. The answer reframed the whole idea.

The model spotted a wedge strategy. Hook users with 1099 aggregation, then expand into matching them with tax preparers. That is a 10x bigger business than charging two dollars a month for document collection.

How does adversarial review catch problems early?

After office hours produces a design doc, GStack runs a multi-step adversarial review. The model tries to break its own plan.

In the demo, the review found 16 issues. No failure handling. No privacy section. A two-factor authentication handoff with no solution. The model auto-fixed what it could and flagged the rest.

The score went from 6 out of 10 to 8 out of 10. Three issues remained for later. That is exactly how a senior engineer would review a junior's design doc.
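
The triage step described above, fix what can be fixed and flag the rest, can be sketched in a few lines of Python. This is only an illustration: the Finding class, its auto_fixable field, and the triage function are hypothetical names, not GStack's actual data model. The three issue titles come from the demo, while which of them were auto-fixed is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One problem raised during review (hypothetical structure)."""
    title: str
    auto_fixable: bool  # assumption: the reviewer knows which issues it can fix itself


def triage(findings):
    """Split findings into those fixed automatically and those left for a human."""
    fixed = [f for f in findings if f.auto_fixable]
    flagged = [f for f in findings if not f.auto_fixable]
    return fixed, flagged


# Titles are from the demo; the auto_fixable values are assumed for illustration.
findings = [
    Finding("No failure handling", auto_fixable=True),
    Finding("No privacy section", auto_fixable=True),
    Finding("Two-factor authentication handoff has no solution", auto_fixable=False),
]

fixed, flagged = triage(findings)
print(f"auto-fixed: {len(fixed)}, flagged for later: {len(flagged)}")
```

Keeping the flagged list explicit is the point of the design: anything the model cannot resolve stays visible for a human decision instead of being silently dropped.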

Doing this work upfront saves hours of debugging later. You catch the hard questions before they become production fires.

What is the design shotgun skill?

Once the plan is locked in, design shotgun generates multiple visual mockups. It farms the work out to OpenAI Codex with image generation.

In the demo, three versions came back in about five minutes:

Option A: A command center style dashboard, dense and technical
Option B: A friendly card-based layout with progress rings
Option C: A complex split view that overcomplicated things

Garry picked Option B because it felt more approachable for normal users. If you do not like any of them, you give feedback and regenerate. The model learns what you want without you needing to open Figma.

Running multiple AI engineers in parallel

This is where things get interesting. Once a single agent can handle planning, design, and coding, the next bottleneck becomes you. You become the reviewer, the QA tester, the merger.

The fix is parallelism. Run many agents at once on different branches.

How does Conductor enable parallel agents?

Conductor is the tool Garry uses to spin up parallel Claude Code sessions. Each one runs in its own Git worktree on its own branch, so sessions never overwrite each other's files.

You can have three or four sessions on the same project, each tackling a different feature. You can also have sessions across multiple projects, all running at the same time.
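
The one-worktree-per-session layout can be reproduced with plain Git. The Python sketch below only builds the `git worktree add` commands instead of running them. The feature names, the agent/ branch prefix, and the directory layout are made-up examples, and Conductor's own internals may work differently.

```python
from pathlib import Path


def worktree_commands(repo_dir, features, base="main"):
    """Build one `git worktree add` command per feature.

    Each command creates a new branch from `base` and checks it out into
    its own sibling directory, so parallel sessions never share files.
    """
    repo = Path(repo_dir)
    commands = []
    for feature in features:
        branch = f"agent/{feature}"                     # hypothetical branch naming
        path = repo.parent / f"{repo.name}-{feature}"  # hypothetical directory layout
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


# Hypothetical features for the 1099 app from the demo.
for cmd in worktree_commands("tax-app", ["gmail-import", "bank-portals", "dashboard"]):
    print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run once the names look right. Building the commands first makes it easy to review them before anything touches the repository.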

GStack is built directly into Conductor. You click the GStack button in quick start and you are ready to go.

This is what shipping 10x faster actually looks like. It is not the model getting faster. It is you running ten models at once.

What does a software factory at level 7 look like?

Garry talks about levels of AI coding maturity. Level 8 would be full autonomy. He says GStack gets you to level 7.

At level 7, you run 10 to 15 parallel Claude Code sessions. One might be doing office hours on a new idea. Another might be writing code. Another might be running QA. Another might be reviewing a community pull request.

Garry currently has about 400 open pull requests across his projects. He evaluates them in waves. Without parallel agents, that would be impossible.
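
Evaluating in waves is just batching. Here is a minimal sketch, assuming a wave size of 15 to match the upper end of the parallel session count. The source does not say how large Garry's waves actually are, and the pull request numbers are placeholders.

```python
def waves(items, size):
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


open_prs = list(range(1, 401))  # placeholder IDs standing in for about 400 open pull requests
WAVE_SIZE = 15                  # assumption: one pull request per parallel session

batches = list(waves(open_prs, WAVE_SIZE))
print(len(batches))      # 27 waves: 26 full waves of 15, plus one partial wave
print(len(batches[-1]))  # the last wave holds the remaining 10
```

Under these assumptions, a backlog of 400 pull requests clears in 27 waves, which is what makes the backlog manageable for one person.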

This is the new shape of engineering work. You are not typing code. You are orchestrating agents.

How does the QA browser tool work?

QA was the last bottleneck. Even with agents writing code, Garry was stuck testing everything by hand. The least fun part of software development.

He tried Claude in Chrome MCP and called it one of the worst pieces of software he had ever used. Slow, context-heavy, often broken.

So he wrapped Playwright at the CLI level and built /qa and /browse tools. Now any agent can drive a real browser. It can take screenshots, click buttons, fill forms, download files, and run regression tests.
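
A scripted browser check of this kind might look like the sketch below. It is not the GStack /qa or /browse implementation: the Step structure, the run_steps helper, the URL, and the selectors are all hypothetical. The method names goto, fill, click, and screenshot mirror Playwright's Python sync Page API, so a real Playwright page should be usable in place of the recording stand-in shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One browser action in a QA script (hypothetical structure)."""
    action: str      # "goto", "fill", "click", or "screenshot"
    target: str      # URL, CSS selector, or output file path
    value: str = ""  # text to type, used only by "fill"


def run_steps(page, steps):
    """Replay steps against any object exposing Playwright-style page methods."""
    for step in steps:
        if step.action == "goto":
            page.goto(step.target)
        elif step.action == "fill":
            page.fill(step.target, step.value)
        elif step.action == "click":
            page.click(step.target)
        elif step.action == "screenshot":
            page.screenshot(path=step.target)
        else:
            raise ValueError(f"unknown action: {step.action}")


class RecordingPage:
    """Stand-in page that logs calls, so the script can be dry-run without a browser."""

    def __init__(self):
        self.calls = []

    def goto(self, url):
        self.calls.append(("goto", url))

    def fill(self, selector, value):
        self.calls.append(("fill", selector, value))

    def click(self, selector):
        self.calls.append(("click", selector))

    def screenshot(self, path):
        self.calls.append(("screenshot", path))


# Hypothetical login check; the URL and selectors are placeholders.
login_check = [
    Step("goto", "http://localhost:3000/login"),
    Step("fill", "#email", "qa@example.com"),
    Step("click", "button[type=submit]"),
    Step("screenshot", "after-login.png"),
]

page = RecordingPage()
run_steps(page, login_check)
print(page.calls)
```

With Playwright installed, the same steps could be replayed in a real browser by passing a page created through sync_playwright() and browser.new_page(). Keeping the steps as plain data also means an agent can generate, store, and rerun them as regression tests.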

Combine that with the ship skill, which checks that a pull request is ready to merge, and you have a closed loop. Plan, design, code, test, ship. All automated. All in parallel.

The skills that make Claude Code a real team

GStack ships with more than 28 commands. Each one represents a role on the engineering team. Together they replace what used to take a co-founder and ten engineers.

What are the core skills you should know?

Here are the skills that get used most often:

Office hours: Pressure-tests your idea with six forcing questions modeled after YC partner sessions
Auto plan: Runs CEO, engineering, design, and developer experience reviews with sensible defaults
Design shotgun: Generates multiple visual mockups in parallel using image generation
Review: Staff-level code review that catches bugs the plan missed
QA and browse: Wraps Playwright so agents can drive a real browser for testing
Ship: Final checks before a pull request lands on main

Users report spending 80 to 90 percent of their time in office hours, plan, CEO review, and auto plan. The thinking phase is where most value gets created.

When should you bring in Codex instead of Claude?

Garry has a memorable description of when to switch models. Claude Opus is the ADHD CEO. Smart, creative, full of ideas, the guy you want to grab a beer with.

Codex is the autistic CTO. When the going gets tough and you need to grind through a hard bug, you call Codex. It will sit there and methodically work the problem.

GStack lets you swap between them. Use Claude for ideation and planning. Use Codex for deep debugging and complex refactors. The right tool for the right job.

How can you start using GStack today?

Getting started is straightforward:

Install Claude Code, Codex, or Cursor
Clone the GStack repository from GitHub
Optionally install Conductor for parallel sessions
Run /officehours on your next idea

You will see the model think out loud. You will see it push back on weak assumptions. You will see it propose three approaches when you expected one. That is the team experience.

If you want to take this further and build a career around automation and AI agents, the Complete RPA Bootcamp walks you from zero to building production automations. You learn RPA, Agentic Automation, Coded Automation, and Computer-Use Agents. By the end, you are the one building the systems that replace manual work, not the one being replaced.

The barrier to building software just collapsed. The only question left is what you are going to build. Watch the full demo embedded below from the Y Combinator YouTube channel to see Garry walk through GStack live, and then go make something people want.
