Coding at Agent-Speed
Recently, I've been using Code Agents intensively to build a security suite: an enterprise-grade LLM Security Insight platform built on eBPF and PostgreSQL. The full tech stack spans eBPF, React, Go, Docker, Kubernetes, PostgreSQL, and DuckDB. I had experience with some of these technologies, but others, like React, eBPF, and DuckDB, were completely new to me; my previous AI Infra work never touched them. In this first post about my Code Agent experience, I'll share my impressions and some extended thoughts; future posts will likely be shorter notes on what has changed.
VS Code -> Cursor -> Gemini -> VS Code -> Codex -> VS Code -> Claude Code -> OpenCode -> OhMyOpenCode -> VS Code
I'm a long-time VS Code user. Although I haven't researched IDEs deeply, I find it perfectly usable. The first Code Agent I used was GitHub Copilot in VS Code, but its early version was very simple: specify the file to change, describe how you want to change it, Copilot generates a code patch/diff, and VS Code applies the patch to your file. At that stage, the Code Agent couldn't really be called an Agent; it was at most a Code Assistant, a form of pair programming. It felt natural at first because I was used to code completion and pair programming, and in familiar domains it was comfortable, as if this were how programming should be.
However, around March or April 2025, I encountered the Cursor IDE. Cursor is, of course, a fork of VS Code, but the details made it feel like a different product, for example multi-file editing and Agent mode. Initially, many people scoffed at these features: users were accustomed to pair programming, multi-file editing increased review time, and, constrained by model capabilities, generated code wasn't reliable enough to approve without review. But Cursor dared to go first (or at least it was the first I knew of). With improvements in LLM capabilities, especially Claude's code-optimized Sonnet series, Cursor took off. I recall a small episode from January 2025, when Cursor co-founder Sualeh Asif emailed our team engineer keming to discuss envd, our reproducible dev-environment infra tool. It suggests Cursor had already foreseen the consistency problems of large-scale agent runtimes. Later, we learned from Sualeh that they used a vector database (Turbopuffer) for code search to fit their multi-tenant scenario. I didn't use Cursor for long, though: after its user base exploded and the free quota was slashed, I didn't have a suitable credit card to subscribe, haha.

Then came Gemini CLI. Thanks to Google's massive resources and powerful Infra, it offered generous free access to capable Gemini models, attracting many new users, myself included. At the time, though, I was responsible for an existing business, VectorChord Cloud, so I mostly used Gemini to fix small issues like corner cases or to write test files. Since I knew the project code well, I acted as the human context provider, and that worked well. But up to this point, I still didn't feel the “Agent” vibe; it was still pair programming.
However, Gemini CLI's quota eventually made it hard to continue, so I moved back to VS Code. GitHub gave me a free Copilot subscription, and I was surprised to find that VS Code also had an Agent mode supporting models from different vendors: OpenAI's GPT series, Anthropic's Claude Sonnet series, and Google's Gemini series. The Enterprise subscription quota was sufficient. Still, I mostly used multi-file editing rather than auto-approve, making sure I understood every line of modified code.
During this period, I mostly used agents to implement simple tools, scripts, or code fixes. It wasn't until around August, when we started exploring some security directions, that I dared to let an Agent implement complete, project-level code.
I forget which day it was, but Claude Code suddenly became the talk of the town among my friends. I switched to Claude Code but found Claude's region restrictions too severe to use it directly, so I had to obtain Claude model tokens through special means. Later, after GLM came out and people reported strong coding and tool-calling capabilities, I got a GLM subscription and started using Claude Code smoothly. On Claude Code, I truly felt the Agent experience for the first time: one factor was the leap in model capability, which produced high-quality code and design docs; the other was the smooth transition between Plan, Auto, and Edit modes in Claude Code, which let me pursue goals with planning and rhythm. I completed essentially all of the project mentioned at the beginning this way, roughly following these steps:
- Maintain a Context folder containing all relevant research docs, design docs, and tool code.
- Let the Agent refine the design docs with me, settling on the tech stack, architecture, and so on.
- Let the Agent write simple verification code to validate feasibility.
- Let the Agent implement the full code.
- After the initial code is done, use Git Worktree to let Agents fix bugs and improve features in parallel (see the sketch after this list), though there are limitations:
  - Database schema changes can't be parallelized this way, because I run PostgreSQL in Docker locally and a single instance can't support concurrent schema changes.
  - I'm still a bit uneasy about Git operations, fearing accidental deletion of hand-tuned code, but this mental burden is shrinking as Agent code generation quality improves and the Agent retains memory of the code it has implemented.
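Since the worktree step is the core of the parallel workflow, here's a minimal sketch of what I mean; the repo path and branch names are made up for illustration:

```bash
# One worktree per agent task, so parallel agent sessions don't step on
# each other's checkouts. Repo and branch names are hypothetical.
cd ~/code/llm-sec-insight
git worktree add ../llm-sec-bugfix  -b fix/ebpf-probe-crash    # checkout for a bugfix agent
git worktree add ../llm-sec-feature -b feat/alert-dashboard    # checkout for a feature agent

# Run one agent session in each directory, review the diffs, then merge
# the branches back and clean up.
git worktree list
git merge fix/ebpf-probe-crash
git worktree remove ../llm-sec-bugfix
```

Each worktree is a full checkout sharing one object store, so an agent gone rogue in one directory can't clobber uncommitted work in another.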
Later, I saw OpenCode earning a good reputation on Twitter, so I tried it briefly; it was not bad. Then the OhMyOpenCode plugin went viral on X, and I tried that too. OhMyOpenCode is great if you have models from many vendors. This matches a distinct feeling I've developed after using Agents: in terms of code aesthetics, GPT-5.2-Codex > Gemini-3 Pro > Claude Opus 4.5 » GLM 4.7; in planning capability, Claude Opus 4.5 is leagues ahead; for coding, Claude Sonnet 4.5 and GLM 4.7 are good choices; for debugging, Claude Opus/Sonnet 4.5 and GLM 4.7 are both fine. The complementary strengths of different models in different areas help me complete work better. The problem is that models keep iterating, so I have to track both model updates and plugin updates on top of managing multiple subscriptions, which leaves me feeling a bit overwhelmed.
Finally, since I always work over VS Code Remote on my dev machine, I haven't used Agent Orchestration tools like Conductor much. I also often need to feed images to the Agent, and my favorite tools, Claude Code and Codex, now have official VS Code extensions, so I eventually returned to VS Code for my daily work.

As for Global Context, I currently follow Xuanwo's practice, adding tricks I discover in daily work. I'll share more on this later.
Thoughts
I've briefly written about my Code Agent experience over the past year, leading to the following thoughts:
- Fork capability is a must-have:
  - Code and file level: solved by Git-like tools.
  - Runtime environment level: solved for small-scale personal use by containers (Docker/envd) or virtualization tools, but there's no good solution yet for large-scale runtime scheduling. Lightweight isolation such as microVMs or unikernels might be a good direction, but current options (like the Firecracker microVM) aren't yet sufficient for large-scale Agent runtime needs.
  - Database level: ACID properties make forking inherently difficult, especially for large distributed databases, where forks inevitably involve redundant data and consistency issues. This is an advantage of single-node databases like PostgreSQL: you can fork a database fairly easily with tools like piglet, built on PG's PITR (Point-in-Time Recovery) and JuiceFS snapshots (see the sketch after this list).
- Multi-model collaboration is the future trend: Different models have different strengths. Future Code Agent tools need to support seamless switching and collaboration between multiple models.
- Leverage Skill Best Practices and MCP connectivity to boost overall productivity.
- Context (or Memory) evolution and sharing is a key direction for future Code Agents.
- The future is here, All in AI.
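To make the database-fork point above concrete, here's a hedged sketch of a PITR-based fork on vanilla PostgreSQL (12+). This is not piglet's actual interface; the paths, port, and timestamp are illustrative assumptions:

```bash
# Clone a physical base backup, then replay archived WAL up to a chosen
# point in time, giving the agent an isolated, writable database "fork".
cp -a /backups/pg-base /forks/agent-a

cat >> /forks/agent-a/postgresql.conf <<'EOF'
port = 5544                                    # avoid clashing with the primary
restore_command = 'cp /backups/wal/%f "%p"'    # fetch archived WAL segments
recovery_target_time = '2025-11-01 12:00:00'   # the fork point
recovery_target_action = 'promote'             # become writable after recovery
EOF

touch /forks/agent-a/recovery.signal           # tell Postgres to run recovery on startup
pg_ctl -D /forks/agent-a -l /tmp/agent-a.log start
```

If the base backup lives on a copy-on-write layer such as a JuiceFS snapshot, the initial copy is close to free, which is presumably what makes per-agent forks cheap enough to be practical.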
Next Steps
I plan to further optimize my Code Agent experience, mainly focusing on:
- Configuring a higher-spec Mac as a dev machine.
- Automating everything: DNS config, domain registration, Agent notifications, etc.
- Exploring better Global Context and Memory management solutions.
- Researching Infra solutions for large-scale Agent runtimes.