I'm a growth marketing manager who got frustrated with how slow it is to get simple answers from GA4 and Search Console. So I built an AI agent to do it for me. In Rust. Despite not being a developer.
No LangChain. No LlamaIndex. Just Axum, SQLite, and a lot of stubbornness. Here's what I learned.
Why I Built This (The Frustration)
I'm a growth marketing manager, not a developer. But I kept running into the same frustrations:
- Quick questions took forever. "Which pages lost clicks last week?" should take 10 seconds. In Search Console's UI, it takes 5 minutes of clicking through filters and waiting for reports to load.
- GA4's interface is a maze. I just want to know if conversions dropped. Instead I'm configuring date comparisons, segments, and secondary dimensions. (This is a common problem—the data often doesn't even match between platforms.)
- MCP OAuth is a nightmare. I tried setting up Claude with MCP servers for Google APIs. The "proper" way is to use service accounts—but good luck persuading your employer or clients to give you access to their Google Cloud Console. Most marketing teams don't even know what a service account is.
I wanted an AI agent that could answer "what happened to my organic traffic?" without me fighting the UI or begging IT for API credentials. That's what led me to build the SEO agent and GA4 agent that power Refresh Agent today.
The Decision: Python to Rust
I started building Refresh Agent in 2024 using Python, Django, and Pydantic AI. I'm not a developer by trade, but I managed to get v1 working mostly by hand. The prototype connected to GA4 and Search Console, ran queries through Claude, and returned useful insights.
Then I rewrote everything in Rust.
At the time, this was painful. AI coding assistants would fight the borrow checker endlessly, generating code that wouldn't compile, then "fixing" it by introducing new errors. I spent more time debugging AI-generated Rust than I would have spent writing Python by hand.
But I kept going. The agent handles OAuth tokens for multiple Google accounts and orchestrates calls across Claude, Grok, and Gemini. I wanted compile-time guarantees that my token refresh logic wouldn't silently fail at 3am.
Learning Rust Backwards
I've been learning Rust in a weird way: backwards. I have the Rust book open in one tab and Codex CLI in the other. The AI builds something, the compiler yells at it, and I read the relevant chapter to understand why it's yelling.
This sounds chaotic, but it works. The compiler errors are specific. When it says "cannot borrow as mutable because it is also borrowed as immutable," I go read about borrowing. When it complains about lifetimes, I read the lifetimes chapter. The AI handles the boilerplate; I learn the concepts as they become relevant.
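If you've never hit that first one, here's the kind of snippet that triggers it (a contrived example, not from my codebase):

```rust
fn main() {
    let mut scores = vec![10, 20, 30];
    let first = &scores[0]; // immutable borrow starts here
    scores.push(40);        // error[E0502]: cannot borrow `scores` as mutable
                            // because it is also borrowed as immutable
    println!("{first}");    // immutable borrow is still in use here
}
```

The error points at the exact lines involved, which tells you exactly which chapter of the book to open.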
The other thing that convinced me Rust was worth learning: No Boilerplate on YouTube. If you haven't seen it, the channel makes incredibly tight, well-edited videos that explain why Rust's constraints are actually features. Watching those videos is what gave me the confidence to try Rust in the first place.
I wouldn't recommend my backwards learning approach if you're trying to become a Rust expert. But if you're a marketer who just wants to ship something that works, it's surprisingly effective.
The Stack: Axum, SQLite, and Rig
The stack is deliberately boring:
- Axum handles HTTP and SSE streaming for real-time agent responses
- SQLite + SeaORM stores users, chat sessions, OAuth tokens, and proposal drafts
- Rig abstracts across Claude, Grok, Gemini, and OpenAI with a single interface
- Askama provides compile-time HTML templates
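Wired together, the skeleton looks roughly like this. It's a minimal sketch with made-up names (AppState, the /healthz route, the database filename), not the actual Refresh Agent code:

```rust
use axum::{routing::get, Router};
use sea_orm::{Database, DatabaseConnection};

#[derive(Clone)]
struct AppState {
    // Users, chat sessions, OAuth tokens, and proposal drafts all live here
    db: DatabaseConnection,
}

async fn health() -> &'static str {
    "ok"
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One SQLite file; SeaORM manages the connection pool
    let db = Database::connect("sqlite://refresh_agent.db?mode=rwc").await?;
    let state = AppState { db };

    // The real app also mounts the chat, SSE, and OAuth routes here
    let app = Router::new()
        .route("/healthz", get(health))
        .with_state(state);

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
    axum::serve(listener, app).await?;
    Ok(())
}
```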
The interesting part is the LLM layer. Rig lets me define tools once and use them across any provider:
```rust
use rig::agent::Agent;
use rig::providers::{anthropic, gemini, xai};

pub enum RuntimeAgent {
    #[cfg(feature = "backend-gemini")]
    Gemini(Agent<gemini::completion::CompletionModel>),
    #[cfg(feature = "backend-grok")]
    Grok(Agent<xai::completion::CompletionModel>),
    #[cfg(feature = "backend-claude")]
    Claude(Agent<anthropic::completion::CompletionModel>),
}
```
The agent routes requests to different providers based on task complexity. Quick prospecting tasks go to Grok (faster, cheaper). Agency proposal generation goes to Claude Sonnet (richer reasoning).
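Mechanically, that routing is just a match over the enum. Here's a sketch of how the wrapper might delegate, assuming Rig's Prompt trait; the method name is mine and the real wrapper is more involved:

```rust
use rig::completion::{Prompt, PromptError};

impl RuntimeAgent {
    /// One entry point, whichever backends are compiled in.
    pub async fn prompt(&self, input: &str) -> Result<String, PromptError> {
        match self {
            #[cfg(feature = "backend-gemini")]
            RuntimeAgent::Gemini(agent) => agent.prompt(input).await,
            #[cfg(feature = "backend-grok")]
            RuntimeAgent::Grok(agent) => agent.prompt(input).await,
            #[cfg(feature = "backend-claude")]
            RuntimeAgent::Claude(agent) => agent.prompt(input).await,
        }
    }
}
```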
Tool Calling: Where Rust Shines
Every tool in the system implements a common trait. The trait defines input types, output types, and execution logic. When the LLM generates a function call, Serde deserializes it into a typed struct.
```rust
use rig::completion::ToolDefinition;
use rig::tool::Tool;
use serde_json::json;

impl Tool for GscContentGapTool {
    const NAME: &'static str = "summarize_gsc_content_gaps";

    type Error = ToolError;
    type Args = GscContentGapArgs;
    type Output = GscContentGapSummary;

    async fn definition(&self, _prompt: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(),
            description: "Summarise GSC-based content gaps...".to_string(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "site": { "type": "string" },
                    "start": { "type": "string" },
                    "end": { "type": "string" },
                    "limit": { "type": "integer" }
                },
                "required": ["site", "start", "end", "limit"]
            }),
        }
    }

    async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
        // Fetch GSC data, filter for gaps, return typed summary
        todo!()
    }
}
```
If the JSON is malformed or missing required fields, I get a descriptive error immediately—not a mysterious KeyError twenty function calls deep.
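For context, the Args type is just a struct with Serde derives. Roughly this shape, inferred from the JSON schema above rather than copied from the real code:

```rust
use serde::{Deserialize, Serialize};

// Inferred from the tool's JSON schema; field names match, types are assumptions.
#[derive(Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct GscContentGapArgs {
    pub site: String,  // GSC property, e.g. "sc-domain:example.com"
    pub start: String, // ISO date, e.g. "2026-01-05"
    pub end: String,
    pub limit: u32,
}
```

The deny_unknown_fields attribute is optional, but it turns a model inventing extra parameters into a loud deserialization error rather than silently ignored input.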
The Numbers: Real Production Metrics
| Metric | Value | Notes |
|---|---|---|
| Main Binary Size | 34 MB | Release build, all LLM backends enabled |
| CLI Tool Binaries | 16 MB each | outreach_cli, prospecting_worker |
| Full Rebuild Time | 2m 17s | From scratch on M4 MacBook Air |
| Incremental Build | 2-5 seconds | Day-to-day development |
| Memory at Runtime | ~45 MB | Stable over days of operation |
| Cold Start | ~12 ms | Binary to first request |
The full rebuild time sounds painful, but that number is misleading. During development, incremental builds take 2-5 seconds. The compiler only recompiles what changed. I touch a template? Seconds. Modify a tool? Seconds. The 2-minute rebuild only happens from scratch, which is rare.
What Worked: The Wins
1. Memory stability over time. RAM usage stays flat at ~45MB over days of operation. No GC pauses during long chat sessions. No mysterious OOM kills. This matters for features like continuous anomaly detection that need to run 24/7.
2. Type safety for OAuth flows. Google OAuth2 with token refresh is error-prone. The compile-time checks caught several token handling bugs that would have been silent runtime failures in Python.
3. Single-binary deployment. The Docker image is straightforward. No virtualenv, no requirements.txt, no "works on my machine." Just the binary and a SQLite file.
4. Async streaming with Axum SSE. Streaming LLM responses to the frontend is clean. The type system ensures I can't accidentally drop a connection or forget to flush.
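For the SSE piece, a stripped-down handler looks something like this. The route name and the canned chunks are illustrative, the `futures` crate is assumed, and the production handler streams tokens from the LLM instead:

```rust
use std::{convert::Infallible, time::Duration};

use axum::response::sse::{Event, KeepAlive, Sse};
use futures::stream::{self, Stream};

// Stripped-down sketch: streams a few fixed chunks instead of live LLM tokens.
async fn chat_stream() -> Sse<impl Stream<Item = Result<Event, Infallible>>> {
    let chunks = vec!["Analysing", " your", " Search Console", " data..."];
    let stream = stream::iter(
        chunks
            .into_iter()
            .map(|chunk| Ok::<_, Infallible>(Event::default().data(chunk))),
    );

    // Keep-alive pings stop proxies from closing the long-lived connection
    Sse::new(stream).keep_alive(KeepAlive::new().interval(Duration::from_secs(15)))
}
```

Register it with `.route("/chat/stream", get(chat_stream))` and the browser's EventSource API handles the rest on the frontend.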
What Hurt: The Pain Points
1. Compile times during prompt iteration. When tuning LLM prompts, you want fast feedback. Waiting 5 seconds to see if your new system prompt works is friction.
2. The LLM ecosystem is Python-first. LangChain, LlamaIndex, DSPy—they're all Python. The Rust ecosystem is catching up (Rig is excellent), but there are fewer ready-made integrations. I built more from scratch than I would have in Python.
3. Async Rust learning curve. Tokio is powerful but complex. Lifetime issues with async closures still trip me up occasionally. As a non-developer learning this stuff, the async concepts took a while to click.
The 2026 Reality: AI Writes Rust Now
Here's the counterintuitive insight: if I started this project today, I would go straight to Rust from day one.
In 2024, AI coding tools fought the borrow checker. They'd generate code, hit a lifetime error, "fix" it by adding random lifetime annotations, and spiral into increasingly broken code. I spent hours cleaning up AI-generated Rust.
In 2026, tools like Claude Code and Codex CLI write competent Rust. They understand ownership. They don't get stuck in loops fighting the compiler. The gap between "AI can write Python" and "AI can write Rust" has largely closed.
I'm not the only one noticing this. Greg Brockman—co-founder of OpenAI and one of the people who built ChatGPT—posted this recently:
> rust is a perfect language for agents, given that if it compiles it's ~correct
>
> — Greg Brockman (@gdb) January 2, 2026
That's the insight I stumbled into backwards. The compiler catches the errors that would be runtime bugs in Python. For agents that need to run reliably without supervision, that matters.
The "Rust is too hard for AI agents" narrative is outdated. The ecosystem caught up.
Should You Build AI Agents in Rust?
Rust makes sense when:
- You're building a long-running production agent (not a prototype)
- You need multi-provider LLM orchestration with type-safe tool interfaces
- You're managing complex state (OAuth tokens, sessions, user data)
- Your team already knows Rust (or wants to learn)
- You're a solo indie dev primarily using coding agents like Codex to build your app
Python is better when:
- You're rapidly prototyping and need fast iteration
- You're doing research/experimentation where the shape keeps changing
- You need LangChain/LlamaIndex integrations out of the box
- Your team doesn't know Rust and doesn't have time to learn
For production agents that run 24/7 and handle real user data, Rust's compile-time guarantees pay for themselves. The initial investment is higher, but the maintenance burden is lower. If you're curious about the difference between AI agents and traditional dashboards, I wrote more about that in AI Agents vs. Marketing Dashboards.
The Bottom Line
Python is easier to write. Rust is better to run. If you're a developer prototyping something you'll throw away, Python wins. If you're an indie hacker who wants something that just works without babysitting, Rust is worth the learning curve.
I'm still very much learning—both Rust and how to build AI agents properly. If you've built something similar, I'd love to hear what you've found. The code patterns here power Refresh Agent if you want to see it in action.