Software Engineer Blog

Shipping AI isn’t about bigger models. It’s about smaller mistakes.

Production notes from a builder who ships, operates, and occasionally breaks his own AI systems.


Latest

Fresh off the press

asyncio.gather, asyncio Semaphore, concurrent LLM calls, python async await

Concurrent LLM Calls in Python: asyncio.gather + a Semaphore

500 LLM calls in a for-loop take about 17 minutes (assuming roughly 2 seconds per call), and your CPU sits idle for all of it. await on its own is not concurrency. Here's how asyncio.gather puts every request in flight at once, why it's named for the results and not the speed, and why asyncio.Semaphore is the one line that stops you from DDoS-ing yourself into a wall of 429s.

July 13, 2026
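
A minimal sketch of the pattern the post describes, not the post's actual code: asyncio.gather collects the results in input order, and asyncio.Semaphore caps how many calls are in flight at once. The names fake_llm_call, bounded_call, run_all, and MAX_IN_FLIGHT are hypothetical, and the sleep stands in for a real async client call.

```python
import asyncio

MAX_IN_FLIGHT = 10  # assumed cap; tune it to your provider's rate limit


async def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a real async LLM client call."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"echo: {prompt}"


async def bounded_call(sem: asyncio.Semaphore, prompt: str) -> str:
    # At most MAX_IN_FLIGHT coroutines hold the semaphore at any moment;
    # the rest wait here instead of hitting the API.
    async with sem:
        return await fake_llm_call(prompt)


async def run_all(prompts: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    # gather schedules every coroutine and returns results in input order,
    # regardless of which call finishes first.
    return await asyncio.gather(*(bounded_call(sem, p) for p in prompts))


if __name__ == "__main__":
    results = asyncio.run(run_all([f"prompt {i}" for i in range(50)]))
    print(len(results), results[0])
```

Without the semaphore, all 50 requests would start at once; with it, they run in waves of at most 10, which is usually what keeps the 429s away.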

All posts

9 articles

Open source

Things I’ve open-sourced

Production-shaped code you can clone, read, and run — the same patterns I write about.

llm-gateway

OpenAI-compatible gateway over OpenAI, Anthropic & Gemini — one endpoint, cost/latency/token tracking, gateway-issued keys, streaming, retries + fallback.

FastAPI, LiteLLM, SQLAlchemy, Postgres

ai-engineering-series

Clone-and-run code for the "AI Engineering from Scratch" YouTube series — first LLM call, LiteLLM, your own gateway, and beyond.

Python, uv, OpenAI SDK, Gemini

Vahid Aghajani — Applied ML Builder