Honest takes on building software, shipping products, and the realities of the tech industry.
Vercel BotID In 2026: How The Invisible CAPTCHA Actually Works, And Where It Earns Its Place In My Stack
Vercel BotID went GA in mid-2025 and quietly replaced the visible CAPTCHA on a lot of indie SaaS sites in 2026. The promise is real: invisible bot detection that catches headless Playwright sessions without making your real users squint at fire hydrants. The price is real too. Here is what BotID actually does under the hood, the Basic versus Deep Analysis tradeoff, the route patterns I protect with it, and the day a single AI scraper convinced me to wire it in front of an endpoint I thought was already safe.
Vercel Zero: The Programming Language Built So AI Agents Can Read, Repair, And Ship Native Code
Vercel Labs just dropped a new systems language called Zero whose compiler speaks JSON, whose effects live in your function signatures, and whose binaries weigh less than ten kilobytes. The pitch is simple: a language where an AI agent can read a compiler error, ask for a typed fix, and ship a native program without a human in the loop. Here is what Zero actually is, what it is not, and whether the agent-first compiler is a clever bet or a Vercel side project you can safely ignore.
Background Jobs For Indie Developers in 2026: When You Need A Queue, When You Do Not, And What I Actually Use
Every job queue tutorial is written for companies running ten thousand jobs a second. As a solo developer you do not need Sidekiq Pro and a Kubernetes cluster to send a welcome email. Here is the actual background job setup that earns its place for indie projects in 2026, and the day a Stripe webhook taught me why setTimeout was never going to be enough.
Claude's June 15 Pricing Split: What Indie Devs Actually Need to Do Before the Meter Starts
On June 15, 2026, Anthropic splits Claude subscriptions into two pools. Interactive chat stays the same. Anything programmatic (Agent SDK, claude -p, Claude Code GitHub Actions) gets metered in dollars at full API rates. Here is what that actually costs, who wins, who loses, and exactly what to change in your setup before the meter flips on.
Zero-Downtime Postgres Migrations: The Mistakes That Locked My Production Database
A single ALTER TABLE on a 40-million-row table can freeze your app for forty minutes. Most migration tutorials skip the part where the database is also serving live traffic. Here is what shipping schema changes to a real production Postgres in 2026 actually looks like, including the operations I now refuse to run during business hours.
Server-Sent Events vs WebSockets in 2026: When Each One Actually Wins
WebSockets get reached for by reflex. Half the time the right answer is the boring one nobody talks about: Server-Sent Events. Here is the actual decision framework for real-time features in 2026, and the cost both choices hide from you.
Stripe Webhooks in Production: Idempotency, Retries, and the Mistakes That Cost Me Real Money
Stripe webhooks look like a five-minute integration in the docs. Then a customer is double-charged, a subscription event arrives out of order, your handler 500s for an hour, and Stripe quietly retries the same event 47 times. Here is what shipping webhooks to real billing flows actually looks like in 2026.
Passkeys in Production: What I Wish I Knew Before Replacing Passwords
Passkeys look simple in the WebAuthn demo. They get strange the moment you handle a user with two laptops, a stolen phone, a Bitwarden subscription, and a corporate device that blocks iCloud Keychain. Here is what shipping passkeys to real users actually looks like in 2026.
TypeScript at Scale: Why Your tsc Takes 90 Seconds and How to Fix It
Your build is slow. Your editor lags when you hover a type. CI spends more time type-checking than running tests. None of this is unavoidable. Most of the cost is a small number of patterns that are easy to write and expensive to compile. Here is how to find them and what to do.
Anthropic and SpaceX: What the Colossus Deal Actually Means for Developers
Claude Code rate limits doubled overnight. The reason is a 220,000 GPU data center in Memphis that SpaceX built and Anthropic just rented, from the same Elon Musk who was calling Anthropic evil three months ago. Here is what this deal actually means for developers building with Claude in 2026.
JavaScript Async Lifetimes: The Leak You Have and Probably Do Not Know About
Promise.all does not cancel sibling tasks when one fails. Your async code is likely leaking database connections, keeping fetches alive after unmount, and holding ports open through process exits. ES2026 finally gives you the primitives to fix this without a library. Here is how.
Embedding Models And Reranking In Production 2026: Picking The Pair That Actually Lifts Retrieval Quality
The embedding model decides what your retriever can find. The reranker decides what makes it to the LLM. By 2026 the production patterns for picking and pairing these two have stabilized, and most teams are still leaving real recall on the table because they treat embeddings as a commodity and skip reranking entirely. Here is what actually works, and what to stop doing.
RAG Chunking Strategies In Production 2026: What Actually Survives Real Documents And Real Queries
Most RAG systems do not fail at the LLM. They fail at the chunker. By 2026 the patterns for splitting documents into retrievable units have matured into a small set of choices that consistently outperform the default 512-token slicer everybody starts with. Here is what those choices are, where each one breaks, and how to pick the right one without rebuilding the index every Friday.
AI Guardrails And Output Validation In Production 2026: What Actually Catches Bad Outputs Before Users Do
Most teams discover their guardrails are missing the moment a screenshot of their AI saying something stupid hits the timeline. By 2026 the patterns for catching bad LLM outputs before they ship to users have settled into something concrete: layered validators, fast cheap checks first, expensive ones only when needed, and a clear policy for what to do when validation fails. Here is what that looks like in real systems.
Small Language Models In Production 2026: Where SLMs Beat Frontier Models, And Where They Quietly Fail
The 8B-parameter model that runs on a single GPU is good enough for more of your pipeline than you think, and worse than you think for the parts you keep wanting to give it. By 2026 the production patterns for using small language models alongside frontier ones have settled into a clear shape: route by task, not by vibe, and stop paying for capabilities you are not using. Here is how that actually plays out.
Designing Tools For AI Agents In 2026: Schemas, Descriptions, And The Pitfalls That Make LLMs Fail Silently
The bug in most agents is not the model. It is the tools you handed it. Vague descriptions, overlapping responsibilities, and schemas that look fine on paper produce agents that confidently call the wrong function with the wrong arguments. Here is how to design tools the model can actually use, drawn from the production patterns that have stabilized by 2026.
Multi-Modal AI Agents In Production: Vision, Audio, And The Glue That Actually Works In 2026
Shipping a multi-modal agent is not the same as adding an image input to your chat. The teams running real vision and audio agents in production by 2026 have discovered the same set of sharp edges: tokenization surprises, latency that explodes on the second modality, evaluation that needs new shapes, and cost curves that look nothing like text. Here is what that actually looks like once it is in front of users.
AI Agent Reliability Engineering in 2026: SLOs, Error Budgets, And Failure Modes That Actually Matter
Treating an AI agent like a normal service is how you get a 95 percent uptime number that hides a 60 percent task success rate. The teams running real agent products in 2026 measure reliability differently, set SLOs on outcomes instead of HTTP codes, and have rehearsed every failure mode the agent introduces. Here is what that looks like.