Modern Web App Architecture: Lessons from Building 10+ AI Tools
Gizem Bastas · Founder, Bastas Design
7 min read
After building and deploying over ten AI-powered web applications, we share our architecture insights, from choosing the right stack to managing subdomain-based microservices and keeping everything maintainable.
We have shipped more than ten AI-powered web applications in the last two years — translation tools, productivity apps, learning platforms, creative utilities. Each one taught us something about modern web architecture that we would not have learned from reading blog posts. This is a summary of the ideas that survived contact with production.
Pick a boring stack for the shell
Every AI app has an interesting part — the model, the data pipeline, the unique feature — and a boring part — auth, routing, deployment, SSL, logging. Spend your novelty budget on the interesting part. Use boring, battle-tested tools for the shell.
Our boring stack: React with Vite, Tailwind for styling, React Router for navigation, and a small Express backend when we need one. Boring does not mean old. It means well-understood, well-documented, and easy to hire for. You should be able to onboard a new engineer to the shell in a day.
Subdomains over monorepos for separate products
A common instinct is to put every product in a single codebase. We did this briefly and regretted it. Different products have different release cadences, different performance profiles, different failure modes. A deployment for one should not risk the others.
Subdomain-based deployment gives each tool technical independence while letting them share brand, auth, and design tokens through a small shared package. One tool going down does not take the rest with it. One tool needing a heavy library does not bloat the bundle for the others.
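One way sharing auth across subdomains can work is a session cookie scoped to the parent domain, so every tool under that domain receives it. The sketch below assumes that approach; the helper name `buildSessionCookie`, the cookie name `session`, and the domain `example.com` are hypothetical and only illustrate what a small shared package might export.

```typescript
// Hypothetical helper for a shared package. Names and values are illustrative.
interface SessionCookieOptions {
  parentDomain: string; // e.g. "example.com", so tools on its subdomains receive the cookie
  maxAgeSeconds: number;
}

// Builds a Set-Cookie header value scoped to the parent domain.
function buildSessionCookie(token: string, opts: SessionCookieOptions): string {
  return [
    `session=${encodeURIComponent(token)}`,
    `Domain=${opts.parentDomain}`, // the Domain attribute makes the cookie visible to subdomains
    "Path=/",
    `Max-Age=${opts.maxAgeSeconds}`,
    "HttpOnly", // not readable from client-side scripts
    "Secure", // only sent over HTTPS
    "SameSite=Lax",
  ].join("; ");
}
```

Each tool can then validate the same cookie on its own backend, which keeps deployments independent while the login is shared.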
Keep state on the edge when possible
Modern web apps accumulate state fast: user preferences, draft documents, cached translations, session history. The default instinct is to push all of this to a central database. Resist that instinct when you can.
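IndexedDB, localStorage, and service workers have matured to the point where many apps can keep most state on the user's device. This has three benefits: it is faster because reads skip the network, it is more private because data stays on the device unless the user chooses to sync it, and it scales for free because storage grows with your users' devices rather than with your database. You only need to sync to a central database when the user wants cross-device continuity, and that can often be opt-in.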
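A minimal sketch of this local-first pattern with opt-in sync follows. The class and method names (`LocalFirstStore`, `enableSync`) are assumptions for illustration. The storage interface mirrors the `getItem`/`setItem` subset of the Web Storage API, so `window.localStorage` could be passed in a browser; the in-memory class is a stand-in that lets the sketch run anywhere.

```typescript
// Subset of the Web Storage API that the sketch relies on.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in; in a browser, window.localStorage satisfies the same interface.
class MemoryStorage implements KeyValueStorage {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

type SyncFn = (key: string, value: unknown) => void;

class LocalFirstStore {
  private storage: KeyValueStorage;
  private namespace: string;
  private syncFn: SyncFn | null = null;

  constructor(storage: KeyValueStorage, namespace: string) {
    this.storage = storage;
    this.namespace = namespace;
  }

  // Called only when the user opts in to cross-device continuity.
  enableSync(syncFn: SyncFn): void {
    this.syncFn = syncFn;
  }

  // Reads from the device; returns the fallback when the key is missing or unparsable.
  get<T>(key: string, fallback: T): T {
    const raw = this.storage.getItem(`${this.namespace}:${key}`);
    if (raw === null) return fallback;
    try {
      return JSON.parse(raw) as T;
    } catch {
      return fallback;
    }
  }

  // Writes locally first; forwards to the sync function only if sync is enabled.
  set<T>(key: string, value: T): void {
    this.storage.setItem(`${this.namespace}:${key}`, JSON.stringify(value));
    if (this.syncFn) this.syncFn(key, value);
  }
}

const store = new LocalFirstStore(new MemoryStorage(), "translator");
store.set("theme", "dark");
const theme = store.get<string>("theme", "light");
```

Until `enableSync` is called, nothing leaves the device, which matches the opt-in behaviour described above. Larger or structured data would more likely live in IndexedDB behind the same kind of wrapper.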
AI calls are network calls — treat them as such
Calling a language model looks like a function call. It is not. It is a network call with a tail latency distribution, a rate limit, a cost per token, and a non-zero failure rate. Every place in your app that calls a model should be treated with the same discipline as any other network call: loading states, explicit handling of failed requests, retries with backoff, and user-visible cost indicators where relevant.
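As an illustration of retries with backoff, here is a minimal sketch. The function names and the `RetryOptions` shape are assumptions, not a specific library's API. It retries on any thrown error and uses plain exponential delays. A production version would likely add jitter and retry only on errors that are worth retrying, such as rate limits or timeouts.

```typescript
interface RetryOptions {
  maxAttempts: number; // assumed to be >= 1
  baseDelayMs: number;
  maxDelayMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not have to wait
}

// Exponential backoff: baseDelayMs * 2^attempt, capped at maxDelayMs. attempt starts at 0.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `call`, retrying after a growing delay each time it throws.
// Rethrows the last error once maxAttempts is exhausted.
async function retryWithBackoff<T>(call: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      const isLastAttempt = attempt === opts.maxAttempts - 1;
      if (!isLastAttempt) {
        await sleep(backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
      }
    }
  }
  throw lastError;
}
```

With `baseDelayMs: 500` and `maxDelayMs: 8000`, the waits between attempts are 500, 1000, 2000, 4000, and then 8000 ms for every later attempt.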
We also wrap every AI call in a small abstraction that logs the prompt, the response, and the latency. When something goes wrong in production, this log is the first thing we read. It has paid for itself many times over.
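A wrapper of this kind can be quite small. The sketch below is one possible shape, not the exact abstraction described above: `withLogging`, `AiCallLog`, and the `error` field are assumptions added for illustration. The clock is injectable so that latency can be tested deterministically.

```typescript
interface AiCallLog {
  prompt: string;
  response: string | null; // null when the call failed
  error: string | null; // assumption: failures are logged too
  latencyMs: number;
}

type ModelCall = (prompt: string) => Promise<string>;
type LogSink = (entry: AiCallLog) => void;

// Wraps a model call so that every invocation records prompt, response, and latency.
function withLogging(call: ModelCall, sink: LogSink, now: () => number = Date.now): ModelCall {
  return async (prompt: string): Promise<string> => {
    const startedAt = now();
    try {
      const response = await call(prompt);
      sink({ prompt, response, error: null, latencyMs: now() - startedAt });
      return response;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      sink({ prompt, response: null, error: message, latencyMs: now() - startedAt });
      throw err; // the caller still sees the original failure
    }
  };
}
```

Because the wrapper returns the same function type it receives, it can be applied around whatever client function actually talks to the model, and the sink can write to the console, a file, or a logging service.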
Design tokens pay off immediately
Across ten products, consistent design language is a moat. Consistent design language comes from design tokens (colors, spacing, typography, radius, shadow) defined once and consumed everywhere. Tailwind makes this straightforward if you commit to it: define the tokens once in the theme and avoid one-off values in components.
The payoff is not just aesthetic. When you want to ship a dark mode, you change tokens. When you want to re-theme a tool for an event, you change tokens. When a new designer joins, they learn one vocabulary, not ten.
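To make the "change tokens" idea concrete, here is a minimal sketch in which tokens are plain data and a theme is rendered as CSS custom properties. The token names and values are invented for illustration, and `toCssVariables` is a hypothetical helper. A Tailwind theme can then reference values such as `var(--color-bg)` instead of hard-coded colors.

```typescript
type TokenSet = Record<string, string>;

// Illustrative values only.
const lightTokens: TokenSet = {
  "color-bg": "#ffffff",
  "color-text": "#111827",
  "radius-md": "8px",
  "space-4": "16px",
};

// Dark mode overrides only the tokens that differ; everything else is inherited.
const darkTokens: TokenSet = {
  ...lightTokens,
  "color-bg": "#111827",
  "color-text": "#f9fafb",
};

// Renders a token set as a CSS rule that declares one custom property per token.
function toCssVariables(selector: string, tokens: TokenSet): string {
  const lines = Object.entries(tokens).map(([name, value]) => `  --${name}: ${value};`);
  return `${selector} {\n${lines.join("\n")}\n}`;
}

const css = [
  toCssVariables(":root", lightTokens),
  toCssVariables(".dark", darkTokens),
].join("\n\n");
```

Components only ever refer to token names, so shipping dark mode or an event theme means adding another token set rather than touching component code.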
Ship before you are ready, iterate with real users
Every tool in our ecosystem is better than the version we originally planned to ship. None of them would have gotten there without real users complaining about real problems. Architecture matters, but it matters in service of shipping — not as an alternative to it.
Our rule: if a tool solves a real problem for one person other than the author, it is ready to deploy. The polish comes from the next twenty versions.