When a client asks us to build a SaaS platform, the first question is never "what framework should we use?" but rather "how does this need to scale in 18 months?". The architecture you choose on day one defines the ceiling of your product — and more importantly, it defines how fast your team can iterate during the critical first year.
We've built over 15 SaaS platforms across industries ranging from event management to fintech to logistics. Every single one started with the same conversation: what are we optimizing for? Speed to market? Long-term scalability? Operational cost? The answer is never "all three equally," and pretending otherwise is how projects fail.
## The three realistic options
In practice, there are three proven paths for a SaaS in 2026. Each one has a sweet spot, and each one has failure modes that we've seen firsthand.
### 1. Well-structured monolith
The monolith is back. After years of "microservices for everything," the industry has realized that a well-organized monolith solves 80% of use cases. Companies like Shopify, Basecamp, and even parts of GitHub still run on monolithic architectures — and they serve millions of users.
The key word here is well-structured. A monolith doesn't mean a single tangled codebase where every module reaches into every other module's database tables. It means a single deployable unit with clear internal boundaries.
- When to choose it: MVP, teams of 1-5 devs, aggressive time-to-market, products where the domain is still being discovered
- Typical stack: Next.js full-stack, Rails, Laravel, Django
- Limitations: vertical scaling has a ceiling, deployments are coupled, and one slow module can slow down everything
```
// A monolith doesn't mean messy code.
// Modular structure from day 1:
src/
  modules/
    auth/                    // Login, registration, permissions
      routes.ts
      service.ts
      repository.ts
      auth.test.ts
    billing/                 // Subscriptions, payments
      routes.ts
      service.ts
      stripe.adapter.ts
      billing.test.ts
    events/                  // Core business logic
      routes.ts
      service.ts
      repository.ts
      events.test.ts
    analytics/               // Metrics and dashboards
      routes.ts
      service.ts
      clickhouse.adapter.ts
  shared/
    db/                      // Models and migrations
    queue/                   // Async jobs
    middleware/              // Auth, rate limiting, logging
    errors/                  // Custom error types
```
The critical practice is enforcing module boundaries. Each module exposes a public API (its service layer) and no module directly accesses another module's database tables or internal functions. This is the discipline that makes a monolith extractable later.
```typescript
// Good: modules communicate through the public service layer
import { BillingService } from '@/modules/billing/service';

class EventService {
  constructor(
    private billing: BillingService,
    private repository: EventRepository,
  ) {}

  async createEvent(data: CreateEventDTO) {
    // Check billing limits through the public API
    const plan = await this.billing.getCurrentPlan(data.organizerId);
    if (plan.eventsRemaining <= 0) {
      throw new PlanLimitExceeded('events');
    }
    return this.repository.create(data);
  }
}
```

```typescript
// Bad: module reaches directly into another module's database
class EventService {
  async createEvent(data: CreateEventDTO) {
    // DON'T DO THIS — direct DB access across module boundaries
    const plan = await db.query(
      'SELECT * FROM billing_plans WHERE user_id = ?',
      [data.organizerId],
    );
  }
}
```
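Boundaries enforced by convention alone tend to erode under deadline pressure; a lint rule makes them mechanical. A sketch using ESLint's built-in `no-restricted-imports` rule; the file name and path patterns are our assumptions matching the module structure above, so adjust them to your repo layout:

```javascript
// .eslintrc.cjs (hypothetical): fail the build when code deep-imports
// another module's internals instead of its public service layer.
const config = {
  rules: {
    'no-restricted-imports': ['error', {
      patterns: [
        {
          // '@/modules/<name>/service' stays allowed; internals do not.
          group: ['@/modules/*/repository', '@/modules/*/*.adapter'],
          message: 'Import another module only through its service layer.',
        },
      ],
    }],
  },
};

module.exports = config;
```

With this in place, the "bad" example above fails linting before it ever reaches review.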
When a monolith starts to hurt: You'll know it's time to consider extraction when deployments become risky (a change in billing breaks events), when build times exceed 10 minutes, or when two teams are constantly creating merge conflicts in the same files. These are organizational signals, not technical ones.
### 2. Microservices
Splitting your system into independent services that communicate with each other. Each service has its own database, deployment pipeline, and lifecycle. In theory, this gives you independent scalability and team autonomy. In practice, it gives you a distributed systems problem.
- When to choose it: Teams of 10+, clearly separated domains, need to scale parts independently, regulatory requirements that mandate data isolation
- Overhead: You need serious observability (tracing, distributed logging), deployment orchestration, contract management between services, and a platform team to maintain the infrastructure
- Risk: If you split too early, you end up with a "distributed monolith" — the worst of both worlds
Microservices are an organizational solution, not a technical one. If you don't have the organizational problem, you don't need the solution.
The hidden cost of microservices is operational complexity. Every service needs its own CI/CD pipeline, health checks, monitoring dashboards, and on-call runbooks. A system with 8 microservices doesn't have 8x the complexity — it's closer to 8-squared because you need to reason about every possible interaction between services.
```typescript
// Service-to-service communication: synchronous (HTTP/gRPC)
// Use when you need an immediate response
class OrderService {
  constructor(
    private inventoryClient: InventoryClient,
    private repository: OrderRepository,
    private eventBus: EventBus,
  ) {}

  async createOrder(data: CreateOrderDTO) {
    // Synchronous call to the inventory service
    const available = await this.inventoryClient.checkAvailability({
      productId: data.productId,
      quantity: data.quantity,
    });
    if (!available) {
      throw new InsufficientInventory(data.productId);
    }

    const order = await this.repository.create(data);

    // Async event for downstream consumers (email, analytics, etc.)
    await this.eventBus.publish('order.created', {
      orderId: order.id,
      customerId: data.customerId,
      total: order.total,
    });

    return order;
  }
}
```
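Every synchronous hop like that inventory check is a network call that can hang or fail, which is part of the operational overhead described above. In practice each call gets a timeout and a bounded retry. A minimal sketch; the helper name and retry policy are our own illustration, and a production system would typically reach for a library-grade circuit breaker instead:

```typescript
// callWithResilience: wrap a remote call with a timeout and bounded retries.
// Hypothetical helper, not from any particular library.
async function callWithResilience<T>(
  fn: () => Promise<T>,
  { timeoutMs = 2000, retries = 2 }: { timeoutMs?: number; retries?: number } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      // Race the call against a timeout so a slow service can't hang the caller.
      return await Promise.race([
        fn(),
        new Promise<never>((_, reject) => {
          timer = setTimeout(
            () => reject(new Error(`timeout after ${timeoutMs}ms`)),
            timeoutMs,
          );
        }),
      ]);
    } catch (err) {
      lastError = err;
    } finally {
      // Always clear the timer so a fast call doesn't leak a pending timeout.
      if (timer) clearTimeout(timer);
    }
  }
  throw lastError;
}
```

The inventory call above would then read `await callWithResilience(() => this.inventoryClient.checkAvailability(req))`, and every one of these wrappers is complexity a monolith simply doesn't have.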
```typescript
// Service-to-service communication: asynchronous (events)
// Use when downstream processing can happen later
class InventoryConsumer {
  constructor(
    private repository: InventoryRepository,
    private eventBus: EventBus,
  ) {}

  @OnEvent('order.created')
  async handleOrderCreated(event: OrderCreatedEvent) {
    await this.repository.reserveStock({
      productId: event.productId,
      quantity: event.quantity,
      orderId: event.orderId,
    });
    await this.eventBus.publish('inventory.reserved', {
      orderId: event.orderId,
      reservedAt: new Date(),
    });
  }
}
```
The distributed monolith anti-pattern: This is the most common failure mode we see. Signs include: services that must be deployed together, shared databases between services, synchronous chains of 4+ service calls to complete a single user action, and teams that need to coordinate every sprint. If you see these patterns, you haven't actually achieved microservices — you've just made your monolith harder to debug.
### 3. Serverless / Event-driven
Functions that execute on demand, connected by events. AWS Lambda, API Gateway, EventBridge, DynamoDB. This is the architecture of "pay only for what you use" — and it's genuinely transformative for the right workloads.
- When to choose it: Variable loads (traffic spikes), event processing, third-party integrations, background processing pipelines, startups optimizing for cost
- Benefit: Scales automatically from 0 to thousands of concurrent executions, you only pay for actual usage, zero server management
- Limitations: Cold starts (though these have improved dramatically), complex local debugging, vendor lock-in, 15-minute execution limit per function
```typescript
// AWS Lambda handler for processing a new subscription
import { EventBridgeEvent } from 'aws-lambda';
import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';
import { SESClient, SendEmailCommand } from '@aws-sdk/client-ses';

interface SubscriptionPayload {
  customerId: string;
  planId: string;
  startDate: string;
  email: string;
}

// Instantiated outside the handler so warm invocations reuse the clients
const db = new DynamoDBClient({});
const ses = new SESClient({});

export const handler = async (
  event: EventBridgeEvent<'subscription.created', SubscriptionPayload>,
) => {
  const { customerId, planId, startDate } = event.detail;

  // Persist to DynamoDB
  await db.send(new PutItemCommand({
    TableName: process.env.SUBSCRIPTIONS_TABLE,
    Item: {
      PK: { S: `CUSTOMER#${customerId}` },
      SK: { S: `SUB#${startDate}` },
      planId: { S: planId },
      status: { S: 'active' },
      createdAt: { S: new Date().toISOString() },
    },
  }));

  // Send welcome email (generateWelcomeEmail renders the HTML template)
  await ses.send(new SendEmailCommand({
    Destination: { ToAddresses: [event.detail.email] },
    Source: 'welcome@example.com',
    Message: {
      Subject: { Data: 'Welcome to the platform!' },
      Body: { Html: { Data: generateWelcomeEmail(event.detail) } },
    },
  }));

  return { statusCode: 200 };
};
```
Cold starts in 2026: AWS has made significant improvements. With provisioned concurrency, SnapStart (for Java), and the latest Lambda runtime optimizations, cold starts for Node.js functions are typically under 200ms. For most SaaS use cases, this is imperceptible. But if you're building a real-time trading platform where every millisecond counts, serverless might not be your first choice.
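When cold starts do matter, provisioned concurrency is configured in your infrastructure definition. A sketch using AWS CDK; the stack, function name, asset path, and concurrency figure are placeholders, not from a real project:

```typescript
// Hypothetical CDK stack fragment: keep warm instances of a
// latency-sensitive function via an alias with provisioned concurrency.
import { Stack, StackProps } from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';

export class CheckoutStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const fn = new lambda.Function(this, 'CheckoutFn', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist/checkout'),
    });

    // Five instances stay warm, so these invocations skip the cold start.
    // Note: provisioned concurrency is billed per hour whether used or not.
    new lambda.Alias(this, 'CheckoutLive', {
      aliasName: 'live',
      version: fn.currentVersion,
      provisionedConcurrentExecutions: 5,
    });
  }
}
```

The trade-off is that you've reintroduced a fixed cost into a pay-per-use model, so reserve it for the handful of functions on the critical path.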
## Architecture comparison at a glance
| Factor | Monolith | Microservices | Serverless |
|--------|----------|---------------|------------|
| Time to MVP | 2-4 weeks | 6-12 weeks | 3-6 weeks |
| Team size | 1-8 devs | 8-50+ devs | 2-10 devs |
| Operational cost (early) | Low ($50-200/mo) | High ($500-2000/mo) | Very low ($5-50/mo) |
| Operational cost (scale) | Medium | Medium-High | Variable (can spike) |
| Debugging complexity | Low | High | Medium-High |
| Deployment risk | Medium (all-or-nothing) | Low (per-service) | Low (per-function) |
| Scaling model | Vertical + horizontal | Independent per service | Automatic |
| Vendor lock-in | Low | Low-Medium | High |
| Best for | Most SaaS products | Large orgs, complex domains | Event processing, variable load |
## Real case: Building a multi-tenant analytics SaaS
Last year, we built a B2B analytics platform for a client in the hospitality industry. The platform ingests event data from hotel properties (bookings, check-ins, amenity usage), processes it, and delivers dashboards and reports to hotel managers.
**Why we chose a modular monolith with serverless extraction:**
The core application — the dashboard, user management, report builder, and API — runs as a Next.js monolith deployed on AWS ECS. This gives us fast iteration on the product features that change weekly based on user feedback.
But the data ingestion pipeline is a completely different beast. Hotel properties send data in bursts (check-in rush at 3 PM, batch uploads at midnight). This is a perfect fit for serverless. So the ingestion layer runs on Lambda + SQS + S3, processing anywhere from 100 to 500,000 events per hour depending on the time of day.
```
                  Architecture

Hotel PMS → API Gateway → SQS → Lambda → S3
                                         ↓
                                     Transform
                                         ↓
                                    ClickHouse
                                         ↑
Next.js Monolith ────────────────────────┘
(Dashboard + API)
 on ECS Fargate
```
The result: the monolith handles all the human-facing features with sub-100ms response times, and the serverless pipeline handles the bursty machine-to-machine ingestion at a fraction of what always-on workers would cost.
Monthly cost at 50 hotel properties: approximately $180, compared to an estimated $600+ if we had run the ingestion pipeline on EC2 instances.
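The Lambda consumers in a pipeline like this read from SQS in batches, and the one pattern worth getting right is partial batch failure: report only the records that failed, so SQS doesn't redeliver the whole batch. A sketch with hand-rolled types (a real project would import them from `@types/aws-lambda`); `transformEvent` is a stand-in for the actual transform step, not the client's code:

```typescript
// Minimal local shapes for an SQS-triggered Lambda.
interface SQSRecord { messageId: string; body: string; }
interface SQSEvent { Records: SQSRecord[]; }
interface SQSBatchResponse { batchItemFailures: { itemIdentifier: string }[]; }

// Placeholder for the real transform (parse, enrich, write to the warehouse).
async function transformEvent(payload: unknown): Promise<void> {
  if (payload === null || typeof payload !== 'object') {
    throw new Error('malformed event payload');
  }
}

// Process each record independently; report only the failures back to SQS
// so successfully processed records are not redelivered alongside them.
export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    try {
      await transformEvent(JSON.parse(record.body));
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
};
```

For this to work, the SQS event source mapping must have `ReportBatchItemFailures` enabled; otherwise the return value is ignored and any thrown error fails the entire batch.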
## A decision framework
When a new SaaS project lands on our desk, we run through this framework:
**Step 1: How many developers will work on this in 12 months?**
- Less than 8 → monolith is almost certainly the right call
- 8 to 20 → modular monolith, prepare boundaries for future extraction
- 20+ → microservices may be justified, but validate that you have distinct domain boundaries
**Step 2: What does the load profile look like?**
- Steady, predictable → monolith or microservices on containers
- Bursty or event-driven → serverless for those specific workloads
- Mixed → hybrid approach (monolith core + serverless pipelines)
**Step 3: What's the time-to-market pressure?**
- "We need to launch in 6 weeks" → monolith, no question
- "We have 6 months and a clear spec" → you have options
- "We're iterating on product-market fit" → monolith, and refactor later
**Step 4: What does the team know?**
- This one matters more than people admit. A team of Rails experts shipping a Rails monolith in 3 weeks will beat a team learning Kubernetes for 3 months to deploy microservices. Optimize for what your team can execute on today.
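The first three steps can be collapsed into a rough first-pass heuristic. This is our own illustrative encoding of the framework above, not a substitute for the actual conversation, and Step 4 (team experience) can override any of it:

```typescript
type Load = 'steady' | 'bursty' | 'mixed';
type Recommendation =
  | 'monolith'
  | 'modular monolith'
  | 'microservices (validate domain boundaries)'
  | 'hybrid: monolith core + serverless pipelines';

// First-pass heuristic for Steps 1-3. Thresholds mirror the text above.
function firstPassArchitecture(
  devsIn12Months: number,
  load: Load,
  weeksToLaunch: number,
): Recommendation {
  // Step 3: extreme time pressure trumps everything else.
  if (weeksToLaunch <= 6) return 'monolith';
  // Step 2: mixed load profiles suggest the hybrid approach.
  if (load === 'mixed') return 'hybrid: monolith core + serverless pipelines';
  // Step 1: team size in 12 months.
  if (devsIn12Months < 8) return 'monolith';
  if (devsIn12Months <= 20) return 'modular monolith';
  return 'microservices (validate domain boundaries)';
}
```

Running a 3-person startup with steady load through this returns `'monolith'`, which is exactly the point: most inputs do.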
## The mistakes we see most often
- **Premature microservices.** A 3-person startup splitting their app into 6 services because "Netflix does it." Netflix has 2,000 engineers. You have 3.
- **Ignoring data gravity.** Your database is the hardest thing to split. If all your services query the same 5 tables, you don't have microservices — you have a distributed monolith with extra network hops.
- **Architecture as identity.** Choosing an architecture out of resume-driven development rather than because it solves a real problem. Serverless is not inherently better than a monolith. It's different.
- **Not planning for the transition.** The best monoliths are designed to be extractable. Use dependency injection, enforce module boundaries, avoid circular dependencies. When the time comes to extract a service, it should be a matter of weeks, not months.
- **Over-indexing on scale.** "But what if we get 10 million users?" If you get 10 million users, you'll have the revenue to re-architect. If you spend 6 months building for scale you never reach, you won't have a company.
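Designing for extraction mostly means depending on interfaces rather than concrete classes: when a module later moves to its own service, the caller keeps the same contract and only the binding changes. A sketch building on the billing example from earlier; the interface shape and HTTP endpoint are illustrative assumptions:

```typescript
// The module's public contract. Callers depend on this, never on a class.
interface BillingApi {
  getCurrentPlan(organizerId: string): Promise<{ eventsRemaining: number }>;
}

// Day 1: in-process implementation inside the monolith.
class InProcessBilling implements BillingApi {
  async getCurrentPlan(_organizerId: string) {
    // Would query the billing module's own tables.
    return { eventsRemaining: 10 };
  }
}

// Day N: billing extracted to its own service. Same contract, new transport.
class HttpBilling implements BillingApi {
  constructor(private baseUrl: string) {}

  async getCurrentPlan(organizerId: string) {
    const res = await fetch(`${this.baseUrl}/plans/${organizerId}`);
    return res.json() as Promise<{ eventsRemaining: number }>;
  }
}
```

Swapping `InProcessBilling` for `HttpBilling` in the dependency-injection wiring is the entire migration from the caller's point of view, which is what makes extraction a matter of weeks rather than months.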
## Our recommendation
For 90% of the SaaS products we build: start with a modular monolith, extract services when the data justifies it. We've seen more startups die from over-engineering than from a monolith that scales "poorly."
The key isn't the perfect architecture — it's the architecture that lets you iterate fast and measure results. Ship the monolith, instrument everything, and let real production data tell you where the bottlenecks are. That's when you extract — not before.
The second most important thing we can tell you: your architecture will change. The best systems we've built are the ones that were designed to evolve. Clean boundaries, good tests, comprehensive monitoring. These matter more than whether you picked Lambda or ECS on day one.
Planning your platform? Let's talk about which architecture makes sense for your case.