Amazon Interview Question

Design a Pastebin Service — Amazon Interview

Difficulty: Medium · 18 min read · Backend System Design

How Amazon Tests This

Amazon system design interviews centre on e-commerce infrastructure, distributed computing, delivery logistics, and high-availability services. They look for candidates who can design systems following their leadership principles, especially "think big" and "bias for action".

Interview focus: E-commerce platforms, delivery logistics, distributed systems, task scheduling, and monitoring.

Key Topics
pastebin · key generation · object storage · cdn · redis · rate limiting · distributed systems

How to Design a Pastebin Service

Pastebin is one of the most commonly asked system design questions at Google, Meta, Amazon, and Dropbox. It sits in that satisfying sweet spot where the problem feels approachable at first glance — store some text, return a URL — but has just enough depth to reveal exactly how clearly you can reason through distributed system trade-offs.

The interesting parts aren't the happy path. They're the questions underneath it: how do you generate millions of unique short URLs without collisions or predictability? Where does the actual text live — in a database or an object store — and what changes at scale? How do you serve a paste that goes viral to millions of readers without your origin server noticing? What happens to expired pastes, and how do you clean them up without hurting write performance?

Interviewers use this question to probe database selection reasoning, caching strategy, and — for more senior candidates — abuse prevention and private paste security. It rewards candidates who can make clean architectural decisions and articulate their reasoning.

This guide covers the full design, in the kind of back-and-forth you'd actually have in the interview room.


Step 1: Clarify the Scope

Interviewer: Design a Pastebin service.

Candidate: Before I start — a few quick questions to make sure I'm building the right thing. Are we supporting anonymous pastes only, or do users have accounts and can manage their paste history? Should pastes have expiration times — hours, days, or never? Do we need private pastes that are only accessible if you have the exact URL? Is there a maximum paste size? And are we thinking about full Pastebin scale — millions of pastes per day — or something more modest?

Interviewer: Support both anonymous and registered users. Yes to expiration — user-configured. Private pastes are in scope. Let's say max paste size of 1 MB. Assume Pastebin's actual scale.

Candidate: Perfect. That gives me the full picture. Let me work through requirements and numbers before jumping into the architecture.


Requirements

Functional

  • Users can create a paste (plain text or code) and receive a unique, shareable URL
  • Pastes can have an optional expiration time (1 hour, 1 day, 1 week, 1 month, never)
  • Pastes can be public (searchable, browseable) or private (accessible only via exact URL)
  • Registered users can view, edit, and delete their own pastes
  • Anonymous users can create pastes but cannot edit or manage them
  • Syntax highlighting for common programming languages (a rendering concern — mention briefly)

Non-Functional

  • High read availability — pastes must be readable even during partial failures
  • Low read latency — paste reads must return in under 100ms for cached content
  • Unique URLs — no two pastes can share the same URL
  • Durability — a created paste must not be lost (especially before it expires)
  • Abuse prevention — rate limit creation to prevent spamming and scraping

Back-of-the-Envelope Estimates

Interviewer: Give me the scale numbers.

Candidate: Let me work from Pastebin's approximate scale and extrapolate.

plaintext
Pastes created per day:     1 million
Pastes read per day:        100 million (100:1 read-to-write ratio)
 
Write QPS (average):        1M / 86,400s ≈ 12 writes/sec
Write QPS (peak, 5×):      ~60 writes/sec
 
Read QPS (average):         100M / 86,400s ≈ 1,160 reads/sec
Read QPS (peak, 5×):       ~5,800 reads/sec
 
Average paste size:         10 KB (mix of short snippets and larger code files)
Max paste size:             1 MB
 
Storage per day:            1M pastes × 10 KB = 10 GB/day
10-year storage:            10 GB × 365 × 10 = ~36.5 TB
 
With replication (3×):      ~110 TB over 10 years

Three things jump out. First, this is emphatically read-heavy — 100:1 means caching should be the first instinct for every read optimisation. Second, 12 writes per second is modest — the write path doesn't need to be exotic. Third, 36 TB of content over 10 years is too large for a relational database to comfortably hold in rows, which is going to drive a specific storage architecture decision.
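These numbers are worth being able to reproduce on demand in the room. As scratch paper, the arithmetic looks like this:

```python
# Scratch-paper check of the back-of-the-envelope estimates above.
writes_per_day = 1_000_000
reads_per_day = 100_000_000
seconds_per_day = 86_400

write_qps = writes_per_day / seconds_per_day    # ~12 writes/sec average
read_qps = reads_per_day / seconds_per_day      # ~1,160 reads/sec average
peak_read_qps = read_qps * 5                    # ~5,800 reads/sec at a 5x peak

avg_paste_kb = 10
storage_per_day_gb = writes_per_day * avg_paste_kb / 1_000_000   # 10 GB/day
ten_year_storage_tb = storage_per_day_gb * 365 * 10 / 1_000      # ~36.5 TB
replicated_storage_tb = ten_year_storage_tb * 3                  # ~110 TB
```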


High-Level Architecture

plaintext
         ┌──────────────────────────────────────────┐
         │          CDN (CloudFront/Fastly)          │
         │  Caches paste content at edge nodes       │
         │  Serves the majority of read traffic      │
         └──────────────────┬───────────────────────┘
                            │ CDN miss → origin
         ┌──────────────────▼───────────────────────┐
         │              API Gateway                  │
         │  (auth, rate limiting, routing)           │
         └──────────────────┬───────────────────────┘

         ┌──────────────────▼───────────────────────┐
         │           Application Servers             │
         │  (stateless — horizontally scalable)      │
         └──────┬────────────────────────┬──────────┘
                │                        │
    ┌───────────▼──────────┐  ┌──────────▼──────────────┐
    │  Metadata DB         │  │  Content Store           │
    │  (PostgreSQL)        │  │  (Amazon S3 / GCS)       │
    │  paste_id, user_id,  │  │  Actual text content     │
    │  created_at,         │  │  Keyed by paste_id       │
    │  expires_at,         │  └──────────────────────────┘
    │  visibility,         │
    │  size_bytes          │  ┌──────────────────────────┐
    └──────────────────────┘  │  Redis Cache             │
                              │  Hot paste metadata      │
    ┌──────────────────────┐  │  "Exists" lookups        │
    │  Key Generation      │  └──────────────────────────┘
    │  Service (KGS)       │
    │  Pre-generated pool  │  ┌──────────────────────────┐
    │  of unique IDs       │  │  Cleanup Worker          │
    └──────────────────────┘  │  Deletes expired pastes  │
                              └──────────────────────────┘

The Core Problem: Generating Unique Paste IDs

Every paste gets a short, unique URL — something like pastebin.com/aB3kR9xZ. This is the first topic interviewers dig into, and it has more depth than it first appears.

What the ID needs to be:

  • Short enough to be shareable (6–8 characters)
  • Unique across billions of pastes
  • Not predictable — you shouldn't be able to guess another user's private paste URL by incrementing a counter

Let's work through the three approaches and their trade-offs.

Interviewer: How do you generate unique paste IDs?

Candidate: There are three main approaches. Let me walk through each and explain why I'd settle on one over the others.

Option 1: Hash-Based Generation (MD5 / SHA-256 → truncate)

Take the paste content, run it through MD5 or SHA-256, encode the result in Base62, and take the first 8 characters as the ID.

The problem: two different pastes that happen to share the same first 8 characters of their encoded hash collide. More subtly, if you hash the content, two identical pastes would get the same URL — which could be a feature (deduplication) or a bug (user A's private paste has the same ID as user B's public one). Adding a salt (user ID + timestamp) fixes the deduplication issue but doesn't eliminate the truncation collision problem.

Verdict: don't use this for Pastebin. The collision probability is non-trivial at scale, and handling collisions at write time adds complexity without a compelling benefit.

Option 2: Random Generation with Collision Check

Generate a random 8-character Base62 string. Attempt to insert it into the database. If the insert fails due to a primary key conflict, generate a new one and retry.

plaintext
while True:
    paste_id = random_base62(8)        # e.g. "aB3kR9xZ"
    try:
        INSERT INTO pastes (paste_id, ...) VALUES (paste_id, ...)
        break                          # insert succeeded: the ID is unique
    except DuplicateKeyError:
        continue                       # collision: generate a new ID and retry

The math: Base62 with 8 characters gives 62⁸ ≈ 218 trillion possible IDs. With 1 billion pastes in the system, the probability of a collision on any given insert is about 1 in 218,000. Retries are vanishingly rare. The approach is simple, stateless, and secure — the IDs are unpredictable.

The verdict: this is clean and correct for this scale. The application server detects collisions via the database's primary key constraint — no separate lookup needed.
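A minimal runnable sketch of this approach, assuming an in-memory set as a stand-in for the database's primary-key constraint (in production, the collision check is the INSERT itself):

```python
import secrets
import string

BASE62 = string.ascii_uppercase + string.ascii_lowercase + string.digits  # 62 chars

def random_base62(length: int = 8) -> str:
    """Cryptographically random Base62 string: unpredictable by design."""
    return "".join(secrets.choice(BASE62) for _ in range(length))

# In-memory stand-in for the pastes table's primary-key constraint.
existing_ids = set()

def create_paste_id() -> str:
    """Generate an ID, retrying on the (vanishingly rare) collision."""
    while True:
        paste_id = random_base62()
        if paste_id not in existing_ids:   # in production: INSERT and catch
            existing_ids.add(paste_id)     # the duplicate-key error instead
            return paste_id
```

Using `secrets` rather than `random` matters here: the unpredictability requirement is a security property, so the generator should be cryptographically strong.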

Option 3: Key Generation Service (KGS)

A dedicated KGS pre-generates millions of random Base62 IDs and stores them in a database with two tables: unused_keys and used_keys. When an application server needs a new ID, it requests one from the KGS, which atomically moves it from unused to used and returns it. No collision checking at write time — the KGS guarantees uniqueness.

Advantages:

  • Paste creation never fails due to a collision
  • ID generation is decoupled from the paste write path
  • Application servers can cache a small batch of keys in memory, reducing KGS round-trips

Disadvantages:

  • The KGS is a new service to build, deploy, and monitor
  • The KGS is a potential single point of failure (mitigated by replicas, but complexity increases)
  • If a KGS instance crashes with keys loaded into memory, those keys are wasted — acceptable at this scale given the enormous key space

The verdict: KGS is worth introducing if write throughput requires it (e.g., millions of pastes per second, where collision retries become frequent). At Pastebin's actual scale of ~12 writes/second, random generation with a collision check is simpler and equally correct.

Candidate summary: I'd go with random Base62 generation with a collision check at this scale. If write throughput grew significantly — say, to thousands per second — I'd introduce a KGS to eliminate retry overhead. The KGS approach is the right answer to mention as the natural evolution.

The key generation discussion is one of those sections where interviewers want to see you walk through options systematically rather than just naming one. Evaluating the collision math, naming the predictability concern, and presenting the KGS as a natural scaling evolution — in that sequence — is what "senior-level thinking" sounds like here. If you want to rehearse exactly that kind of structured reasoning under time pressure, Mockingly.ai is worth checking out.

A Note on Base62 vs Base64 vs Base58

Interviewer: Why Base62 specifically?

Candidate: Base64 adds + and / — characters that aren't URL-safe and need to be percent-encoded in a URL, which looks messy. Base62 uses only [A-Z][a-z][0-9] — all URL-safe, all readable.

Base58 is Base62 minus visually ambiguous characters: uppercase O vs zero 0, uppercase I vs lowercase l. Bitcoin addresses use Base58 for exactly this reason — typos matter. For a Pastebin URL, the system generates the ID and the user copies it — they're not usually typing it character by character — so Base62 is fine.
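For completeness, encoding an integer into Base62 is a repeated divmod by 62. The alphabet ordering below (digits, then uppercase, then lowercase) is an arbitrary convention, not a standard:

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # 62 symbols

def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))    # most significant symbol first
```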


Storage: The Key Architectural Split

This is the most important design decision in the system, and it follows directly from the scale estimates.

Interviewer: Where do you store the paste content?

Candidate: Split storage: metadata in PostgreSQL, content in object storage (S3 or GCS).

Here's the reasoning. A paste is two fundamentally different things: its metadata — paste ID, owner, creation time, expiration, visibility, size — and its actual content, which can be anywhere from 50 bytes to 1 MB of text.

Storing 36 TB of blob content in PostgreSQL rows works at small scale but creates problems as you grow: large rows slow down full-table scans, backups become unwieldy, and column-oriented optimisations don't help when each row is a variable-length blob. PostgreSQL wasn't designed to be a blob store.

Amazon S3 and GCS were designed for exactly this. Object storage scales to petabytes without a schema, costs significantly less per GB than database storage, has built-in replication and durability guarantees (eleven 9s durability for S3), and pairs naturally with a CDN for low-latency global delivery.

The split looks like this:

plaintext
PostgreSQL row:
  paste_id   TEXT PRIMARY KEY,        -- "aB3kR9xZ"
  user_id    UUID,                    -- null for anonymous
  created_at TIMESTAMPTZ NOT NULL,
  expires_at TIMESTAMPTZ,             -- null = never expires
  visibility TEXT NOT NULL,           -- 'public' or 'private'
  size_bytes INT NOT NULL,
  language   TEXT                     -- for syntax highlighting hints
 
S3 object:
  Key:   pastes/aB3kR9xZ
  Value: [raw text content, up to 1 MB]

When a user reads a paste, the application fetches metadata from PostgreSQL (or Redis cache) to check it exists and hasn't expired, then returns the content from S3 via a signed URL or directly via CDN. Write path: generate ID, write content to S3, write metadata row to PostgreSQL. In that order — content first, then metadata. If we wrote metadata first and the S3 write failed, we'd have a dangling metadata row pointing to non-existent content.
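The write ordering can be sketched with dicts standing in for S3 and PostgreSQL (the function and field names are illustrative, not a real client API):

```python
import datetime

# In-memory stand-ins: `content_store` for S3, `metadata_db` for PostgreSQL.
content_store = {}
metadata_db = {}

def create_paste(paste_id, content, visibility="public", ttl=None):
    now = datetime.datetime.now(datetime.timezone.utc)
    # 1. Content first: if this write fails, no dangling metadata row exists.
    content_store[f"pastes/{paste_id}"] = content
    # 2. Metadata second: the row only appears once the content is durable.
    metadata_db[paste_id] = {
        "created_at": now,
        "expires_at": now + ttl if ttl else None,
        "visibility": visibility,
        "size_bytes": len(content.encode("utf-8")),
    }
```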

Interviewer: What about small pastes — a 50-byte snippet? Is S3 overkill?

Candidate: For very small pastes, you could inline the content in the PostgreSQL row alongside the metadata. Some systems use a hybrid: content under, say, 10 KB is stored in the database; content above that threshold goes to object storage. This reduces S3 requests for the common case. The trade-off is a more complex application layer that needs to know which path to take. I'd start with S3 for everything for simplicity — the read path uses CDN caching anyway, so the S3 latency is paid only on cache misses.


Caching Strategy

With a 100:1 read-to-write ratio, cache performance is what separates a good Pastebin design from an excellent one.

Interviewer: How do you serve 5,800 peak reads per second efficiently?

Candidate: Two layers, with CDN doing the heavy lifting.

Layer 1: CDN (CloudFront, Fastly, or Cloudflare)

Paste content is immutable once written — a paste never changes. This is ideal for CDN caching. Set Cache-Control: public, max-age=3600 for public pastes and the CDN will cache the content at edge nodes globally. The first user to read a paste triggers a CDN miss and an S3 fetch. Every subsequent reader in that region hits the CDN edge — the origin never sees them.

For a paste that goes viral — a popular code snippet shared on Hacker News — the CDN absorbs essentially all the traffic. The origin might see a handful of cache-warming requests; the CDN handles millions.

Private pastes are not CDN-cached: the application sets Cache-Control: private, no-store on them. The CDN passes them through to origin every time. Private pastes are accessed infrequently by design (only the people with the exact URL), so the absence of CDN caching is acceptable.

Layer 2: Redis (metadata cache)

Every read needs to validate: does this paste exist? Has it expired? Is it public or private? These are small metadata lookups against PostgreSQL. At 5,800 reads/second, that's 5,800 PostgreSQL queries per second — significant but manageable. Adding Redis caches the metadata:

plaintext
Key:   metadata:aB3kR9xZ
Value: { user_id, expires_at, visibility, size_bytes }
TTL:   1 hour (or until the paste expires, whichever is sooner)

On cache hit: validate and serve from CDN. On cache miss: query PostgreSQL, populate Redis, serve from CDN. On paste deletion or expiration: delete the Redis key immediately.

This is an important nuance: when a paste expires or is deleted by the user, the Redis key must be invalidated proactively. If it isn't, a request for a deleted paste could get a cache hit from Redis saying "exists," then get a 404 from S3. Delete the Redis key first, then the S3 object, then the PostgreSQL row.
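The cache-aside read path and that invalidation ordering can be sketched with dicts standing in for Redis and PostgreSQL (TTLs elided, names illustrative):

```python
# In-memory stand-ins: `cache` for Redis, `db` for PostgreSQL metadata rows.
cache = {}
db = {}

def get_metadata(paste_id):
    key = f"metadata:{paste_id}"
    if key in cache:              # cache hit: no database query
        return cache[key]
    row = db.get(paste_id)        # cache miss: query PostgreSQL
    if row is not None:
        cache[key] = row          # populate Redis (TTL elided in this sketch)
    return row

def delete_paste(paste_id):
    # Invalidate proactively, in this order: Redis key, S3 object, DB row.
    cache.pop(f"metadata:{paste_id}", None)
    # (the S3 object delete would go here)
    db.pop(paste_id, None)
```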

Interviewer: How does the CDN handle paste expiration?

Candidate: This is a subtle problem. The CDN has cached a paste. The paste expires. The CDN doesn't know — it just serves from its cache until the TTL expires.

Three mitigations. First, set CDN TTL to the paste's remaining lifetime, not a fixed value. If a paste expires in 30 minutes, max-age=1800. The CDN evicts it naturally. Second, on paste deletion or expiration, issue a CDN cache purge API call for that URL — most CDN providers support programmatic purge. Third, the application-level check: even for CDN-cached responses, the metadata Redis check runs first in the request path. If the Redis TTL has expired and PostgreSQL shows the paste as expired, return a 404 regardless of what the CDN has cached.

In practice, the CDN TTL alignment is the cleanest solution. Purge APIs are a backup for immediate-deletion cases.
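The TTL-alignment rule is small enough to sketch directly. The function name and the one-hour default are assumptions for illustration, not a real CDN API:

```python
import datetime

DEFAULT_MAX_AGE = 3600  # 1 hour for public pastes that never expire

def cache_control(visibility, expires_at, now=None):
    """Build a Cache-Control header whose TTL never outlives the paste."""
    if visibility == "private":
        return "private, no-store"          # private pastes are never CDN-cached
    now = now or datetime.datetime.now(datetime.timezone.utc)
    if expires_at is None:
        return f"public, max-age={DEFAULT_MAX_AGE}"
    remaining = int((expires_at - now).total_seconds())
    return f"public, max-age={max(0, min(DEFAULT_MAX_AGE, remaining))}"
```

A paste expiring in 30 minutes gets max-age=1800, so the CDN evicts it no later than the paste itself expires.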

The CDN expiration question — "what happens to cached content when a paste expires?" — is a follow-up that catches many candidates off guard because it sits at the intersection of CDN behaviour and product correctness. The three-part answer (TTL alignment, purge API, application-level check) covers all the bases. Getting comfortable explaining all three in order is the kind of detail Mockingly.ai helps you pressure-test before the real interview.


Paste Expiration and Cleanup

Interviewer: How do you handle paste expiration at scale?

Candidate: Two mechanisms working together.

At read time: before serving a paste, check expires_at. If it's in the past, return a 404. Don't wait for a cleanup job to remove it — the check is cheap and ensures users never see expired content.

Background cleanup worker: a periodic job (runs every hour, or every few hours) scans PostgreSQL for expired pastes and deletes them:

sql
-- Efficient index-driven cleanup, batched
-- (PostgreSQL has no DELETE ... LIMIT, so bound the batch with a subquery)
DELETE FROM pastes
WHERE paste_id IN (
    SELECT paste_id FROM pastes
    WHERE expires_at IS NOT NULL   -- don't touch "never expire" pastes
      AND expires_at < NOW()
    LIMIT 1000                     -- batch to avoid long-running transactions
);

The LIMIT 1000 is critical. Without it, a single cleanup run could delete millions of rows in one transaction, holding locks and causing write latency spikes. Batching keeps the cleanup gentle.

For each deleted paste row, the worker also deletes the corresponding S3 object. S3 object deletion is also batched — S3's delete objects API accepts up to 1,000 keys per request.
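A single cleanup pass, sketched against in-memory stand-ins for the metadata table and object store (batch size mirrors the LIMIT 1000 discussed above; names are illustrative):

```python
import datetime

BATCH_SIZE = 1000

def cleanup_pass(metadata_db, content_store, now=None):
    """Delete one bounded batch of expired pastes; returns the count deleted."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    expired = [pid for pid, row in metadata_db.items()
               if row["expires_at"] is not None and row["expires_at"] < now]
    batch = expired[:BATCH_SIZE]           # bounded, like the SQL LIMIT
    for pid in batch:
        content_store.pop(f"pastes/{pid}", None)  # S3 delete (batched in prod)
        del metadata_db[pid]
    return len(batch)
```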

Interviewer: Why not use S3's built-in object expiration (lifecycle policies)?

Candidate: You could, and it's a valid approach. S3 lifecycle rules can automatically delete objects after a set time. The issue is that the rule is set once per prefix or bucket, not per-object dynamically. Every paste has a different expiration time chosen by the user. You'd need to put each paste in a differently-configured "folder" or bucket based on its expiration window, which gets complicated.

The background worker approach is more flexible — it reads the per-paste expires_at from PostgreSQL and deletes accordingly. S3 lifecycle policies work well as a fallback for any objects the worker might have missed — set a 30-day maximum TTL on the S3 bucket as a safety net.


Private Pastes and Access Control

Interviewer: How do you keep private pastes private?

Candidate: The security model for private pastes is "security through obscurity by design" — the URL is the access credential. There's no access control list, no login check. If you have the URL, you can read the paste.

This works because the URL contains a randomly generated 8-character Base62 ID. With 218 trillion possible IDs and only, say, 100 million private pastes in the system, the probability of guessing a valid private paste URL by brute force is about 0.00005% per attempt. To make this attack practical, an attacker would need to make billions of requests — which rate limiting prevents.

That said, there are two additional guards worth naming.

HTTPS always. A private paste URL in a plain HTTP response is visible to network eavesdroppers. HTTPS is non-negotiable.

No-index headers on private pastes. The application sets X-Robots-Tag: noindex on private paste responses. This prevents search engines from indexing the URL if someone accidentally shares it. It doesn't prevent access, but it stops the URL from appearing in Google search results.

For pastes with higher sensitivity (password-protected pastes, a premium feature): encrypt the content before storing it in S3. The decryption key is derived from a user-provided password and is never stored server-side. The server stores ciphertext; only the client with the password can decrypt. This is true end-to-end protection, not just obscurity.


Rate Limiting and Abuse Prevention

Pastebin is a public-facing text storage service. Without protection, it becomes a spam host, a malware distribution vector, or a DDoS amplification source.

Interviewer: How do you prevent abuse?

Candidate: Rate limiting at the API Gateway level, with per-IP and per-user limits.

For anonymous users (IP-based limiting):

plaintext
Create paste:  10 per hour per IP
Read paste:    1,000 per hour per IP

Anonymous creation is deliberately strict. A legitimate user rarely needs to create more than 10 pastes per hour. A spammer trying to bulk-create pastes hits the limit immediately.

For registered users:

plaintext
Create paste:  100 per hour
Read paste:    10,000 per hour

Registered users have more headroom. Accounts are rate-limited individually, not by IP, so rotating IPs doesn't help attackers.
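A fixed-window counter is the simplest implementation of these limits. The sketch below keeps counters in a dict; in production this is typically a Redis INCR with an EXPIRE per window (the class and key names are illustrative):

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 3600  # hourly windows, matching the limits above

class FixedWindowLimiter:
    """Fixed-window counter; in production, Redis INCR + EXPIRE per window."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)   # (client_id, window) -> request count

    def allow(self, client_id, now=None):
        now = time.time() if now is None else now
        window = int(now // WINDOW_SECONDS)   # counter resets each hour
        key = (client_id, window)
        if self.counts[key] >= self.limit:
            return False                       # over limit: reject (HTTP 429)
        self.counts[key] += 1
        return True
```

Fixed windows allow a burst at a window boundary (up to 2x the limit across two adjacent windows); a sliding-window or token-bucket variant smooths that out at the cost of slightly more state.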

Storage quota: each registered user gets a storage quota — say, 100 MB. Anonymous pastes are ephemeral by nature (they expire, or they're one-off). Registered users accumulate pastes indefinitely, so a quota prevents storage abuse.

Content moderation: for a production Pastebin, you'd scan paste content for malware signatures, copyright violations, and illegal content using automated tools before storing. At creation time, run the content through a hash against known-bad content lists (like CSAM hash databases). At minimum, implement keyword scanning for obvious abuse patterns. This is operationally complex but required for any public-facing text storage service.

Abuse prevention is often the section where interviewers probe follow-ups — "what if someone rotates IPs?", "how do you handle CSAM?", "can a registered user bypass the quota with multiple accounts?". These are exactly the kinds of edge-case questions that Mockingly.ai surfaces in its simulations, which makes it a good way to find the gaps in your prep before the actual interview.


API Design

Interviewer: What does the API look like?

Candidate: REST-based with clean resource modelling.

plaintext
POST   /pastes
  Body: { content, language, visibility, expires_in }
  Response: { paste_id, url, expires_at }
 
GET    /pastes/{paste_id}
  Response: { content, language, created_at, expires_at }
  Headers: Cache-Control: public/private based on visibility
 
DELETE /pastes/{paste_id}
  Auth required (owner only)
  Response: 204 No Content
 
GET    /users/{user_id}/pastes
  Auth required
  Response: paginated list of paste metadata (not content)
  Pagination: cursor-based on created_at
 
GET    /pastes/recent
  Returns recent public pastes (the "archive" feature)
  No auth required
  Response: paginated list of public paste metadata

One thing worth noting: GET /pastes/{paste_id} returns the content directly in the JSON response for small pastes. For large pastes (say, over 100 KB), the API instead returns a pre-signed S3 URL and the client fetches the content directly from S3/CDN. This keeps the API server from becoming a bandwidth bottleneck for large content.


Scaling Discussion

Interviewer: How does this system scale to 10× current traffic?

Candidate: The read path scales almost automatically. CDN capacity is elastic — CloudFront or Fastly don't have capacity limits you'd hit at 10× Pastebin traffic. Redis adds nodes horizontally. The application servers are stateless and scale behind a load balancer.

The write path at 10× — 120 writes/second — is still very manageable for PostgreSQL. A single well-tuned PostgreSQL primary with read replicas handles this without issue.

Where scaling gets interesting: if we hit 100× (1,200 writes/second), we'd think about sharding PostgreSQL by paste_id (using a hash of the first character as a simple shard key). S3 handles unlimited write throughput natively — no concern there.

The cleanup worker becomes more critical at scale. With 1 billion pastes and millions expiring daily, the cleanup job needs to be distributed — multiple workers each claiming a time bucket of expired pastes (e.g., Worker 1 handles pastes expiring between noon and 1 PM, Worker 2 handles 1–2 PM, etc.).


Common Interview Follow-ups

"How would you support syntax highlighting?"

Syntax highlighting is a client-side or CDN-edge concern, not a storage concern. The API returns a language field alongside the content. The client-side JavaScript library (like Highlight.js or Prism.js) renders the content with appropriate coloring in the browser. No server-side processing needed. Storing the language tag alongside the paste metadata is the only backend work — one extra column in the PostgreSQL table.

"What if the same content is pasted many times — say, a popular code snippet? Should you deduplicate?"

Content-addressable storage (storing content once and pointing multiple paste IDs to the same S3 object) could save storage. But it creates significant complexity: if one paste expires, you can't delete the S3 object because another paste points to it. Reference counting adds concurrency bugs. The storage savings don't justify the complexity at Pastebin's actual data volumes. Store each paste independently and save the engineering complexity.

"How would you implement the 'recent public pastes' feed without hitting the database on every read?"

Maintain a sorted set in Redis: ZADD recent_public_pastes {timestamp} {paste_id}. When a public paste is created, add it to the sorted set. The recent pastes API reads from this set — no database query. The set is bounded (keep the last 1,000 public pastes). Old entries fall off naturally. This is an append-only write and a range read — exactly what Redis sorted sets are optimised for.
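The same bounded-feed behaviour can be sketched in plain Python, with a sorted list standing in for the Redis sorted set (ZADD / ZREMRANGEBYRANK / ZREVRANGE in production):

```python
import bisect

MAX_RECENT = 1000

# (timestamp, paste_id) pairs kept sorted by timestamp; stands in for the
# Redis sorted set `recent_public_pastes`.
recent = []

def add_public_paste(timestamp, paste_id):
    bisect.insort(recent, (timestamp, paste_id))   # ZADD
    if len(recent) > MAX_RECENT:                   # bound the set so old
        del recent[0]                              # entries fall off (ZREMRANGEBYRANK)

def recent_pastes(n=20):
    return [pid for _, pid in reversed(recent[-n:])]   # ZREVRANGE: newest first
```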

"What if a paste URL is shared on social media and suddenly gets 10 million reads in an hour?"

This is the viral paste problem, and the CDN handles it entirely. Once the first request warms the CDN cache, all subsequent requests are served from the nearest CDN edge node. Ten million requests per hour to a CDN is routine — Cloudflare serves trillions of requests per month. The only limit is CDN capacity, which is elastic. The origin server might see a few hundred cache-warming requests across global CDN nodes; the rest never reach the application servers.

"How do you handle pastes that are slightly over 1 MB? Do you hard reject them?"

In practice, you'd likely apply client-side validation (the UI prevents submission of pastes over 1 MB) and server-side validation (return HTTP 413 Request Entity Too Large if the payload exceeds the limit). The 1 MB limit is a product decision — large enough for any reasonable code snippet, small enough to prevent abuse. For use cases requiring large content (full log files, database dumps), you'd point users toward a different service (like S3 direct upload or a document storage product).


Quick Interview Checklist

  • ✅ Clarified scope — anonymous vs registered users, expiration, private pastes, paste size limit
  • ✅ Back-of-the-envelope — 100:1 read-to-write ratio, 36 TB over 10 years, 12 writes/sec modest
  • ✅ Key generation — three approaches evaluated; random Base62 + collision check recommended at this scale; KGS as natural next step
  • ✅ Base62 over Base64 — URL-safe; Base58 trade-off named
  • ✅ Split storage — metadata in PostgreSQL, content in S3; write content first, then metadata
  • ✅ CDN for content — immutable pastes are perfect for edge caching; TTL aligned to expiration
  • ✅ Redis for metadata — fast "exists and not expired" lookups; invalidate on delete/expiry
  • ✅ Private pastes bypass CDN — Cache-Control: private, no-store
  • ✅ Expiration — read-time check + background cleanup worker with LIMIT batching
  • ✅ S3 lifecycle policy as fallback safety net
  • ✅ Private paste security — URL as credential, HTTPS required, no-index headers, optional password encryption
  • ✅ Rate limiting — strict anonymous limits, more headroom for registered users, storage quota
  • ✅ Content moderation — hash-based abuse detection mentioned
  • ✅ API design — REST, cursor pagination, signed URL for large content
  • ✅ Scaling — reads scale via CDN elasticity; writes scale via PostgreSQL sharding at extreme scale

Conclusion

Pastebin is an excellent interview question because its simplicity is deceptive. The core operations — write text, return URL, serve text — are trivial. The interesting work is in the decisions underneath: the key generation strategy that avoids both collisions and predictability, the storage split that makes the system economically viable at petabyte scale, the CDN design that handles viral pastes without touching the origin, and the expiration mechanism that keeps the storage from growing unboundedly.

Getting these decisions right — and being able to explain the trade-offs clearly — is what a strong answer looks like.

The design pillars:

  1. Random Base62 generation with database collision detection — simple, secure, correct at this scale; KGS as the natural evolution for higher write throughput
  2. PostgreSQL for metadata, S3 for content — each tool doing what it's designed for; write content before metadata to avoid dangling rows
  3. CDN as the primary read path — immutable content is perfect for edge caching; TTL alignment eliminates stale-content problems
  4. Redis for metadata validation — fast "does this paste exist and is it still valid?" without hitting PostgreSQL on every request
  5. Read-time expiration check + batched background cleanup — users never see expired content; cleanup is gentle enough not to impact write performance
  6. Rate limiting at the gateway — strict anonymous limits, account-level limits for registered users, content quota


Frequently Asked Questions

How do you generate unique paste IDs — and what is the difference between random generation and a KGS?

A paste ID must be short, globally unique, and unpredictable. There are three approaches, each with different trade-offs.

  • Hash-based: hash the paste content (MD5/SHA-256), truncate to 8 chars, encode in Base62. Pros: no database round trip. Cons: collision probability non-trivial at scale; identical content gets the same ID.
  • Random Base62 + collision check: generate a random 8-char Base62 string; retry on a database primary-key conflict. Pros: simple, stateless, unpredictable. Cons: rare retries on collision.
  • Key Generation Service (KGS): a dedicated service pre-generates unique IDs and atomically moves them from unused to used. Pros: zero collision retries; batch caching possible. Cons: a new service to build and operate; SPOF risk.

Which to choose:

  1. At Pastebin's actual scale (~12 writes/second), random Base62 with a collision check is correct and simple. With 62⁸ ≈ 218 trillion possible IDs and 1 billion pastes in the system, collision probability per insert is ~1 in 218,000 — retries are vanishingly rare
  2. At higher write throughput (thousands per second), KGS eliminates retry overhead and is worth the operational complexity
  3. Hash-based is the wrong choice — identical content would share an ID (a security problem for private pastes), and truncation collisions are undetectable at write time

Why store paste content in S3 instead of PostgreSQL?

PostgreSQL is optimised for structured queries with indexes. S3 is optimised for storing and retrieving large blobs cheaply at any scale. Mixing blob content into a relational database creates problems that grow with volume.

Why PostgreSQL fails for paste content at scale:

  1. Row bloat — storing up to 1 MB of text per row creates wide rows that slow full-table scans and vacuums
  2. Index thrash — each new paste insert updates B-tree indexes with variable-length content
  3. Backup complexity — PostgreSQL backups grow linearly with content volume; 36 TB of blob content makes backup windows impractical
  4. Cost per GB — PostgreSQL storage (SSD-backed) costs significantly more than S3 object storage

Why S3 is the right choice:

  1. S3 provides eleven 9s of durability with built-in replication across availability zones
  2. Cost is ~$0.023/GB/month — orders of magnitude cheaper than database storage
  3. Pairs naturally with CDN — serve content directly from the edge without touching the origin
  4. Scales to petabytes without schema changes or manual sharding

The split: PostgreSQL holds the metadata row (paste_id, user_id, expires_at, visibility — ~200 bytes). S3 holds the content blob (up to 1 MB). Write order: content to S3 first, then metadata to PostgreSQL. If the S3 write fails, no dangling metadata row exists.
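The write ordering can be sketched as follows. The dicts are hypothetical in-memory stand-ins for S3 and PostgreSQL; the point is that a failed blob write aborts before any metadata is written, so no dangling row can exist.

```python
class StorageError(Exception):
    pass

# Stand-ins for S3 (content blobs) and PostgreSQL (metadata rows)
blob_store = {}
metadata_table = {}

def put_blob(paste_id: str, content: bytes, fail: bool = False) -> None:
    if fail:  # simulate an S3 outage
        raise StorageError("S3 write failed")
    blob_store[paste_id] = content

def create_paste(paste_id: str, content: bytes, expires_at: str,
                 fail_s3: bool = False) -> bool:
    # 1. Write the content blob to the object store first.
    try:
        put_blob(paste_id, content, fail=fail_s3)
    except StorageError:
        return False  # nothing was written to the metadata table, so nothing dangles
    # 2. Only after the blob write succeeds, write the metadata row.
    metadata_table[paste_id] = {"expires_at": expires_at, "size": len(content)}
    return True
```

The reverse ordering (metadata first) would leave a row pointing at a blob that never arrived if the S3 write failed; an orphaned blob with no metadata row, by contrast, is harmless and can be reaped by a lifecycle policy.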


What is Base62 and why is it used for paste IDs over Base64 or Base58?

Base62 uses characters [A-Z][a-z][0-9] — 62 characters total. It is the standard choice for URL-safe short IDs because all 62 characters are safe in a URL without percent-encoding.

Why not Base64:

  1. Base64 adds + and / (standard) or - and _ (URL-safe variant)
  2. Standard Base64 in a URL becomes pastebin.com/aB3+R9/Z — the + and / need percent-encoding → pastebin.com/aB3%2BR9%2FZ
  3. URL-safe Base64 avoids this, but support for it is inconsistent across clients and libraries

Why not Base58:

  1. Base58 removes visually ambiguous characters: 0 (zero) vs O (letter O), 1 (one) vs I (letter I), l (lowercase L)
  2. Bitcoin addresses use Base58 to prevent human transcription errors
  3. For Pastebin, users copy-paste the URL — they don't type it character by character — so visual ambiguity doesn't matter
  4. Base62 gives a slightly larger ID space for the same character count

Practical capacity:

  • 8-character Base62: 62⁸ ≈ 218 trillion unique IDs
  • 8-character Base64: 64⁸ ≈ 281 trillion unique IDs
  • 6-character Base62: 62⁶ ≈ 56 billion unique IDs (still sufficient)
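A minimal Base62 encoder makes the capacity figures concrete — this is how a KGS might turn sequential integers into short IDs (for random IDs, sampling characters directly is simpler):

```python
# Character order matches the article: [A-Z][a-z][0-9]
BASE62 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

def base62_encode(n: int) -> str:
    """Encode a non-negative integer in Base62."""
    if n == 0:
        return BASE62[0]
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(BASE62[r])
    return "".join(reversed(digits))

# ID-space capacities quoted above:
capacity_8 = 62 ** 8  # 218,340,105,584,896  (~218 trillion)
capacity_6 = 62 ** 6  # 56,800,235,584       (~56.8 billion)
```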

How does CDN caching work for a Pastebin service?

Paste content is immutable once written — a paste never changes after creation. Immutable content is the ideal CDN use case: cache once, serve forever until the TTL expires.

How CDN caching works for public pastes:

  1. First request to cdn.pastebin.com/aB3kR9xZ → CDN miss → fetches from S3 → caches at edge node
  2. All subsequent requests in that region → CDN hit → served from edge, origin never contacted
  3. Cache-Control: public, max-age=<paste_remaining_lifetime> — TTL aligned to expiration

Why aligning TTL to expiration matters:

If a paste expires in 30 minutes and you set max-age=86400, the CDN will serve expired content for up to 24 hours after it has logically expired. Setting max-age=1800 (30 minutes) ensures the CDN evicts the content at the same time PostgreSQL considers it expired.
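The TTL-alignment logic above can be sketched as a small helper. The 24-hour cap for never-expiring pastes is an assumed policy choice, not something the CDN requires:

```python
from datetime import datetime, timedelta, timezone

DAY = 86400  # assumed cap for pastes with no expiry

def cache_control_for(expires_at, now):
    """Build a Cache-Control value whose max-age never outlives the paste."""
    if expires_at is None:
        return f"public, max-age={DAY}"  # no expiry: still cap at 24h
    remaining = int((expires_at - now).total_seconds())
    if remaining <= 0:
        return "no-store"  # already expired; never cache
    return f"public, max-age={min(remaining, DAY)}"
```

A paste expiring in 30 minutes gets max-age=1800, so the edge copy is evicted at the same moment PostgreSQL considers the paste expired.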

Private pastes are never CDN-cached:

```plaintext
Cache-Control: private, no-store
```

Private pastes are served directly from origin on every request. This is acceptable — private pastes are accessed infrequently by design (only people who have the exact URL).

Handling immediate deletion:

If a user deletes a paste before its CDN TTL expires, issue a CDN cache purge API call for that URL. Most CDN providers (Cloudflare, Fastly, CloudFront) support programmatic purge. The application-level metadata check (Redis + PostgreSQL) also acts as a backstop — a request for a deleted paste returns 404 even if the CDN has it cached.


How do you expire and clean up pastes efficiently?

Two mechanisms work together: a read-time expiration check that prevents users from ever seeing expired content, and a background cleanup worker that physically removes expired data in batches.

Read-time check (immediate):

Before serving any paste, the application checks expires_at:

```sql
SELECT expires_at, visibility FROM pastes WHERE paste_id = $id;
-- If expires_at < NOW(): return 404
```

This check runs on the Redis metadata cache — sub-millisecond. Users never see expired content regardless of when the cleanup worker last ran.
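The read-time check reduces to a few lines of application logic. The dict here is a hypothetical stand-in for the Redis metadata cache; on a cache miss the real service would fall through to PostgreSQL:

```python
from datetime import datetime

# Stand-in for the Redis metadata cache: paste_id -> (expires_at, visibility)
metadata_cache = {}

def check_paste(paste_id: str, now: datetime) -> int:
    """Return the HTTP status the read path would produce."""
    entry = metadata_cache.get(paste_id)
    if entry is None:
        return 404  # unknown ID (real service: fall through to PostgreSQL first)
    expires_at, _visibility = entry
    if expires_at is not None and expires_at < now:
        return 404  # expired: indistinguishable from never-existed
    return 200
```

Returning the same 404 for expired and nonexistent pastes also avoids leaking which IDs were ever valid.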

Background cleanup worker (eventually):

```sql
-- PostgreSQL has no DELETE ... LIMIT, so batch via a subquery
DELETE FROM pastes
WHERE paste_id IN (
    SELECT paste_id FROM pastes
    WHERE expires_at IS NOT NULL
      AND expires_at < NOW()
    LIMIT 1000
);
```

Runs every hour. The LIMIT 1000 batches the deletion — without it, a single run could delete millions of rows in one transaction, holding locks and causing write latency spikes.

For each deleted PostgreSQL row, the worker also deletes the corresponding S3 object (using S3's batch delete API, up to 1,000 keys per request) and purges the Redis cache key.
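Chunking the deleted keys to match the S3 batch-delete limit is a one-liner; a sketch, assuming the worker has already collected the expired keys:

```python
def batch_delete_requests(keys, batch_size=1000):
    """Split expired object keys into chunks of at most batch_size,
    matching the 1,000-key-per-request limit of S3's batch delete."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]
```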

S3 lifecycle policy as a safety net:

Set a 30-day maximum TTL on the S3 bucket. Any object the cleanup worker missed gets deleted automatically. Belt-and-suspenders.


How are private pastes kept private if there's no login check?

Private pastes use the URL itself as the access credential. The security model is "security through unguessability" — the 8-character random Base62 ID is the password.

Why this works:

  1. 62⁸ ≈ 218 trillion possible IDs
  2. With 100 million private pastes in the system, the probability of guessing a valid private paste URL at random is ~0.00005% per attempt
  3. To find a valid private paste by brute force, an attacker needs billions of attempts — which rate limiting blocks

Three additional guards:

  1. HTTPS always — a private paste URL transmitted over HTTP is visible to network eavesdroppers. HTTPS is non-negotiable for private pastes
  2. No-index headers — X-Robots-Tag: noindex prevents search engines from indexing the URL if it's accidentally shared publicly
  3. Optional password encryption — for higher-sensitivity content (a premium feature): encrypt the paste content with a key derived from a user-provided password before storing in S3. The decryption key is never stored server-side — only the client with the password can decrypt

The security model is the same as Dropbox's "share link" — knowing the URL is sufficient. It's not appropriate for highly sensitive regulated data, but it's the right trade-off for a general-purpose paste service.


How does rate limiting work for a Pastebin service?

Rate limiting at the API Gateway uses different limits for anonymous and registered users, enforced with a per-identifier counter in Redis: a fixed hourly window in the simplest form, or a sliding window or token bucket where smoother limiting is needed.

Anonymous users (IP-based):

| Operation | Limit |
| --- | --- |
| Create paste | 10 per hour per IP |
| Read paste | 1,000 per hour per IP |

Anonymous creation is deliberately strict — legitimate users rarely need more than 10 pastes per hour. Spammers hit the limit immediately.

Registered users (account-based):

| Operation | Limit |
| --- | --- |
| Create paste | 100 per hour |
| Read paste | 10,000 per hour |
| Storage quota | 100 MB total |

Registered users have more headroom. Limits are applied per account, not per IP — rotating IPs does not help attackers.

Implementation with Redis:

```plaintext
Key: rate:{user_id_or_ip}:{operation}:{hour_bucket}
Value: request count
TTL: 1 hour (fixed one-hour window)
```

INCR is atomic, so concurrent requests cannot race past the limit. The gateway increments the counter for each request and rejects with HTTP 429 Too Many Requests once the new count exceeds the limit.
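The counter scheme can be sketched without a real Redis. The dict below is a hypothetical stand-in (Redis INCR plus EXPIRE would replace it); the key shape matches the layout above:

```python
# Stand-in for Redis: key -> (count, window_expiry_timestamp)
counters = {}

def allow_request(identifier: str, operation: str, limit: int, now: float) -> bool:
    """Fixed one-hour window: bump the counter for the current hour bucket
    and compare against the limit. False means the caller should get 429."""
    hour_bucket = int(now // 3600)
    key = f"rate:{identifier}:{operation}:{hour_bucket}"
    count, expiry = counters.get(key, (0, now + 3600))
    if now >= expiry:
        count = 0  # stale window; in Redis the TTL handles this eviction
    count += 1
    counters[key] = (count, expiry)
    return count <= limit
```

Because the hour bucket is part of the key, the counter resets naturally at each window boundary, and the 1-hour TTL keeps stale keys from accumulating.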


Which companies ask the Pastebin system design question in interviews?

Google, Meta, Amazon, Microsoft, Dropbox, and GitHub ask variants of this question for software engineer and senior software engineer roles.

Why it is a popular interview question:

  1. Deceptively simple — the happy path (store text, return URL) is trivial. The depth is in the decisions underneath: key generation, storage split, CDN expiration handling, and private paste security
  2. Tests first-principles reasoning — there is no single "correct" architecture. Interviewers want to see you evaluate options (random vs KGS, S3 vs database) and defend your choices
  3. Covers core concepts compactly — key generation, CDN caching, object storage vs relational database, rate limiting, and expiration all appear in a 45-minute interview without becoming overwhelming

What interviewers specifically listen for:

  1. Three key generation approaches evaluated — not just "use Base62", but the collision math, predictability concern, and when KGS is worth the complexity
  2. Content in S3, not PostgreSQL — with the specific reasoning (blob sizes, cost per GB, CDN integration)
  3. CDN TTL aligned to paste expiration — and the three-part answer for immediate deletion
  4. Write content before metadata — explaining the ordering prevents dangling metadata rows
  5. Private paste security model — URL as credential, HTTPS requirement, and password encryption as the premium tier

If any of those five feel uncertain when explaining them live, Mockingly.ai has Pastebin and URL shortener simulations where these exact follow-up questions come up — built for engineers preparing for roles at Google, Meta, Amazon, and Dropbox.


Pastebin looks like a beginner question but interviewers use it to probe your instincts on storage decisions, caching architecture, and abuse prevention — topics that come up in almost every senior interview regardless of domain. If you want to pressure-test your reasoning on these topics before the real interview, Mockingly.ai has a full library of system design simulations — from classic problems like Pastebin and URL shorteners all the way to more complex distributed systems questions asked at Google, Meta, Amazon, and Dropbox.
