1. Basics
- ⚡ What is System Design?
Designing the architecture of a large-scale application with scalability, reliability, and maintainability in mind.
- Difference between Monolithic and Microservices architecture?
Monolithic: A single deployable unit; Microservices: Multiple independently deployable services.
- What are the main components of a system?
Client, Server, Database, Cache, Load Balancer, Queue.
- What is horizontal vs vertical scaling?
Horizontal: Add more machines; Vertical: Upgrade the existing machine (more CPU, RAM, or disk).
2. Scalability
- How do you scale an application?
Scale vertically or horizontally, add caching, shard the database, and put load balancers in front of the servers.
- What is the difference between strong and eventual consistency?
Strong: Every read returns the most recent write; Eventual: Replicas may briefly return stale data but converge once updates stop.
- ⚡ How to handle high traffic spikes?
Auto-scaling, load balancing, queue-based asynchronous processing.
3. Load Balancing
- What is a Load Balancer?
Distributes incoming requests across multiple servers to improve availability and throughput.
- Types of Load Balancers?
Layer 4 (TCP), Layer 7 (HTTP), DNS-based; each deployed as hardware or software.
- ⚡ How does a sticky session work?
The load balancer routes all of a user's requests to the same server, identified by a cookie or session ID.
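The two routing ideas above can be sketched in a few lines. This is only an illustration: the server names and the functions `pick_round_robin` and `pick_sticky` are hypothetical, and hashing the session ID is one possible way to get stickiness (many load balancers instead record the chosen server in a cookie).

```python
import hashlib
from itertools import cycle

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical backend names

_rotation = cycle(SERVERS)


def pick_round_robin() -> str:
    """Return the next server in rotation; suits stateless requests."""
    return next(_rotation)


def pick_sticky(session_id: str) -> str:
    """Map a session ID to a fixed server by hashing the ID."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return SERVERS[index]
```

Note that with this hash-and-modulo approach the mapping changes for many sessions whenever the server list changes, which is one reason cookie-based stickiness is common in practice.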
4. Caching
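- Why use caching?
Reduce latency and offload queries from the database.
- Types of cache?
In-memory (Redis, Memcached), CDN, Browser cache.
- ⚡ Cache invalidation strategies?
Time-to-Live (TTL) expiry, Manual purge, and write policies that keep the cache current: Write-through (cache and database updated together), Write-behind (cache updated first, database asynchronously).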
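As a rough illustration of TTL expiry and manual purge, here is a minimal in-process cache. The class name `TTLCache`, its methods, and the injectable `clock` parameter are assumptions made for this sketch; a production setup would more likely rely on the expiry features of Redis or Memcached.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside helper: entries expire ttl_seconds after being loaded."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = loader(key)  # miss or expired: go to the source
        self._store[key] = (now + self._ttl, value)
        return value

    def purge(self, key: str) -> None:
        """Manual invalidation, e.g. right after the source data changes."""
        self._store.pop(key, None)
```

A caller would pass a function that reads from the database as `loader`, so the database is only queried on a miss or after expiry.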
5. Database Design
- SQL vs NoSQL?
SQL: Structured, ACID; NoSQL: Flexible schema, horizontal scaling.
- ⚡ When to use a relational database?
For transactions, complex queries, and strict consistency.
- ⚡ When to use a NoSQL database?
For high write throughput, flexible schema, and horizontal scaling.
6. Indexing & Query Optimization
- What is an index?
A data structure that speeds up lookups at the cost of extra storage and slower writes.
- Types of indexes?
B-Tree, Hash, Composite, Full-text.
- ⚡ How to optimize slow queries?
Indexing, query rewriting, denormalization, caching frequent queries.
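To see what an index changes, the sketch below compares query plans before and after creating one. It uses SQLite through Python's standard `sqlite3` module purely because it runs anywhere; the `orders` table, its columns, and the index name are made up for the example, and MySQL or PostgreSQL would use their own `EXPLAIN` output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return SQLite's plan description (the last column of each plan row)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY, (42,))  # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (42,))   # expected to use the new index

print(plan_before)
print(plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan should mention a scan and the second should mention `idx_orders_customer`.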
7. Microservices & SOA
- What is Microservices architecture?
Small, independent services communicating via APIs.
- ⚡ Difference between SOA and Microservices?
SOA: Services share an enterprise service bus (ESB); Microservices: Lightweight, decentralized, each service owning its own data.
- How to handle inter-service communication?
Synchronous (REST/gRPC) or Asynchronous (Message Queues).
8. Message Queues & Asynchronous Processing
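- Why use a message queue?
Decouple services, handle async workloads, buffer spikes.
- Popular message queues?
Kafka, RabbitMQ, SQS.
- ⚡ How to ensure message delivery?
Choose a delivery guarantee: at-most-once, at-least-once (acknowledgements plus retries), or exactly-once semantics (typically at-least-once delivery combined with idempotent or deduplicating consumers).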
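The at-least-once idea can be sketched without a real broker. In this toy model a Python `deque` stands in for the queue, a failed handler call causes redelivery, and a set of processed IDs filters duplicates. The function name `consume_at_least_once` and the `(message_id, payload)` tuple shape are assumptions for illustration, not the API of Kafka, RabbitMQ, or SQS.

```python
from collections import deque
from typing import Callable, Deque, Dict, Set, Tuple

Message = Tuple[str, str]  # (message_id, payload)


def consume_at_least_once(queue: Deque[Message],
                          handler: Callable[[str], None],
                          max_attempts: int = 3) -> Set[str]:
    """Drain the queue, retrying failures and skipping duplicate IDs."""
    processed_ids: Set[str] = set()
    attempts: Dict[str, int] = {}
    while queue:
        message_id, payload = queue.popleft()
        if message_id in processed_ids:
            continue  # duplicate delivery: already handled, drop it
        try:
            handler(payload)
        except Exception:
            attempts[message_id] = attempts.get(message_id, 0) + 1
            if attempts[message_id] < max_attempts:
                queue.append((message_id, payload))  # redeliver later
            continue  # not acknowledged
        processed_ids.add(message_id)  # acknowledge only after success
    return processed_ids
```

Messages that still fail after `max_attempts` are simply dropped here; a real system would usually move them to a dead-letter queue instead.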
9. API Design & Versioning
- REST vs GraphQL?
REST: Fixed endpoints, one per resource; GraphQL: Client-defined queries against a single endpoint.
- API versioning strategies?
URL versioning (/v1), Header versioning, Query parameter versioning.
- ⚡ What is idempotency?
Making the same request multiple times has the same effect as making it once.
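One common way to make a non-idempotent operation safe to retry is an idempotency key supplied by the client. The sketch below is a minimal in-memory version; `PaymentService`, `charge`, and the response fields are hypothetical names chosen for the example.

```python
from typing import Dict


class PaymentService:
    """Replays the stored response when the same idempotency key is reused."""

    def __init__(self) -> None:
        self._responses: Dict[str, dict] = {}
        self.charges_made = 0  # counts real side effects

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._responses:
            # Retry of an earlier request: no new side effect.
            return self._responses[idempotency_key]
        self.charges_made += 1
        response = {
            "charge_id": f"ch_{self.charges_made}",
            "amount_cents": amount_cents,
        }
        self._responses[idempotency_key] = response
        return response
```

A client that times out can resend the request with the same key and receive the original result instead of being charged twice.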
10. Consistency & Availability
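- CAP Theorem?
A distributed system cannot guarantee Consistency, Availability, and Partition tolerance all at once. Since network partitions cannot be ruled out, the real choice during a partition is between consistency and availability.
- Consistency models?
Strong, Eventual, Causal, Read-your-writes.
- ⚡ How to maintain availability during partition?
Keep serving requests and accept eventual consistency, or fail over to healthy replicas.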
11. Partitioning & Sharding
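- What is database sharding?
Splitting a database into smaller parts (shards) spread across servers, so each part is faster and easier to manage.
- Types of sharding?
Horizontal (split by rows), Vertical (split by columns or tables), Directory-based (lookup table), Key-based (hash of a key).
- ⚡ Why use partitioning?
Each partition holds less data and load is spread across nodes, which improves performance and scalability.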
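Key-based sharding can be sketched as a hash of the key taken modulo the number of shards. The function name `shard_for` and the key format are assumptions for this example.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Return the shard index for a key using a stable hash.

    A cryptographic hash is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Example: route a few hypothetical user IDs across 4 shards.
for user_id in ("user-1", "user-2", "user-3"):
    print(user_id, "->", shard_for(user_id, 4))
```

The trade-off with plain modulo hashing is that changing `shard_count` remaps most keys; consistent hashing is the usual way to limit that movement.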
12. Rate Limiting
- What is rate limiting?
Restricting the number of requests a client can make in a given time window.
- Common strategies?
Token bucket, Leaky bucket, Fixed window, Sliding window.
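Of the strategies listed, the token bucket is short enough to sketch in full. Tokens refill at a steady rate up to a capacity, and each request spends one. The class name `TokenBucket` and the injectable `clock` parameter are choices made for this sketch; one bucket per client is assumed.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, then `refill_per_second` on average."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so behavior can be tested
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never above capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller would typically respond with HTTP 429
```

This version is not thread-safe and keeps state in process memory; a shared limiter would need a lock or an external store.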
13. Monitoring & Logging
- Why monitor systems?
Detect issues, track performance, ensure reliability.
- Popular tools?
Prometheus, Grafana, ELK Stack, Datadog.
- ⚡ Key metrics to monitor?
Latency, throughput, error rates, CPU/memory usage.
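Latency is usually reported as percentiles rather than an average, because a few slow requests can hide behind a good mean. The helper functions below are a small sketch with made-up sample data; `percentile` uses the nearest-rank method, and counting only 5xx status codes as errors is an assumption.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes: Sequence[int]) -> float:
    """Fraction of responses with a 5xx status code."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)


latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]  # sample data
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

With this sample the median stays low while the 95th percentile exposes the single very slow request, which is the kind of signal an alert would be built on.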
14. Security & Authentication
- Common auth methods?
OAuth2, JWT, API keys, SAML.
- ⚡ How to secure APIs?
HTTPS, authentication, authorization, input validation.
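As one concrete example of request authentication, the sketch below signs a request body with a shared secret using HMAC-SHA256 and verifies it on the server. This is a different technique from the methods listed above (it is not OAuth2 or JWT), shown only because it fits in a few lines of standard library code; the function names and the secret are placeholders.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of a message."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(secret, message)
    return hmac.compare_digest(expected, signature)


SECRET = b"example-shared-secret"  # placeholder; load real secrets from config
body = b'{"amount": 500}'
signature = sign(SECRET, body)
```

The client would send `signature` in a header alongside the body, and the server would reject the request when `verify` returns `False`.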
15. Production-Level Scenarios
- ⚡ How to handle sudden traffic spikes?
Auto-scaling, queue buffering, CDN, caching.
- ⚡ How to reduce latency?
Caching, DB optimization, CDN, asynchronous processing.
- ⚡ How to ensure high availability?
Load balancing, replication, failover, multi-region deployment.
- ⚡ How to recover from disaster?
Backups, multi-region replication, automated failover.
- ⚡ How to handle data consistency in distributed systems?
Use consensus protocols, replication with quorum, or eventual consistency.
- ⚡ How to debug production issues?
Centralized logging, monitoring alerts, profiling tools.
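For the data consistency scenario above, "replication with quorum" usually comes down to one inequality: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap on at least one replica when R + W > N. The sketch below checks that condition and picks the newest value from a set of replies; the function names and the `(version, value)` reply shape are assumptions for illustration.

```python
from typing import List, Tuple


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read quorum intersects every write quorum."""
    return r + w > n


def read_latest(replies: List[Tuple[int, str]]) -> str:
    """Given (version, value) replies from R replicas, return the newest value."""
    if not replies:
        raise ValueError("no replies received")
    return max(replies, key=lambda reply: reply[0])[1]
```

For example, N=3 with W=2 and R=2 overlaps, while W=1 and R=1 does not, which trades consistency for lower latency.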
16. Best Practices
- Design for scalability from day one.
- Use caching wisely; avoid cache stampede.
- Keep services decoupled and stateless when possible.
- Monitor key metrics and set up alerts.
- Prefer asynchronous communication for high-load operations.
- Ensure proper API versioning and backward compatibility.
- Plan disaster recovery and failover strategies.
- Optimize database queries and index critical columns.
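On the cache stampede point: when a hot key is missing, many concurrent requests can all try to rebuild it at once. One mitigation is to let a single caller load the value while the others wait. The class below is a minimal threaded sketch of that idea; `SingleFlightCache` is a hypothetical name, and entries here never expire.

```python
import threading
from typing import Any, Callable, Dict


class SingleFlightCache:
    """Ensures only one thread runs the loader for a given missing key."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()  # protects the lock table

    def get(self, key: str, loader: Callable[[str], Any]) -> Any:
        if key in self._values:
            return self._values[key]  # fast path: already cached
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have loaded it while we waited.
            if key in self._values:
                return self._values[key]
            value = loader(key)
            self._values[key] = value
            return value
```

Other common mitigations include adding random jitter to TTLs and refreshing hot keys before they expire.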
17. Commands / Tools Cheat Sheet
- Load Testing:
Apache JMeter, Locust, hey, wrk
- Monitoring & Logging:
Prometheus, Grafana, ELK Stack, Datadog
- Architecture Diagrams:
draw.io, Lucidchart, Miro
- Database Optimization:
EXPLAIN (MySQL/PostgreSQL), ANALYZE, INDEX creation
- Message Queues:
kafka-console-producer, kafka-console-consumer, RabbitMQ CLI
⚡ Tip: Focus on designing systems that are scalable, reliable, and maintainable; interviewers value a discussion of trade-offs over “perfect” solutions.