Learn System Design

Rohit Singh
5 min read · Aug 14, 2023


A system is a set of components working together to achieve an objective. For example, depending on the objective (maximum revenue, minimum journey time), the design of a railway system from Point A to Point B will vary. Every system also comes with its own set of constraints. In system design, or system architecture, we define how software and hardware components will work together to achieve the system's objectives.

PMs should learn this because:

  • Any system involves trade-offs, because it operates under time, money, and resource constraints. System design knowledge adds value to that decision making.
  • System design shapes user experience. For example, the time taken to load a page increases once the number of requests goes beyond a certain limit. PMs can work on such issues proactively if they understand the relationship between system variables and UX.

UX Design Aspects: Functional and Non-functional Requirements

There are two aspects of UX design: functional and non-functional. The functional aspect of an app encompasses its features and functions. For instance, when designing user login, offering a phone number login option becomes a functional requirement. Another functional requirement is sending an OTP when a user requests one during the sign-in process.

On the other hand, non-functional requirements (NFRs) define the quality aspects of the app. Let’s consider key aspects and features of an app like WhatsApp.

  • Performance: WhatsApp’s performance is gauged by message transfer time.
  • Scalability: The app’s ability to handle traffic. Insufficient scalability can result in crashes and slow performance.
  • Security: WhatsApp encrypts messages end-to-end for privacy.
  • Size and Memory: WhatsApp aims to be lightweight and memory-efficient for diverse devices.
  • Availability: The app should remain accessible around the clock, necessitating backup measures.
  • Reliability: WhatsApp aims for consistent and expected functionality.

These aspects apply to various apps and websites. Common NFRs include performance, scalability, security, availability, and reliability. Designing systems with these factors in mind ensures quality.

Availability

Among the most crucial NFRs is availability. Scaling plays a key role here: vertical scaling (adding more power to a single server) and horizontal scaling (distributing requests across multiple servers). Vertical scaling runs into hardware limits, while horizontal scaling offers far more room to grow and keeps the system available even when an individual server fails.

Response Time

  • Time and Geography: Response time is affected by the geographical distance between the client and the server. Load balancers can allocate requests based on location.
  • Load Balancing: Load balancers distribute incoming requests across servers. They identify which machines are online, distribute tasks among them, and adjust server counts as load changes.
  • Load Balancing Algorithms: Methods like Round Robin and Least Response Time are chosen based on need (see the sketch after this list).
  • Persistence: Some load balancing algorithms maintain persistent ("sticky") connections between a client and a particular server.
  • Backup: Passive load balancers step in if the active ones fail.
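
To make the algorithms above concrete, here is a minimal sketch of Round Robin and Least Response Time selection. The server names and response times are invented for illustration.

```python
import itertools

servers = ["server-a", "server-b", "server-c"]

# Round Robin: hand requests to servers in a fixed rotating order.
rotation = itertools.cycle(servers)

def pick_round_robin():
    return next(rotation)

# Least Response Time: route to the server with the lowest recently
# observed response time (these numbers are made up for the example).
observed_ms = {"server-a": 120, "server-b": 45, "server-c": 80}

def pick_least_response_time():
    return min(observed_ms, key=observed_ms.get)

print([pick_round_robin() for _ in range(4)])  # a, b, c, a
print(pick_least_response_time())              # server-b
```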

Partitioning for Efficiency

Partitioning is used when handling very large tables, with billions of entries, such as the databases behind Facebook or WhatsApp. Splitting such tables into smaller pieces optimizes querying.

Horizontal and Vertical Partitioning: Splitting a table by rows is horizontal partitioning; splitting it by columns is vertical partitioning. Vertical partitioning suits columns that are queried infrequently.

Partitioning Methods: Common strategies include range, list, round-robin, and hash partitioning.

Hash Partitioning: A hash function converts each row's partition key into a fixed-length value, which determines the partition the row is assigned to, as sketched below.
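
As a rough sketch (the key name user_id and the partition count are assumptions for illustration), hash partitioning boils down to hashing the key and taking the result modulo the number of partitions:

```python
import hashlib

NUM_PARTITIONS = 8  # fixed up front; an assumption for this sketch

def partition_for(user_id: str) -> int:
    # Hash the partition key to a fixed-length value, then map that
    # value onto one of the partitions.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# The same key always lands in the same partition.
print(partition_for("user-42"))
print(partition_for("user-43"))
```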

Optimizing Database Operations: Partitioning optimizes read, search, and update operations. Considering user location, for example keeping a user's data in a partition geographically close to them, makes partitioning more effective.

In larger systems, a combination of both horizontal and vertical partitioning is common for efficient data handling. Load balancing distributes traffic, considering data distribution across partitions.

Caching

We use caching to reduce the number of requests that reach the backend, which in turn reduces the computational power required.

Caching can significantly reduce the response time for serving HTTP data requests on websites and apps. Caching is based on the “locality of reference” principle, which suggests that recently requested data or items are likely to be requested again. Cache memory, which stores recently requested data, is a type of high-speed Random Access Memory (RAM) that provides quick data access but is costly.

When a new data request arises, the system checks if the data is available in the cache. If found, it’s a “cache hit”; if not, it’s a “cache miss.” In case of a cache miss, the request goes to the main database. Efficient cache management can increase cache hits and reduce cache misses.
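
In code, that hit/miss flow might look like the sketch below, with a plain dict standing in for cache memory and a stub function standing in for the main database (both are stand-ins, not a real cache API):

```python
cache = {}  # stands in for fast cache memory

def query_database(key):
    return f"value-for-{key}"  # stub for the real (slow) database lookup

def get(key):
    if key in cache:
        return cache[key]        # cache hit: served from fast memory
    value = query_database(key)  # cache miss: go to the main database
    cache[key] = value           # store it so the next request hits
    return value

print(get("user-42"))  # miss: fetched from the database, then cached
print(get("user-42"))  # hit: served straight from the cache
```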

Caching helps save resources by decreasing the number of requests to the database servers, lowering computational costs and thus saving money. However, cache memory is not necessary for every purpose: Solid State Drives (SSDs) are already more expensive than Hard Disk Drives (HDDs), and cache memory is costlier still.

Cache memory finds applications in various layers of computing hardware, operating systems, applications, and more. Caching can be broadly categorized into client-side and server-side caching.

Client-Side Caching

Client-side caching is exemplified by what happens the first time you request an HTML page. The browser fetches the page from the server, renders it, and stores a copy on your PC. The next time you request the same page, the browser serves the cached copy, leading to quicker page loads. DNS caching is another type: it stores recent DNS lookups to speed up page loading.

Server-Side Caching

Server-side caching involves storing data on a cache server in the backend alongside the main database server. Subsequent requests can be served from the cache server rather than the main database, reducing latency and computational load.

Various caching strategies include:

  • Cache Aside: The application checks the cache first; on a miss, it fetches the data from the database and then writes it into the cache (the hit/miss flow sketched earlier).
  • Read Through: The cache sits between the application and the database, fetching data from the DB and updating itself on a miss.
  • Write Through: The application writes data to the cache, and the cache immediately writes it through to the database (sketched after this list).
  • Write Behind/Back: The cache acknowledges the write immediately and updates the DB later, periodically and in bulk.
  • Scaling with Distributed Caches: Distributed caches like Memcached and Redis spread cached data across multiple servers.
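
As a sketch of the write-through strategy, again with plain dicts standing in for the cache and the database:

```python
cache = {}
database = {}  # stands in for the main database

def write_through(key, value):
    cache[key] = value     # the application writes to the cache first...
    database[key] = value  # ...and the cache layer immediately writes the
                           # same value to the database, keeping both
                           # consistent on every write.

write_through("user-42", {"name": "Asha"})
print(cache["user-42"] == database["user-42"])  # True
```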

Cache eviction addresses limited storage capacity by removing old items. Strategies like Least Frequently Used (LFU) or First In, First Out (FIFO) can be employed. A Time-to-Live (TTL) sets a lifespan for cached entries, after which they are discarded.
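
A minimal sketch of TTL-based eviction, assuming each cached entry stores its own expiry time:

```python
import time

TTL_SECONDS = 60   # assumed lifespan for cached entries
cache = {}         # key -> (value, expiry timestamp)

def put(key, value):
    cache[key] = (value, time.time() + TTL_SECONDS)

def get(key):
    entry = cache.get(key)
    if entry is None:
        return None                  # never cached: a miss
    value, expires_at = entry
    if time.time() > expires_at:
        del cache[key]               # the entry outlived its TTL: evict
        return None
    return value
```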

Content Delivery Networks (CDNs) efficiently serve large content like images and videos by distributing it across servers closer to end users, minimizing latency.

Availability and Robustness

Robust systems ensure data isn't lost. Redundant servers store copies of the data, enabling failover. Active redundancy distributes the load across all servers, while passive redundancy keeps spare servers idle until the active ones fail.

Master-Slave Replication

A Master-Slave setup ensures data availability. In active replication, every server processes each update so all stay in sync; in passive replication, the master processes updates and forwards them to the slaves.
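
A toy sketch of the passive variant, where the master applies each write and forwards it to its slaves; the class names and write path are invented for illustration:

```python
class Node:
    """A database node holding a simple key-value store."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Master(Node):
    def __init__(self, slaves):
        super().__init__()
        self.slaves = slaves

    def write(self, key, value):
        self.apply(key, value)     # the master applies the update first
        for slave in self.slaves:  # ...then forwards it to every slave
            slave.apply(key, value)

slaves = [Node(), Node()]
master = Master(slaves)
master.write("chat:1", "hello")
print(all(s.data == master.data for s in slaves))  # True: slaves in sync
```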

Communication Protocols

  • HTTP Polling: The client requests data at fixed intervals, which can lead to many empty responses.
  • HTTP Long Polling: The server holds each request open until it has data to send, reducing empty responses (see the sketch after this list).
  • WebSockets: A persistent connection enabling two-way data exchange in real time.
  • HTTP Server-Sent Events: One-way real-time data from server to client, suitable for read-heavy applications.
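
From the client side, long polling is just a loop that re-issues a request each time the previous one returns. The sketch below uses the requests library, and the endpoint URL and response shape are made-up placeholders:

```python
import requests

UPDATES_URL = "https://example.com/updates"  # hypothetical endpoint

def long_poll():
    while True:
        try:
            # The server holds this request open until it has new data,
            # instead of replying immediately with an empty response.
            response = requests.get(UPDATES_URL, timeout=70)
            if response.status_code == 200:
                print("received:", response.json())
            # Any other status (e.g. 204 No Content): loop and re-ask.
        except requests.exceptions.Timeout:
            pass  # nothing arrived in time; reconnect and wait again
```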

These concepts collectively enhance system performance, response time, and overall user experience.
