Martin Kleppmann :: Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Justin

oreilly
software-architecture
software-engineering
book

Apparently this is getting a second edition soon(?), so I will hold off on writing notes for the first edition.

Designing Data-Intensive Applications

Part I. Foundations of Data Systems

Chapter 1. Reliable, Scalable, and Maintainable Applications

Thinking About Data Systems

Reliability

Hardware Faults

Software Errors

Human Errors

How Important Is Reliability?

Scalability

Describing Load

Describing Performance

Approaches for Coping with Load

Maintainability

Operability: Making Life Easy for Operations

Simplicity: Managing Complexity

Evolvability: Making Change Easy

Summary

Chapter 2. Data Models and Query Languages

Relational Model Versus Document Model

The Birth of NoSQL

The Object-Relational Mismatch

Many-to-One and Many-to-Many Relationships

Are Document Databases Repeating History?

Relational Versus Document Databases Today

Query Languages for Data

Declarative Queries on the Web

MapReduce Querying

Graph-Like Data Models

Property Graphs

The Cypher Query Language

Graph Queries in SQL

Triple-Stores and SPARQL

The Foundation: Datalog

Summary

Chapter 3. Storage and Retrieval

Data Structures That Power Your Database

Hash Indexes

SSTables and LSM-Trees

B-Trees

Comparing B-Trees and LSM-Trees

Other Indexing Structures

Transaction Processing or Analytics?

Data Warehousing

Stars and Snowflakes: Schemas for Analytics

Column-Oriented Storage

Column Compression

Sort Order in Column Storage

Writing to Column-Oriented Storage

Aggregation: Data Cubes and Materialized Views

Summary

Chapter 4. Encoding and Evolution

Formats for Encoding Data

Language-Specific Formats

JSON, XML, and Binary Variants

Thrift and Protocol Buffers

Avro

The Merits of Schemas

Modes of Dataflow

Dataflow Through Databases

Dataflow Through Services: REST and RPC

Message-Passing Dataflow

Summary

Part II. Distributed Data

Chapter 5. Replication

Leaders and Followers

Synchronous Versus Asynchronous Replication

Setting Up New Followers

Handling Node Outages

Implementation of Replication Logs

Problems with Replication Lag

Reading Your Own Writes

Monotonic Reads

Consistent Prefix Reads

Solutions for Replication Lag

Multi-Leader Replication

Use Cases for Multi-Leader Replication

Handling Write Conflicts

Multi-Leader Replication Topologies

Leaderless Replication

Writing to the Database When a Node Is Down

Limitations of Quorum Consistency

Sloppy Quorums and Hinted Handoff

Detecting Concurrent Writes

Summary

Chapter 6. Partitioning

Partitioning and Replication

Partitioning of Key-Value Data

Partitioning by Key Range

Partitioning by Hash of Key

Skewed Workloads and Relieving Hot Spots

Partitioning and Secondary Indexes

Partitioning Secondary Indexes by Document

Partitioning Secondary Indexes by Term

Rebalancing Partitions

Strategies for Rebalancing

Operations: Automatic or Manual Rebalancing

Request Routing

Parallel Query Execution

Summary

Chapter 7. Transactions

The Slippery Concept of a Transaction

The Meaning of ACID

Single-Object and Multi-Object Operations

Weak Isolation Levels

Read Committed

Snapshot Isolation and Repeatable Read

Preventing Lost Updates

Write Skew and Phantoms

Serializability

Actual Serial Execution

Two-Phase Locking (2PL)

Serializable Snapshot Isolation (SSI)

Summary

Chapter 8. The Trouble with Distributed Systems

Faults and Partial Failures

Cloud Computing and Supercomputing

Unreliable Networks

Network Faults in Practice

Detecting Faults

Timeouts and Unbounded Delays

Synchronous Versus Asynchronous Networks

Unreliable Clocks

Monotonic Versus Time-of-Day Clocks

Clock Synchronization and Accuracy

Relying on Synchronized Clocks

Process Pauses

Knowledge, Truth, and Lies

The Truth Is Defined by the Majority

Byzantine Faults

System Model and Reality

Summary

Chapter 9. Consistency and Consensus

Consistency Guarantees

Linearizability

What Makes a System Linearizable?

Relying on Linearizability

Implementing Linearizable Systems

The Cost of Linearizability

Ordering Guarantees

Ordering and Causality

Sequence Number Ordering

Total Order Broadcast

Distributed Transactions and Consensus

Atomic Commit and Two-Phase Commit (2PC)

Distributed Transactions in Practice

Fault-Tolerant Consensus

Membership and Coordination Services

Summary

Part III. Derived Data

Chapter 10. Batch Processing

Batch Processing with Unix Tools

Simple Log Analysis

The Unix Philosophy

MapReduce and Distributed Filesystems

MapReduce Job Execution

Reduce-Side Joins and Grouping

Map-Side Joins

The Output of Batch Workflows

Comparing Hadoop to Distributed Databases

Beyond MapReduce

Materialization of Intermediate State

Graphs and Iterative Processing

High-Level APIs and Languages

Summary

Chapter 11. Stream Processing

Transmitting Event Streams

Messaging Systems

Partitioned Logs

Databases and Streams

Keeping Systems in Sync

Change Data Capture

Event Sourcing

State, Streams, and Immutability

Processing Streams

Uses of Stream Processing

Reasoning About Time

Stream Joins

Fault Tolerance

Summary

Chapter 12. The Future of Data Systems

Data Integration

Combining Specialized Tools by Deriving Data

Batch and Stream Processing

Unbundling Databases

Composing Data Storage Technologies

Designing Applications Around Dataflow

Observing Derived State

Aiming for Correctness

The End-to-End Argument for Databases

Enforcing Constraints

Timeliness and Integrity

Trust, but Verify

Doing the Right Thing

Predictive Analytics

Privacy and Tracking

Summary

Created with Quartz v4.4.0 © 2025 & Emacs 29.4 (Org mode 9.8 + ox-hugo) by Justin Malloy © 2023-2025

All original content is licensed under a free/libre copyleft license (GPL or CC BY-SA). Read the notice about the license and resources.