Data Replication
Data replication creates multiple copies of data across locations to enhance availability, durability, and fault tolerance, using synchronous or asynchronous methods with varying consistency guarantees.
Data replication distributes data across multiple locations to remove single points of failure, improve read throughput, and support disaster recovery. Replication can be synchronous (a write is confirmed only after every replica has applied it, which gives strong consistency at the cost of higher write latency), asynchronous (replicas are updated after the primary commits, which lowers write latency but lets replicas lag behind), or semi-synchronous (the write waits for acknowledgment from a subset of replicas, balancing latency against consistency).
Common topologies include master-slave (primary-secondary), multi-master (active-active), and peer-to-peer. Key challenges include keeping replicas consistent, resolving conflicts in multi-master setups, and handling network latency, clock skew, schema evolution, and data drift during upgrades. Techniques for managing consistency include consensus protocols (e.g., Paxos, Raft), write-ahead logs, version vectors, and CRDTs for specific data types.
When choosing a replication strategy, consider RPO/RTO targets (recovery point and recovery time objectives), latency budgets, bandwidth, data locality, and regulatory requirements covering replication, auditing, and retention.
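The difference between the three modes comes down to how many replica acknowledgments a write waits for before the client is told it succeeded. The sketch below illustrates that idea in memory only; `Replica`, `required_acks`, `write`, and the default of one acknowledgment for semi-synchronous mode are hypothetical names and choices made for illustration, not the API of any particular database.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that stores applied records in a list (hypothetical)."""
    name: str
    log: list = field(default_factory=list)

    def apply(self, record):
        self.log.append(record)


def required_acks(mode: str, replica_count: int, semi_sync_acks: int = 1) -> int:
    """Number of replica acknowledgments a write waits for before it is confirmed."""
    if mode == "sync":
        return replica_count                       # every replica must confirm
    if mode == "semi-sync":
        return min(semi_sync_acks, replica_count)  # only a subset must confirm
    if mode == "async":
        return 0                                   # confirm right after the primary commit
    raise ValueError(f"unknown mode: {mode}")


def write(primary_log: list, replicas: list, record, mode: str):
    """Commit on the primary, then return (acked replica names, pending replica names)."""
    primary_log.append(record)
    need = required_acks(mode, len(replicas))
    acked, pending = replicas[:need], replicas[need:]
    for replica in acked:
        replica.apply(record)  # applied before the client sees success
    # Pending replicas would receive the record later; until then they lag.
    return [r.name for r in acked], [r.name for r in pending]
```

For example, `write([], [Replica("a"), Replica("b"), Replica("c")], "x=1", "semi-sync")` confirms after replica `a` alone has applied the record and leaves `b` and `c` to catch up afterwards.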
Related terms: data-recovery (/terms/data-recovery), data-obfuscation (/terms/data-obfuscation), data-integrity (/terms/data-integrity)
❓ Frequently Asked Questions
What is the purpose of data replication?
To increase availability, reliability, and fault tolerance by maintaining multiple copies of data in separate locations, enabling failover, disaster recovery, load distribution, and durability.
What are the main types of replication?
Synchronous (a write is confirmed only after all replicas have applied it), semi-synchronous (a write is confirmed after a subset of replicas acknowledge it), and asynchronous (replicas are updated after the write has been confirmed).
What is conflict resolution in replication?
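In multi-master setups, concurrent writes to the same data can conflict. Conflict resolution decides which write survives or how the writes are merged, using strategies such as last-writer-wins, version vectors, CRDTs, or application-specific rules.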
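As a rough illustration of how version vectors detect a conflict, the sketch below compares two vectors and falls back to last-writer-wins only when the writes are truly concurrent. The function names, the `(value, version_vector, timestamp)` tuple layout, and the tie-breaking rule are assumptions made for this example.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors mapping node id -> counter.

    Returns "a_newer", "b_newer", "equal", or "concurrent".
    """
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither write has seen the other: a real conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def resolve(a: tuple, b: tuple):
    """Pick a value from two (value, version_vector, timestamp) tuples."""
    order = compare(a[1], b[1])
    if order in ("a_newer", "equal"):
        return a[0]
    if order == "b_newer":
        return b[0]
    # Concurrent writes: fall back to last-writer-wins on the timestamp.
    # This discards one of the writes and is sensitive to clock skew.
    return a[0] if a[2] >= b[2] else b[0]
```

Here `compare({"n1": 2}, {"n2": 1})` reports `"concurrent"`, because each node has a write the other has not seen. An application that cannot afford to lose either write would merge the values at that point instead of applying last-writer-wins.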
What is replication lag?
The delay between a write and its propagation to replicas, affecting read freshness.
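One simple way to reason about lag is to compare how far each replica has applied the primary's log. The sketch below assumes lag is measured in log positions, not seconds, and that the positions are already known; `replication_lag` and `fresh_enough` are hypothetical helpers.

```python
def replication_lag(primary_position: int, replica_position: int) -> int:
    """Lag in log positions: how many records the replica has not applied yet."""
    return max(0, primary_position - replica_position)


def fresh_enough(primary_position: int, replica_positions: dict, max_lag: int) -> list:
    """Names of replicas whose lag is within max_lag, in sorted order."""
    return sorted(
        name
        for name, position in replica_positions.items()
        if replication_lag(primary_position, position) <= max_lag
    )
```

A read router could use a check like `fresh_enough(120, {"r1": 120, "r2": 100, "r3": 118}, 5)` to send freshness-sensitive reads only to `r1` and `r3`, while `r2` catches up.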
What are common replication topologies?
Master-slave/primary-secondary, multi-master/active-active, and peer-to-peer.
Are CRDTs always suitable?
No. CRDTs work well for certain data types and operations, but not every data model benefits from them, and they require careful data design.
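A grow-only counter (G-Counter) is one of the simplest CRDTs and shows why the approach suits some data types better than others: it merges without conflicts, but only because it never decrements. The class below is a minimal sketch with hypothetical names, not a production implementation.

```python
class GCounter:
    """Grow-only counter CRDT: each node increments only its own slot."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}  # node id -> count contributed by that node

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Take the per-node maximum; merging is commutative, associative, and idempotent."""
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas count independently, then exchange state in any order.
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # repeating a merge changes nothing
```

After the exchange both replicas report the same total. A value that needs decrements or arbitrary overwrites would need a different CRDT or a different conflict-resolution strategy.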