Brian Ray's Blog : postgreSQL/cluster_vs_replication.html

Painting is just another way of keeping a diary. --Picasso

Sun, 06 Feb 2005

Database Cluster VS Replication

Listening to people [1] coming back from Solutions Linux 2005 in Paris this week, I picked up some talk concerning database architecture. The long-time buzzwords 'Cluster' and 'Replication' are so loose that they can only be understood with a fairly lengthy explanation. With databases, I understand these words best when they are put into a master-slave context.

Clusters are multi-master

In MySQL circles, DBAs spend a lot of time talking about MySQL Cluster. The goal here was to resemble Oracle RAC (Real Application Clusters). The idea with clusters is to create a synchronous master-to-master relationship between databases. This is much more ambitious and difficult than it first appears.

There are many conflicting messages from MySQL regarding how they will make this work in the end. At SL 2005, a sales representative stated that MySQL will support foreign keys right from the start. However, a senior software architect from MySQL AB in Sweden says in an email that this will not happen.

The difficulty here is the constant synchronization required between the master databases. If one node falls out of sync, the steps required to re-sync it will far outweigh the advantages gained by clustering in the first place.
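
To make the out-of-sync problem concrete, here is a toy sketch in Python. It is not how MySQL Cluster or Oracle RAC work internally; the Node and SyncCluster names are made up for illustration, and each "database" is just an in-memory list of applied writes. It only shows how a master that misses writes builds up a backlog it must replay before it can rejoin.

```python
class Node:
    """Hypothetical stand-in for one master database."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.log = []  # ordered list of writes applied on this node


class SyncCluster:
    """Toy multi-master group: every write should land on every node."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, change):
        """Apply a change on every online node; return names of nodes that missed it."""
        missed = []
        for node in self.nodes:
            if node.online:
                node.log.append(change)
            else:
                missed.append(node.name)
        return missed

    def backlog(self, node):
        """Writes the given node must replay before it is back in sync."""
        longest = max((n.log for n in self.nodes), key=len)
        return longest[len(node.log):]
```

With two nodes a and b, taking b offline for two writes leaves cluster.backlog(b) holding those two writes. In a real system the backlog also has to be applied while new writes keep arriving, which is where the re-sync cost comes from.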

Replication is master-slave

Again, "replication" is a loose term covering things such as shared storage, multi-system shared memory, or logical clusters, and just about everything in between. With regard to master-slave database design and replication, the Slony [2] project, which works with PostgreSQL, is a successful project mastering (pun intended) this concept.

Slony is an asynchronous replication system with one master and many slaves. A slave-aware application (or one that uses pgpool, a connection pooler for PG) gives a database a speed advantage over clusters by distributing queries across the slaves. Writes to the replicated databases still need to be sent to the master, however.
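
A slave-aware application can be sketched roughly like this in Python. This is only an illustration of the routing idea, not Slony's or pgpool's actual API: the ReplicaRouter class, the node names, and the list of write verbs are all assumptions, and a real application would hold database connections rather than strings.

```python
import itertools


class ReplicaRouter:
    """Toy router: writes go to the master, reads rotate over the slaves."""

    # Assumed (incomplete) list of statement verbs that modify data.
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in self.WRITE_VERBS:
            return self.master
        # Everything else is treated as a read and spread round-robin.
        return next(self._readers)
```

For example, ReplicaRouter("master", ["slave1", "slave2"]) sends an INSERT to "master" and alternates SELECTs between the two slaves. Because the replication is asynchronous, a read sent to a slave right after a write may not see that write yet, so an application that needs read-your-own-writes would have to send those reads to the master too.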

Multi-Master-Multi-Slave

PostgreSQL development, although generally considered slower than that of other databases, has a history of delivering a feature-rich and bulletproof database. Slony will probably take on the task of adding multi-master support to the project. When this happens, we will see a (better) clustering system emerge from a proven replication system. In MySQL's case, they will struggle with clustering without foreign key support.

I imagine PostgreSQL itself will add clustering someday. However, due to the enhancements already existing in the database, I foresee the demand for this being somewhat lower than with other databases.

[1] Thanks specifically to #postgresql on freenode
[2] slon means elephant in Russian