KOF2005 - 2005/10/28 (Friday)

Just some memos from the Kansai Open Source 2005.

MySQL

Don't like their license. It's dual licensed "Quid Pro Quo", so in general using the Open Source community to do tests for their commercial service. Use it for development for free, but if you want to use it in your products - give us the money. In opposite the BSD licensed PostgreSQL.
MySQL have new service - MySQL Network ($2955/yr per server). Provide you with opimized binaries and 24/7 support.
MySQL is better in scalling-out (Several low-cost machines working together), then in scalling-up (SMP, expensive hardware). Good stuff. I like this approch. For us the most interesting was MySQL cluster. My question was: Is it ready for the enterprise and the answer was Boing is using it, which sounds mostly like yes. Good :)
Other interesting gems: the engine type (MyISAM, InnoDB, Memory) can be changed on per table base. Good. Indexes can be on separate disk/partition from the data. Good. We need to wait for MySQL 5.2 for better japanese support (full text search etc.) MySQL 5.0 finally have the big boys toys - views, triggers etc.

PostgreSQL

Using the install with default settings is bad. These setting are for the low-spec machines. What to optimize?

  • Change default settings
# shared buffers
shared_buffers=16000
# transaction log buffer (default is 8)
wal_buffers=32
# writer process (background process, handling durty pages)
bgwriter_delay=200
bgwriter_maxpages=4 (default is 100)
# checkpoint segments - reset transaction log and flush all durty pages to the disk)
checkpoint_segments=16 (default is 3)
  • Separate index and data (use create tablespace)
  • Good choice for file system is ext3. In fact ext2 also can do the job. Because we already have transaction log, there is no need of journaling.
/dev/sda1  /disk1  ext3  noatime  1  2
  • Use VACUUM: vacuumdb, pg_autovacuum

For benchmarking can be used the pgbench utility. pgbench -S for measuring throughput.

For me the PostgreSQL scaling seems more difficult then the MySQL one. Some pointers:

  • pgpool (used mostly in Japan) and Slony-I (outside of JP) can be used for replication
  • DRDB + Heartbeat for High Availability
  • Can be used very expansive shared SCSI disks.
  • PGCluster - formed from 5 machines. Provide HA.