Migrating from MySQL to PostgreSQL

Thursday, September 13. 2012

When you come, like me, from a LAMP background, using MySQL as backend for DBMail seems natural.

As your database grows, little things at the edge of your mind start nagging you, or you've grown to like and appreciate PostgreSQL's more mature feature set. Or maybe you just dis-like Oracle.

Migrating from MySQL to PostgreSQL is no simple feat. Typically, MySQL is very lax when it comes to accepting encoded strings. And this will bite you when you try to load a SQL dump from MySQL into PostgreSQL. And bite you hard it will. Especially if your DBMail database, like mine, dates back many years.

I started working on some migration script that lives in contrib/sql2sql/mysql2pgsql.sh. It's old and unfinished, because last time I tried to migrate my main dog-food installation, I failed.

But those little things kept nagging at me, so I started anew. And this time I was successful.

The mysql2pgsql.sh script will now simply work, though you will have to install a pre-requisite (py-mysql2pgsql), and edit the yaml file included in the .../sql2sql/ directory.

You will also want to shutdown your email services, because the export/import is not atomic.

happy migration!

Pruning the dbmail_headervalue table

Saturday, February 13. 2010

In DBMail 2.2 the dbmail_headervalue table will contain all header-values for all messages in store. Even for medium-sized installations this can easily result in a very large table.

However, since that table is just about only used for IMAP search you might consider dropping a couple of headernames from the cache.

For example:

DELETE FROM dbmail_headername WHERE headername = 'received';
will delete from dbmail_headervalue all entries regarding the Received header.

To determine which headernames should be dropped you could use a view like this:

CREATE VIEW header_count AS 
    SELECT count(1) AS count, n.id, n.headername 
    FROM dbmail_headervalue v 
    LEFT JOIN dbmail_headername n ON v.headername_id=n.id 
    GROUP BY n.id;

After that you can do:

SELECT * FROM header_count ORDER BY count; 
and delete all headernames from dbmail_headername for those headers you deem unlikely to ever be used in IMAP search.
DELETE FROM dbmail_headername WHERE headername = 'Received';

Doing this for a couple of the most prolific - but un-used headers - will drastically reduce the size of the dbmail_headervalue tables. But remember you will have to keep an eye on the header_count, and re-issue the delete queries on a regular basis.

A RESTful interface is coming

Friday, July 3. 2009

Again and again, dbmail administrators ask us about the best way for their application to talk to dbmail. And our answer has always been the same: talk IMAP.

However, IMAP is a notorious protocol for many. Not because it is inherently evil, but because creating a parser for imap formatted data is not a trivial thing. It's simply a lot of work, and easy to get wrong. It's also top-heavy; a lot of the capabilities in IMAP are simply not needed at all by your typical webapp developer.

So the question was: how can we design a solution that will allow an application developer to quickly integrate with dbmail, using standard tools and languages.

Being forward compatible could be considered a secondary, but no less important requirement. The chosen approach - short of IMAP - often comes as custom build queries talking directly with the database backend. Few if any who walked this path, spent any effort on supporting database servers other than the one they use. This does not make for long-term viability, and does not support the community at large. Also, dbmail's schema has and will - quite possibly - change. Retrieving a message from the database was simple, but is no longer. Finally, not being able to leverage the scalability features included in the recent releases means you have to worry about a denial of service on your database server when the number of visitors of your webapp increases.

So enters a solution: dbmail-httpd. A simple event-driven daemon that will expose the object model of dbmail through a RESTful interface. A php5 userland module will also be provided both to test the interface and to demonstrate the power of this approach. Where possible, requested data will be returned as JSON. This makes parsing completely trivial of course.

It is our hope and expectation that this will drive development for a new class of webmail interfaces built on top of dbmail, but also enable construction of a new - and native - administration interface that will not require additional hosting sit-ups.

so stay tuned for more.

Interfacing with the DBMail database

Tuesday, May 20. 2008

A typical question that pops up now and then is about direct database access:

I'm looking for an application to help me save emails to a Database. I read about your email solution, DBMail, and it looks really good. I already have a mail server I'm using for my webmail, but my question is whether it would be possible to setup logging of emails to a database using my current mail server and DBMail.

DBMail uses a database to store its messages. Currently PostgreSQL, MySQL and Sqlite are supported. The intent of the database backend is to provide speed, scalability and integrity in storage. The database backend is not especially suited for direct access. The database schema is heavily normalized and contains numerous indexes and caching tables for speed, as well as trigger logic to ensure data integrity.

It's best to let DBMail manage the database contents and do message storage and retrieval through the appropriate mail protocols (LMTP for storage, POP3 or IMAP4 for retrieval). An additional advantage of this approach is, you can swap in or out any mail server under your webmail scripts layer. These protocols are widely used and well understood.