What's up with max_db_connections?

Thursday, December 5. 2013

Lately, I've been seeing a lot of people set the max_db_connections parameter to something very high. Don't do that!

Let me repeat: do not change max_db_connections unless you know what you are doing.

The long story: sometimes when you find that DBMail doesn't work as expected, you are tempted to turn a couple of knobs, change some parameters, see how that helps. For some reason the max_db_connections parameter looks too attractive not to touch.

This is mostly due to an apparently common misperception. max_db_connections is not about the number of concurrent clients you are able to handle. It's strictly about the number of concurrent database connections you are allowed to use. Libzdb is a very efficient database connection pool. Connections in the pool are shared between all concurrently connected clients, as needed. Libzdb will tune down the number of connections to the database if only a few are needed, so there is no immediate risk of a DoS attack on your database there. But DBMail does create a pool of worker threads the size of max_db_connections.

DBMail's IMAP server uses a pool of worker threads to handle communication with the database. IMAP often requires queries that take a non-trivial amount of time, and we don't want to block the main thread - which talks to your clients - while some query is in progress. So we offload that to worker threads. The size of the worker thread-pool is fixed so every possible database connection can operate in the context of its own worker thread.

This means that when you set max_db_connections to, say, 100, you will create 100 threads, and something like 5 database connections. How work is distributed over the pool of threads is up to GLib, but most threads will be idle most - if not all - of the time. However, what this does to your CPU is potentially quite bad. Just google for context-switching, or run 'vmstat 1' on your DBMail machine.

The only situation where you want to increase the max_db_connections parameter is when you see that all database connections in the libzdb pool are busy most of the time. An obvious symptom is when you see lots of

Thread is having trouble obtaining a database connection. Try [1]

especially when the value between brackets is higher than 1.
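
To get a feel for how often this happens, you could count those messages in your log. What follows is only a sketch: the function name pool_pressure is made up for illustration, and where your log lives depends on your own logging setup.

```shell
#!/bin/sh
# Count log lines that report trouble obtaining a database connection,
# and how many of those report a try number higher than 1.
# Reads log lines on stdin.
pool_pressure()
{
	sed -n 's/.*having trouble obtaining a database connection\. Try \[\([0-9][0-9]*\)\].*/\1/p' |
	awk '{ total++; if ($1 > 1) retried++ }
	     END { printf "warnings=%d retried=%d\n", total, retried }'
}

# Hypothetical usage (the log path is an assumption):
# pool_pressure < /var/log/dbmail.log
```

If the retried count stays at zero, raising max_db_connections is unlikely to buy you anything.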

Migrating from MySQL to PostgreSQL

Thursday, September 13. 2012

When you come, like me, from a LAMP background, using MySQL as the backend for DBMail seems natural.

As your database grows, little things at the edge of your mind start nagging you, or you've grown to like and appreciate PostgreSQL's more mature feature set. Or maybe you just dislike Oracle.

Migrating from MySQL to PostgreSQL is no simple feat. Typically, MySQL is very lax when it comes to accepting encoded strings. And this will bite you when you try to load a SQL dump from MySQL into PostgreSQL. And bite you hard it will. Especially if your DBMail database, like mine, dates back many years.
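
One way to find out early whether this will bite you is to check a dump for byte sequences that are not valid in the encoding you expect. The sketch below assumes the data is supposed to be UTF-8 and that iconv is installed; the function name is_valid_utf8 and the file name dump.sql are placeholders, not part of DBMail or the migration script.

```shell
#!/bin/sh
# Succeeds only if stdin is valid UTF-8. iconv exits non-zero when it
# meets a byte sequence that is not valid in the source encoding.
is_valid_utf8()
{
	iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical usage, with dump.sql standing in for your own dump file:
# is_valid_utf8 < dump.sql || echo "dump.sql contains invalid UTF-8"
```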

I started working on some migration script that lives in contrib/sql2sql/mysql2pgsql.sh. It's old and unfinished, because last time I tried to migrate my main dog-food installation, I failed.

But those little things kept nagging at me, so I started anew. And this time I was successful.

The mysql2pgsql.sh script will now simply work, though you will have to install a prerequisite (py-mysql2pgsql), and edit the YAML file included in the .../sql2sql/ directory.

You will also want to shut down your email services, because the export/import is not atomic.

happy migration!

Pruning the dbmail_headervalue table

Saturday, February 13. 2010

In DBMail 2.2 the dbmail_headervalue table will contain all header-values for all messages in store. Even for medium-sized installations this can easily result in a very large table.

However, since that table is used almost exclusively for IMAP search, you might consider dropping a couple of headernames from the cache.

For example:

DELETE FROM dbmail_headername WHERE headername = 'received';
will delete from dbmail_headervalue all entries regarding the Received header.

To determine which headernames should be dropped you could use a view like this:

CREATE VIEW header_count AS 
    SELECT count(1) AS count, n.id, n.headername 
    FROM dbmail_headervalue v 
    LEFT JOIN dbmail_headername n ON v.headername_id=n.id 
    GROUP BY n.id, n.headername;

After that you can do:

SELECT * FROM header_count ORDER BY count; 
and delete all headernames from dbmail_headername for those headers you deem unlikely to ever be used in IMAP search.
DELETE FROM dbmail_headername WHERE headername = 'Received';

Doing this for a couple of the most prolific but unused headers will drastically reduce the size of the dbmail_headervalue table. But remember you will have to keep an eye on the header_count, and re-issue the delete queries on a regular basis.
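
Since the deletes have to be repeated, it may help to generate them from a list of names. This is a minimal sketch, not something shipped with DBMail: the function name prune_headers_sql is invented here, and the names you pass should be spelled exactly as they show up in header_count.

```shell
#!/bin/sh
# Emit one DELETE statement per header name given as an argument.
# Names are limited to letters, digits and dashes so nothing needs escaping;
# anything else is skipped with a warning on stderr.
prune_headers_sql()
{
	for name in "$@"; do
		case "$name" in
			''|*[!A-Za-z0-9-]*)
				echo >&2 "skipping suspicious header name: $name"
				continue
				;;
		esac
		printf "DELETE FROM dbmail_headername WHERE headername = '%s';\n" "$name"
	done
}

# Hypothetical usage: print the statements for review first, then pipe
# them into your database client once you are happy with them.
# prune_headers_sql received x-spam-status
```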

DBMail on twitter

Tuesday, January 19. 2010

I've added a post-receive hook to my Git repository at git.dbmail.eu so you can stay in touch with dbmail changes via Twitter.

You can follow me at http://twitter.com/pjstevns.

The script I'm using is really simple. Some others might also find this useful:

#!/bin/sh

# copyright Paul Stevens, 2010, paul@nfg.nl
# licence GPLv2
#
# example hook script to send out twitter messages.
# This script will send out messages summarizing new revisions
# introduced by the change received
#
#
# Config
# ------
# hooks.twitterid
#   the username on twitter
# hooks.twitterpw
#   the password on twitter
# hooks.hashtag
#   insert a hashtag at the start of the message
# hooks.hashurl
#   replace hash signs in commit messages with an url. This
#   is used to link hash ids in messages to a bugtracker since
#   they typically refer to a bug-id.
#


GIT_DIR=$(git rev-parse --git-dir 2>/dev/null)
if [ -z "$GIT_DIR" ]; then
	echo >&2 "fatal: post-receive: GIT_DIR not set"
	exit 1
fi
projectdesc=$(sed -ne '1p' "$GIT_DIR/description")
# Check if the description is unchanged from its default, and shorten it to a
# more manageable length if it is
if expr "$projectdesc" : "Unnamed repository.*$" >/dev/null
	then
	projectdesc="UNNAMED PROJECT"
fi


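generate_message()
{
	oldrev=$(git rev-parse "$1")
	newrev=$(git rev-parse "$2")
	refname="$3"
	# one commit subject per line, each with ", " appended
	message=$(git log --pretty=oneline "${oldrev}..${newrev}" "$refname" | cut -f2- -d' ' | sed 's/$/, /')
	# as published this substitution is a no-op: it replaces the first '?' with itself
	hashurl=$(echo "$hashurl" | sed 's/?/?/')
	if [ -n "$hashtag" ]; then
		printf '#%s ' "$hashtag"
	fi
	# $message is left unquoted on purpose so the lines are joined with spaces
	echo $message | sed -e "s,#,${hashurl},g" -e 's/,$//'
}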

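send_twitter()
{
	message="$*"
	# the backslash keeps the URL on the same command line as curl;
	# --data-urlencode protects characters such as & and + in the message
	curl --basic --user "${twitterid}:${twitterpw}" --data-urlencode status="$message" \
		https://twitter.com/statuses/update.xml >/dev/null
}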




twitterid=$(git config hooks.twitterid)
twitterpw=$(git config hooks.twitterpw)
hashurl=$(git config hooks.hashurl)
hashtag=$(git config hooks.hashtag)

# --- Main loop
# Allow dual mode: run from the command line just like the update hook, or if
# no arguments are given then run as a hook script
if [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ]; then
	# Output to the terminal in command line mode - if someone wanted to
	# resend an email, they could redirect the output to sendmail themselves
	generate_message "$1" "$2" "$3"
else
	while read oldrev newrev refname
	do
		message=$(generate_message "$oldrev" "$newrev" "$refname")
		send_twitter "$message"
	done
fi
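
To see what the hashurl substitution does without touching a repository, the two sed steps can be tried in isolation. The snippet below is a stand-alone sketch of that part only: the function name summarize and the tracker URL are made up for the example.

```shell
#!/bin/sh
# Stand-alone version of the text munging in generate_message.
# $1 is the URL prefix that replaces each hash sign (hypothetical value below);
# stdin carries one commit subject per line.
summarize()
{
	hashurl="$1"
	message=$(sed 's/$/, /')
	# unquoted on purpose: word splitting joins the lines with single spaces
	echo $message | sed -e "s,#,${hashurl},g" -e 's/,$//'
}

# Hypothetical usage:
# printf '%s\n' 'fix #12' 'docs' | summarize 'http://bugs.example.org/view.php?id='
```

Note that a comma or an ampersand in the URL prefix would confuse sed here, since the comma is the delimiter and & has a special meaning in the replacement.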

Re-theming dbmail.org

Thursday, January 7. 2010

The look and functionality of dbmail.org has been a neglected stepchild for too long. So I decided to spend some time revising and redesigning.

The major goal was offering a more professional impression to first-time visitors, and providing a better insight into the dynamic of the project as a whole. This is why I made it a portal-style site, showing several small blocks of recent activity concerning code changes, news, and yes, blog entries like this one.

I'm not a graphics designer, but I still hope the designers among you won't be too offended by the lack of gloss and finish. Still, I did utilize one small trick by applying another project I've been busy working on: http://webfonts.biz, a font service.

Hope you like it.

A RESTful interface is coming

Friday, July 3. 2009

Again and again, dbmail administrators ask us about the best way for their application to talk to dbmail. And our answer has always been the same: talk IMAP.

However, IMAP is a notorious protocol for many. Not because it is inherently evil, but because creating a parser for IMAP-formatted data is not a trivial thing. It's simply a lot of work, and easy to get wrong. It's also top-heavy; a lot of the capabilities in IMAP are simply not needed at all by your typical webapp developer.

So the question was: how can we design a solution that will allow an application developer to quickly integrate with dbmail, using standard tools and languages?

Being forward compatible could be considered a secondary, but no less important, requirement. The chosen approach - short of IMAP - often comes as custom-built queries talking directly with the database backend. Few if any who walked this path spent any effort on supporting database servers other than the one they use. This does not make for long-term viability, and does not support the community at large. Also, dbmail's schema has changed and will, quite possibly, change again. Retrieving a message from the database was simple, but is no longer. Finally, not being able to leverage the scalability features included in the recent releases means you have to worry about a denial of service on your database server when the number of visitors of your webapp increases.

So enters a solution: dbmail-httpd. A simple event-driven daemon that will expose the object model of dbmail through a RESTful interface. A php5 userland module will also be provided both to test the interface and to demonstrate the power of this approach. Where possible, requested data will be returned as JSON. This makes parsing completely trivial of course.

It is our hope and expectation that this will drive development for a new class of webmail interfaces built on top of dbmail, but also enable construction of a new - and native - administration interface that will not require additional hosting sit-ups.

So stay tuned for more.

DBMail 2.3.6 released

Friday, July 3. 2009

I've just released dbmail 2.3.6, the latest development release.

There are still some rough edges in the packaging and documentation, but otherwise the code is approaching production-level quality.

New features in this release:

Single-instance header storage

The header caching tables used since 2.2 have been replaced with a new schema, optimized for a much smaller storage footprint, and therefore faster access. Headers are now cached using a single-instance storage pattern, similar to the one used for the message parts. This change also introduces views in the database for the first time, which is somewhat experimental because of some uncertainties with regard to the possible performance impact this may have.

Authentication logging

A new table was added to the schema to log a couple of key metrics for users connecting to one of the daemons.

Storage migration

dbmail-util now supports migrating your old content into the single-instance storage.

Of course, a number of bugs have also been fixed along the way:

  • 0000689: [Command-Line programs (dbmail-users, dbmail-util)] dbmail-exports fails with File size limit exceeded (paul) - resolved.
  • 0000775: [PIPE delivery (dbmail-smtp/dbmail-deliver)] Issue with multiple inline attachments (paul) - resolved.
  • 0000783: [General] Boundary missing in message construction (paul) - resolved.
  • 0000681: [General] message reconstruction fails on message (paul) - resolved.
  • 0000774: [IMAP daemon] SQLException using dbmail-imapd - resolved.
  • 0000766: [POP3 daemon] dbmail-pop3d crash (paul) - resolved.
  • 0000754: [General] single instance storage for headervalues (paul) - resolved.
  • 0000760: [LMTP daemon] DNS Regresion in 2.3.5 (netvulture) - resolved.
  • 0000743: [LMTP daemon] Memory leak in lmtpd (paul) - resolved.
  • 0000755: [POP3 daemon] POP3D crash when fetchmail tries to connect (paul) - resolved.
  • 0000720: [Command-Line programs (dbmail-users, dbmail-util)] Missing operations on dbmail-util (paul) - resolved.

Changelog

Download

DBMail 2.3.4 released

Friday, November 14. 2008

It is with great pleasure that I'm announcing the availability of DBMail version 2.3.4, the latest in the 'unstable' development series.

The main focus of this release has been stability. I hope and expect this version to mark the final milestone before 2.4.0.

Special thanks to Jonathan Feally whose help in fixing bugs and adding features was invaluable.

The only new feature that deserves special attention is the new fine-grained logging mechanism written by Jon.

Also, IMAP-IDLE works again without any problems.

Changelog

Download

happy testing

DBMail 2.2.11 second release candidate

Tuesday, October 7. 2008

I've finally been able to get back into the rhythm. The buildup towards 2.4 is progressing nicely, and I held a small bug-squasher for 2.2.

So here it is: dbmail-2.2.11 second release candidate; way overdue - sorry about that.

Things changed since 2.2.10

  • 0000731: [Documentation] Missing documentation of database layer logging control (paul)
  • 0000723: [Database layer] simultaneous mailbox creation (paul)
  • 0000709: [Database layer] Some sql optimizations (paul)
  • 0000725: [IMAP daemon] Fix Thunderbird and ACL shared folders (paul)
  • 0000721: [Authentication layer] mail quota in ldap not used during delivery (paul)
  • 0000698: [IMAP daemon] PostgreSQL 8.3.1 can't execute query (paul)
  • 0000712: [General] traces to stderr may cause core dumps if hostname >=16 (paul)
  • 0000710: [IMAP daemon] eliminate annoying "[Illegal seek] on read-stream" message from imap4d
  • 0000704: [IMAP daemon] IMAP TEXT searches stop at headers
  • 0000670: [IMAP daemon] IMAP TEXT searches only seem to search headers (paul)

Code as a Rhythm

Wednesday, July 2. 2008

Changes don't happen overnight. Or at least, that's not the end of it.

Working on big invasive changes in production code poses its own challenges. But after going through the motions over a couple of major releases, I'm learning something. The actual work at hand may be breaking new terrain in my personal corner of the world, whether it be learning about C, IMAP, MIME, or wrapping my head around threads and event-driven designs. Yet the kind of work, effort, or mental labour if you will, continues to be much the same all along.

It's not just about 'sticking with it' - though persistence does help a lot. It's also about feeling the rhythm, and respecting it. Just don't rush it, keep the pace, feel the beat, if you know what I mean.

There are bigger cycles, with smaller ones within. Changes are steps, bound in a pattern all of their own: from a prototype 'proof of concept' change, to a 'pattern established: implement' change, to a consistency change wrapping it up. Consistency changes demonstrate and finalize one of those cycles. They clean up and codify a new element of elegance, beauty if you will. It's when the code at hand starts to smell right again, and any bugs are easily tracked down and fixed with minimal effort. Until of course, the downhill ride ends where the next hill rises to the horizon...

DBMail 2.3.3 released

Monday, June 2. 2008

DBMail just received a huge performance boost: version 2.3.3 released today features a shiny new networking/database core.

The new shared database connection pool drastically reduces the number of database connections (and backend network sockets) required to serve large amounts of concurrent frontend users.

The frontend itself, meanwhile, has been rewritten as an asynchronous event-driven process.

Combined, these changes provide solid fundamentals for a future 2.4.x release series focused on performance and scalability.



Multifoo lift-off

Friday, May 23. 2008

Today I've reached a major milestone in the multifoo rewrite.

For those of you who don't know what I mean by multifoo: it's a term that was coined (afaik) by Aaron Stone to describe a server design for dbmail with multiplexed asynchronous network IO and multi-threaded command processing.

Last week I found a pattern to make all IMAP commands run in the thread-pool to keep them from blocking the main thread. But somehow I kept being bitten by heisenbugs, which made me reconsider. Last night, however, I had an epiphany: there were still some calls used while running in the thread-pool that sent data over the network. The bug was quickly fixed today, and my design saved. Imaptest no longer generates any errors, and basic message browsing works using Thunderbird.

I can now start testing more fat clients such as Thunderbird and Outlook against what should become a release candidate for 2.3.3. Looking very good indeed.

Interfacing with the DBMail database

Tuesday, May 20. 2008

A typical question that pops up now and then is about direct database access:

I'm looking for an application to help me save emails to a Database. I read about your email solution, DBMail, and it looks really good. I already have a mail server I'm using for my webmail, but my question is whether it would be possible to setup logging of emails to a database using my current mail server and DBMail.

DBMail uses a database to store its messages. Currently PostgreSQL, MySQL and SQLite are supported. The intent of the database backend is to provide speed, scalability and integrity in storage. The database backend is not especially suited for direct access. The database schema is heavily normalized and contains numerous indexes and caching tables for speed, as well as trigger logic to ensure data integrity.

It's best to let DBMail manage the database contents and do message storage and retrieval through the appropriate mail protocols (LMTP for storage, POP3 or IMAP4 for retrieval). An additional advantage of this approach is, you can swap in or out any mail server under your webmail scripts layer. These protocols are widely used and well understood.
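
As a rough illustration of what "LMTP for storage" looks like on the wire, the sketch below prints the client side of an LMTP dialogue (RFC 2033) for one message. It is an assumption-laden example: the function name lmtp_dialogue is invented, the addresses are placeholders, and a real client should read and check the server's reply after every command instead of sending everything in one go.

```shell
#!/bin/sh
# Print the client side of an LMTP session delivering the message on stdin.
# $1 is the envelope sender, $2 the recipient.
lmtp_dialogue()
{
	printf 'LHLO localhost\r\n'
	printf 'MAIL FROM:<%s>\r\n' "$1"
	printf 'RCPT TO:<%s>\r\n' "$2"
	printf 'DATA\r\n'
	# strip stray CRs, double any leading dot, end every line with CRLF
	tr -d '\r' | sed -e 's/^\./../' | awk '{ printf "%s\r\n", $0 }'
	printf '.\r\nQUIT\r\n'
}

# Hypothetical usage (host, port and the nc tool are assumptions about
# your setup, check the LMTP section of your dbmail.conf):
# lmtp_dialogue sender@example.org user@example.org < message.eml | nc localhost 24
```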

Welcome to my dbmail experience

Thursday, May 15. 2008

Here I will come to share what it means for me to work on this little open-source project called dbmail. Hopefully it will be as much of a learning experience to write about it, as it is to work on it.

But first, a disclaimer: whatever I tell you, don't take my word for it! Seek confirmation from those you trust. If you want to talk about a specific problem, seek the best specialist you can find.

Dbmail, for me, is about writing code, enjoying the act of creation, learning to love C, dealing with matters of consistency, assurance, and trust. It's about getting it right. Making mistakes. And being honest about both.

Communities of trust don't just happen. They take a lot of effort, affection and rejection. Trust yourself as well as others. Where fear or lack of confidence do not deflect you from pursuing your vision, experience will be your teacher. Perseverance will take us halfway through any task. Stick to it, and enjoy the ride.