How to Migrate Emails from Maildir to Gmail

As previously reported, I have moved my main email from my own mail server to Gmail hosted on Google Apps. Signing up was easy. Moving DNS records was pretty straightforward (a few clicks if your domain is with DreamHost). Getting IMAP up and running requires one simple setting inside Gmail, plus reconfiguring your MUA (Thunderbird, Outlook, etc.).

The challenge for me, though, was moving all my past emails from my mail server (running Postfix + Dovecot) to Gmail. Although I am usually a “deleter” (rather than an archiver), I have still kept some emails from as far back as 1997. Over the years I have accumulated more than 10,000 emails in Maildir format on my server, and somehow I needed to move them to Gmail.

So I tried connecting to my Gmail account using Thunderbird + IMAP and manually dragging all the emails over. That was a disaster. For example, if I highlighted 100 emails and dragged and dropped them into “All Mail” under Gmail, and the operation failed halfway through (which happened all the time), I might end up with 50 random emails inside Gmail, none of which had been deleted from my old account. That meant I had to manually figure out exactly which emails had been copied over, and that is quite a tedious process. You can make it safer by dragging smaller batches over (say 10 at a time), but that is not a good idea if you have 10,000+ emails waiting to be moved.

Being a lazy programmer, I thought the easiest way would be to write a small program that automates this. It would upload one email at a time, and if the operation failed, it would know where to resume. The end result? A small Python script, conveniently named maildir2gmail.py.

Download

Usage

This script works through all the files in a directory, works out which ones are RFC822 email messages, and then pushes those files up to Gmail over an IMAP connection. It also remembers the names of the files it has already processed, so if the program dies for some reason (due to a bug, for example), you can just restart it and it will carry on from where it stopped. This script is provided “as is” with no warranty. It worked for me and migrated all 10,000+ of my old emails to Gmail, but YMMV.
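To make that description concrete, here is a minimal sketch of the same kind of loop. It is not the actual maildir2gmail.py: the function names, the plain-text progress log, the header-based check for "is this an email", and the use of each file's modification time as the message date are all assumptions made for illustration. It expects an imaplib connection that is already logged in.

```python
import email
import imaplib
import os


def looks_like_email(raw):
    """Rough heuristic (an assumption): a message has a From or Date header."""
    msg = email.message_from_bytes(raw)
    return msg["From"] is not None or msg["Date"] is not None


def load_done(log_path):
    """Return the set of file names recorded by earlier runs."""
    if not os.path.exists(log_path):
        return set()
    with open(log_path, encoding="utf-8") as fh:
        return {line.rstrip("\n") for line in fh}


def upload_dir(imap, maildir, folder, log_path):
    """Append each not-yet-uploaded message in maildir to folder.

    Returns the number of messages uploaded in this run.
    """
    done = load_done(log_path)
    # Quote the mailbox name ourselves so names with spaces survive.
    mailbox = '"%s"' % folder.replace("\\", "\\\\").replace('"', '\\"')
    uploaded = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for name in sorted(os.listdir(maildir)):
            path = os.path.join(maildir, name)
            if name in done or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                raw = fh.read()
            if not looks_like_email(raw):
                continue
            # Assumption: the file's mtime stands in for the delivery date.
            stamp = imaplib.Time2Internaldate(os.path.getmtime(path))
            typ, _ = imap.append(mailbox, "(\\Seen)", stamp, raw)
            if typ != "OK":
                raise RuntimeError("APPEND failed for %s" % path)
            log.write(name + "\n")
            log.flush()  # record progress right away so a crash can resume
            uploaded += 1
    return uploaded
```

Writing each file name to the log immediately after a successful APPEND is what lets a restarted run skip the messages that already made it across.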

To run it:

Usage: maildir2gmail.py [options] [maildirs]

Upload email messages from a list of Maildir to Google Mail.

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -f FOLDER, --folder=FOLDER
                        Folder to store the emails. [default: All Mail]
  -p PASSWORD, --password=PASSWORD
                        Password to log into Gmail
  -u USERNAME, --username=USERNAME
                        Username to log into Gmail

For example, to move my whole inbox to Gmail’s “All Mail”, and all my sent mail to Gmail’s “Sent Mail”:

$ python maildir2gmail.py -u username@gmail.com -p password ~/.maildir/cur
$ python maildir2gmail.py -u username@gmail.com -p password -f "Sent Mail" ~/.maildir/.Sent/cur

It will then print out which message it is working on. Go to sleep, and hopefully all messages will be migrated when you wake up in the morning :) On my old home server (AMD Duron 1GHz in Sydney), it took around 1-2 seconds per message. On a 64MB VPS I had with RapidXen in Fremont CA, it was doing around 2-3 messages per second.

Hopefully it will be helpful to some.

Category: Technology | Sat, 17 January 2009 2:02 pm

Comments

1.
Posted by giulio on Wed, 4 February 2009 2:27 am

I tried the script on Ubuntu 8.10 (python 2.5). The script lists the messages and updates the database, but they do not show up in gmail. I have tried replacing “imap.gmail.com” with “imap.googlemail.com” (which is the address I use to connect to gmail) but nothing changes.

Is there any python module I should install (email is there)?

Kind regards,
Giulio


2.
Posted by giulio on Wed, 4 February 2009 3:09 am

Sorted by also replacing
self.folder = '[Gmail]/%s' % self.folder
with
self.folder = '[Google Mail]/%s' % self.folder

Works a charm!

Many thanks
Giulio


3.
Posted by scotty on Wed, 4 February 2009 5:07 pm

@giulio: yeah. Depending on your locale setting, Gmail actually names its mailboxes differently. For en_US it would be [Gmail], but for en_GB it would be [Google Mail]…

Maybe I’ll add this kind of detection into the code.
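One possible shape for that detection, as a sketch only: ask the server for its mailbox list and see which of the two prefixes mentioned in this thread appears. The function name and the fallback default are assumptions, and the script itself may end up doing this differently.

```python
# The two prefixes mentioned in the comments above.
KNOWN_PREFIXES = ("[Gmail]", "[Google Mail]")


def detect_gmail_prefix(list_lines, default="[Gmail]"):
    """Pick the system-folder prefix out of IMAP LIST response lines.

    list_lines is the second element of the tuple returned by
    imaplib's IMAP4.list(); entries that are not bytes/str are skipped.
    """
    for line in list_lines:
        if isinstance(line, bytes):
            text = line.decode("utf-8", "replace")
        elif isinstance(line, str):
            text = line
        else:
            continue
        bare = text.rstrip().rstrip('"')
        for prefix in KNOWN_PREFIXES:
            # Match either the parent mailbox itself or any child of it.
            if prefix + "/" in text or bare.endswith(prefix):
                return prefix
    return default
```

After logging in, this would be used along the lines of typ, lines = imap.list() followed by prefix = detect_gmail_prefix(lines), and the folder name then becomes '%s/%s' % (prefix, folder).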


4.
Posted by Joseph on Wed, 25 February 2009 9:48 pm

Nice script, wish I had something like this when I migrated our family over. I used IMAP which is slooooow, took a week to get some accounts over.


5.
Posted by Alexander Gieg on Thu, 26 February 2009 5:01 am

This seems to be a nice workaround for Thunderbird’s problems in correctly uploading messages to Gmail, especially when one uses the “ImportExportTools (MboxImport enhanced)” TB extension, which allows saving messages as individual .eml files. However, I’m finding some difficulties with this.

My guess is it’s a problem at Gmail itself, as since a few hours ago I’m unable to upload anything at all with TB (TB keeps saying it’s sending the password, again and again and again, which in fact is why I started searching around for a solution and found your script), but I’d like to be sure, so here’s what I get:

alexgieg@here:~$ ./maildir2gmail.py -u example@gmail.com -p example -f OLD ./test/cur
[15:49:00]: Sending "Test E-mail" (4543 bytes)
[15:49:02]: Connected to Gmail IMAP
[15:49:02]: Unable to send ./test/cur/Test E-mail.eml
Traceback (most recent call last):
  File "./maildir2gmail.py", line 167, in <module>
    main()
  File "./maildir2gmail.py", line 147, in main
    gmail.append(filename)
  File "./maildir2gmail.py", line 63, in append
    self.imap.append(self.folder, '(\\Seen)', timestamp, content)
  File "/usr/lib/python2.5/imaplib.py", line 318, in append
    return self._simple_command(name, mailbox, flags, date_time)
  File "/usr/lib/python2.5/imaplib.py", line 1055, in _simple_command
    return self._command_complete(name, self._command(name, *args))
  File "/usr/lib/python2.5/imaplib.py", line 885, in _command_complete
    typ, data = self._get_tagged_response(tag)
  File "/usr/lib/python2.5/imaplib.py", line 986, in _get_tagged_response
    self._get_response()
  File "/usr/lib/python2.5/imaplib.py", line 903, in _get_response
    resp = self._get_line()
  File "/usr/lib/python2.5/imaplib.py", line 996, in _get_line
    line = self.readline()
  File "/usr/lib/python2.5/imaplib.py", line 1162, in readline
    char = self.sslobj.read(1)
socket.error: (104, 'Connection reset by peer')
alexgieg@here:~$

I don’t know Python, so please tell me whether I can try something else, or if the only solution is to wait until whatever is wrong with Gmail’s IMAP right now gets solved.

Anyway, thanks for the script. It’ll be very useful once things start working again!


6.
Posted by troy on Sun, 8 March 2009 7:37 pm

1. Would you expect this to work with Thunderbird mailboxes?
2. Would you expect this to work on Windows?
3. Any idea why I might get this error message? (Sorry, I don’t know python yet)

File "maildir2gmail.py", line 104
return u' '.join(result)
^
SyntaxError: invalid syntax


7.
Posted by troy on Sun, 8 March 2009 7:38 pm

[ignore--just putting this here so i get followup comments via email]


8.
Posted by scotty on Sun, 8 March 2009 8:43 pm

@troy — I do not know whether it works with Thunderbird mailboxes, however I do not think it would be too hard coding a solution for it. Working on Windows might take a bit of effort. Basically it’s not tested :)


9.
Posted by Vahid Pazirandeh on Wed, 11 March 2009 3:22 am

Nice script (haven’t used it, but python is always sexy :)

Consider also using Google’s migration tool called Email Uploader: http://mail.google.com/mail/help/email_uploader.html

And more migration help from google: http://www.google.com/support/a/bin/answer.py?answer=61369


Posted by Mr JM on Tue, 17 March 2009 8:54 am

I was contemplating moving to Gmail for a while, but did not want to give up 10 years of .maildir folder…

That Python script just made me move over. I made a few changes to it to suit my needs, wrapped a shell script around it, and I am all converted (it took 3 days, though).

Love it !!!!


Posted by HellMind on Wed, 25 March 2009 8:40 am

There is no error.
But my gmail is clean
What’s wrong
How can I debug this?


Posted by HellMind on Wed, 25 March 2009 9:04 am

[00:06:52]: Connected to Gmail IMAP
06:52.20 > CNPK2 APPEND "[Gmail]/All Mail" (\Seen) "11-Mar-2009 13:06:36 +0100" {231839}
06:52.24 < + go ahead
06:52.24 write literal size 231839
06:52.84 CNPK3 LOGOUT
06:52.90 < * BYE LOGOUT Requested
06:52.90 BYE response: LOGOUT Requested
06:52.90 < CNPK3 OK 73 good day (Success)


Posted by HellMind on Wed, 25 March 2009 9:09 am

My fix :=> self.imap.append('INBOX',


Posted by HellMind on Wed, 25 March 2009 9:17 am

How do I know if that append is a success :(
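On that question: imaplib's append() returns a (type, data) pair, and the type is 'OK' when the server accepted the message. A small wrapper along these lines would make failures loud instead of silent. The wrapper name is made up for this example and is not part of the script.

```python
import imaplib


def append_checked(imap, mailbox, flags, date_time, raw):
    """Append one message and raise unless the server answers OK.

    imap is a logged-in imaplib connection; the other arguments are
    passed straight through to its append() method.
    """
    typ, data = imap.append(mailbox, flags, date_time, raw)
    if typ != "OK":
        # A NO reply is returned to the caller rather than raised by
        # imaplib, so it has to be checked explicitly here.
        raise imaplib.IMAP4.error(
            "APPEND to %s failed: %s %r" % (mailbox, typ, data)
        )
    return data
```

Setting the connection's debug attribute to 4 or higher (for example imap.debug = 4) also makes imaplib print each command and reply, which gives a trace similar to the one pasted above.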


Posted by Nick C on Sun, 5 July 2009 3:35 am

This *mostly* worked – interestingly, the dates on a small minority of the uploaded emails weren’t preserved, and were instead dated with the date of the import rather than the date the email was sent. This only happens for a small percentage of the emails (say around 5%), but is still annoying enough for me to not use the script (others may be less bothered though.)

Still, thanks a lot for posting your solution online!


Posted by Nick C on Sun, 5 July 2009 5:35 am

as a follow-up, this is likely not due to your script, as I get the same error with a couple of other scripts I’ve tried, all of which are correctly parsing the date from the maildir files, and correctly setting the time when appending the message to the imap message list (yours does this too.) This always happens on the same messages, for all scripts, so I suspect the issue may somehow be at Google’s end (particularly given their warning about that problem likely occurring when uploading messages via imap.)
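One way to narrow this down, offered only as a guess: check whether the affected messages have a missing or unparseable Date header. Nothing in this thread confirms that this is the cause, and the helper below is hypothetical rather than part of any of the scripts mentioned.

```python
import email
from email.utils import parsedate_to_datetime


def header_date(raw):
    """Return the message's Date header as a datetime.

    Returns None when the header is missing or cannot be parsed.
    """
    value = email.message_from_bytes(raw)["Date"]
    if value is None:
        return None
    try:
        # str() covers the case where the header comes back as a
        # Header object instead of a plain string.
        return parsedate_to_datetime(str(value))
    except (TypeError, ValueError):
        # Different Python versions signal a bad date differently.
        return None
```

Running this over the maildir files and listing the ones where it returns None would show whether they are the same messages that end up with the import date.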


Posted by Konstantin on Sun, 2 August 2009 3:32 am

Thank you for providing this extremely helpful script!!! It allows me to complete an important item that had been on my to-do list for three years: importing all or most of my pre-Gmail email.

The script occasionally aborts when there is an error in a specific message. In those cases, I simply delete the message in question and rerun the script. It then continues to upload the remaining messages. I have not run into any other problems.

All my old mail was meticulously organized in folders. Therefore, for each Gmail label, I perform a separate import of the corresponding maildirs. I then perform a complex query (using the before: operator, among others) to isolate the recently imported messages and label them. Then I append to the complex query a negation of that label.

The whole process is taking several days requiring infrequent intervention. But we are talking about ten years of mail taking up over 5GB.


Posted by Tob on Sun, 30 August 2009 8:11 pm

Thanks for this very handy script!

A very easy fix for the problem Konstantin describes, getting IMAP folders to correspond to Gmail labels, is to comment out lines 22 and 23, i.e.

# else:
# self.folder = '[Gmail]/%s' % self.folder

This prevents the script from prepending the [Gmail] string to the folder name which makes the e-mail show up under the specified label instead.

If you have a look at Gmail’s IMAP tree using a mail client like Thunderbird, you see that labels are folders directly at the root, while the “real” folders like “All Mail” are subfolders of [Gmail].
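Rather than commenting the lines out, the same idea could be expressed as a small lookup: only names that are known system folders get the prefix, and everything else is treated as a label at the root. This is a sketch, not a patch to the script. The function name is made up, and the list of system folder names is an assumption based on an English-language account.

```python
# Assumed set of Gmail system folders for an English (US) account.
SYSTEM_FOLDERS = {"All Mail", "Sent Mail", "Drafts", "Spam", "Trash", "Starred"}


def resolve_mailbox(name, prefix="[Gmail]"):
    """Map a folder name given on the command line to an IMAP mailbox.

    System folders live under the prefix; any other name is treated
    as a label, which Gmail exposes as a folder at the root.
    """
    if name.upper() == "INBOX":
        return "INBOX"
    if name in SYSTEM_FOLDERS:
        return "%s/%s" % (prefix, name)
    return name
```

With this, -f "Sent Mail" would still end up in [Gmail]/Sent Mail, while -f OLD would show up under the label OLD.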

