Analyzing Exchange and mbox e-mail files using Free and Open Source Software

 

 

Mike Harrington, CFCE EnCE

mailto: linuxchimp@gmail.com

Innovative Digital Forensic Solutions, L.L.C.

 

Mark Lachniet, CISA CISSP

mailto: mlachniet@analysts.com

Analysts International

 







1.   Document Overview

 

E-mail is everywhere, and the digital forensic examiner is often faced with the task of searching e-mail for evidence of wrongdoing. This paper outlines a simple methodology for using free and open source tools to convert Microsoft Outlook or Outlook Express files into a flat mbox format that can then be manually imported into the Mozilla Thunderbird e-mail client for viewing, or manipulated using other useful scripts.  This document is a primer for basic e-mail analysis, and is intended to be a living document.  If you have any questions, comments or suggestions (including sections that you think should be added!) please contact the authors directly.

 

The paper is divided into several sections. Section two details installing Libpst and Libdbx to convert the Outlook and Outlook Express files. Section three deals with finding the .dbx or .pst e-mail files.  Section four details converting the found .dbx or .pst files into the flat mbox format using the readdbx or readpst tools that were compiled in section two. The fifth section covers how to import these converted files into the Mozilla Thunderbird e-mail client for viewing.  The sixth section discusses how to parse mbox files into threaded HTML documents and extract attachments for easy searching and manipulation.  The last section discusses other useful tools and tricks that could be of use to the examiner.

 

Throughout this paper the examples are based on my forensic laptop, an AMD64 machine running Gentoo with an x86_64 2.6.12 kernel (see note 1). The examples should work the same way on x86-based machines and on other UNIX-type systems in general.

 

2.   LIBPST/LIBDBX

 

The readdbx and readpst executables are built from the Libdbx and Libpst source code respectively.  You can find the source for both at the following site.

 

http://sourceforge.net/project/showfiles.php?group_id=18756&release_id=117314

 

(Of course, using Gentoo one only needs to use the commands 'emerge libdbx' or 'emerge libpst'...;-)

 

Once you've downloaded the source to a download location of your choice (in this case I've downloaded the source to '/usr/local/forensicapps') you need to untar and unzip the archives.

 

chimp forensicapps# tar xvzf libdbx_1.0.3.tgz

chimp forensicapps# tar xvzf libpst_0.3.4.tgz

 

Then change into the directory for libdbx.

 

chimp forensicapps# cd libdbx_1.0.3

chimp libdbx_1.0.3# make

 

You should now have a file called readdbx in this directory. Make sure it's executable by issuing the following command.

chimp libdbx_1.0.3# chmod +x readdbx

Now move the executable to a directory in your path such as /usr/local/bin.

chimp libdbx_1.0.3# mv ./readdbx /usr/local/bin

 

Repeat the preceding steps to untar and compile readpst.  You will then have a file named readpst that you can make executable by the same method described above.  Also move it into a directory in your path.
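
Before moving on, it can help to confirm that both converters really are reachable from your path. The following is only a minimal sketch; check_tools is a hypothetical helper name chosen for this example and is not part of Libdbx or Libpst.

```shell
# check_tools NAME...: report whether each named program is on the PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            printf 'found    %s -> %s\n' "$tool" "$path"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Running 'check_tools readdbx readpst' should list both programs as found if the steps above worked.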

 

That's it! You can now move on to the next section, which details how to locate the .dbx and .pst files that you want to convert.

 

3.   Locating Exchange .dbx/.pst Files

 

The next required step is to find the e-mail files (mailboxes) that you want to analyze.  To do this, you can either find a copy on the client workstation, or export them from an Exchange server.

 

3.2 Locating files in the filesystem

 

3.2.1 Deleted Files

 

First of all, you should determine whether or not there may be copies of e-mail DBX and PST files in the deleted and slack portions of the file system.  You may wish to use an automated forensic program such as SMART (http://www.asrdata.com/tools/) to see if it is possible to recover any older, deleted files.  SMART can also be used to extract pure unallocated data for you to concentrate on exclusively. Remember deleted files may contain the “smoking gun” you are looking for!

 

The approach we'll cover here uses the Foremost carving tool (http://foremost.sourceforge.net). Since we can't assume that everyone is using a distro with a decent package management tool (Gentoo anyone?) let's grab the source and compile it ourselves (remember to check the md5sum of the download).
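
Checking the md5sum can be wrapped in a small function so that a mismatch is hard to overlook. This is a sketch only: verify_md5 is a hypothetical helper name, and the expected checksum has to come from the download site, since none is given in this paper.

```shell
# verify_md5 FILE EXPECTED: compare the MD5 of FILE with the published value.
verify_md5() {
    file=$1
    expected=$2
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
    # md5sum prints "<hash>  -" when reading stdin; keep only the hash.
    actual=$(md5sum < "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

For example, 'verify_md5 foremost-069.tar.gz <checksum from the site>' prints OK only when the two values agree.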

 

 

 

Now with the source downloaded let's extract it (I've downloaded the source to my temp directory).

 

chimp temp# tar xvzf foremost-069.tar.gz

chimp temp# cd foremost-069

chimp foremost-069# less README

chimp foremost-069# make && make install

 

The first command extracts the gzipped tar archive, and the README file explains how to compile and install ('make && make install'). One thing to note is that the foremost.conf file, which contains the header and footer information for the file types you want to carve, needs to be in the directory you run foremost from.

 

Take a peek inside the foremost.conf file to see how it's formatted and what types of files are already supported. For our purposes simply open up foremost.conf in a text editor and uncomment (erase the '#' at the beginning of the line) the .dbx (or .mbx, .pst) line. They are located in the Microsoft Office section.

 

chimp bin# nano -w foremost.conf
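
If you would rather not edit the file by hand, the same change can be sketched with sed. This assumes that each entry in foremost.conf starts with the file extension followed by a 'y' or 'n' case-sensitivity field, so check your copy of the file first. enable_types is a hypothetical helper name, and it writes the result to standard output instead of changing the file in place.

```shell
# enable_types CONF EXT...: print CONF with the entries for the given
# extensions uncommented. Only the leading '#' is removed.
enable_types() {
    conf=$1
    shift
    # Join the extensions into an alternation such as "dbx|pst".
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}
    # Match "#", optional whitespace, the extension, then the y/n field,
    # so that ordinary prose comments are left alone.
    sed -E "s/^#([[:space:]]*(${pattern})[[:space:]]+[yn][[:space:]])/\1/" "$conf"
}
```

A typical use would be 'enable_types foremost.conf dbx pst > foremost.conf.new', followed by a diff of the two files before replacing the original.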

 


Now with that done you need to run foremost over your image files. Foremost requires an empty directory in which to dump the files it finds.  It also keeps an audit of the files it finds and the offsets in the image file where they were found.
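
For a single image the run might look like the sketch below. The option order follows the foremost invocation used elsewhere in this paper (image file as a plain argument); newer foremost releases take the image through '-i', so check 'foremost -h' on your system. carve_image and the FOREMOST override variable are hypothetical names invented for this example.

```shell
# carve_image IMAGE OUTDIR: run foremost on one image, refusing to reuse
# an output directory that already contains files.
carve_image() {
    image=$1
    outdir=$2
    if [ -d "$outdir" ] && [ -n "$(ls -A "$outdir")" ]; then
        echo "refusing to carve into non-empty directory: $outdir" >&2
        return 1
    fi
    # FOREMOST is a hypothetical override, e.g. FOREMOST=echo for a dry run.
    ${FOREMOST:-foremost} -v -o "$outdir" "$image"
}
```

Setting FOREMOST=echo first prints the command line that would be run, which is a cheap way to check the paths before carving a large image.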

 

 

 

What if you have multiple image segments? No worries mate! One of the cool things that foremost can do is create output directories on the fly…so let's just write a script to take care of our multiple segments.

 

First make the initial output directory (you could script this as well..;-))

 

chimp evid# mkdir carvdbx

 

Now the script:

 

#!/bin/bash
# Counter used to give each segment its own output directory
x=0
# Loop through every image segment in the directory
for i in /your/image/dir/*
do
    # Carve with verbose output turned on and write to a per-segment directory
    foremost -v "$i" -o "/your/output/dir$x"
    # Increment the value of 'x' by one
    x=$((x + 1))
done

 

With the files carved proceed on...

 

3.2.2 Allocated Files

 

The most common location for .dbx files to be located is in the following path (on a Windows XP box).

 

C:\Documents and Settings\<User>\Local Settings\Application Data\Identities\{GUID}\Microsoft\Outlook Express

 

Common .dbx files you might see in this location include Inbox.dbx, Sent Items.dbx and Drafts.dbx. There might be others as well. Simply copy these files out to a directory on your forensic drive (in my example the suspect NTFS partition is mounted read-only at '/mnt/win'). In the command, substitute the suspect's account name for $USER and the identity's GUID for {GUID}.

chimp ~# cp "/mnt/win/Documents and Settings/$USER/Local Settings/Application Data/Identities/{GUID}/Microsoft/Outlook Express/"*.dbx /mnt/evidence/e-mail/dbx/

 

If you want to make sure you're not missing any .dbx files you can use the find command to locate them and copy them over to your forensic directory.

 

chimp ~# find /mnt/win -type f -name "*.dbx" -print -exec cp '{}' /mnt/evidence/e-mail/dbx \;

 

Passing the '-print' parameter to the find command gives you a nice output of what is being found and copied over.  Omit this to suppress the output.
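
One catch with copying everything into a single directory is that two mailboxes with the same name (say, an Inbox.dbx from each of two identities) will overwrite each other. The sketch below is one possible way around that, not the method used above: it prefixes each copy with the file's MD5 and logs where it came from. collect_mail and sources.md5 are hypothetical names, '-iname' (case-insensitive matching) is a GNU/BSD find extension, and the destination should be outside the tree being searched.

```shell
# collect_mail SRC DEST: copy every .dbx/.pst under SRC into DEST as
# <md5>_<original name>, and append "<md5>  <source path>" to sources.md5.
collect_mail() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    find "$src" -type f \( -iname '*.dbx' -o -iname '*.pst' \) -exec sh -c '
        dest=$1
        shift
        for f in "$@"; do
            sum=$(md5sum < "$f" | cut -d" " -f1)
            cp -p -- "$f" "$dest/${sum}_$(basename -- "$f")" || exit 1
            printf "%s  %s\n" "$sum" "$f" >> "$dest/sources.md5"
        done
    ' sh "$dest" {} +
}
```

For example, 'collect_mail /mnt/win /mnt/evidence/e-mail/collected' gathers both file types in one pass and leaves a record of the original paths.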

 

The procedure for finding .pst files is exactly the same. The default location on a Windows XP box for .pst files is the following path.

 

C:\Documents and Settings\$USER\Local Settings\Application Data\Microsoft\Outlook\

 

Got all that? Good.  Now we can progress onto the next section where we detail how to convert our newly found files into a flat mbox format that will be easily imported into the Thunderbird e-mail client.

 

3.3 Exporting from Exchange

 

In the event that you don’t have access to a user’s workstation, but do have administrator access to the Exchange server, you may be able to export a user’s data to a PST file using the ExMerge program.  To download this file, refer to:

 

http://www.microsoft.com/downloads/details.aspx?displaylang=en&familyid=429163ec-dcdf-47dc-96da-1c12d67327d5

 

According to the documentation contained in this download, “You can use the program to extract data from one or more Exchange mailboxes into .pst files”.  You may wish to run this program if you have to recover some very old data, perhaps as part of a legal discovery process.  For example, if all that exists within an organization are backup tapes, you may have to build up a server, restore from tape, and then use the ExMerge program to extract that user’s old e-mail spool to a PST file for analysis.

 

4.   Converting .dbx/.pst files

 

OK, so you've found your files and copied them over to the forensic directory of your choice.  It’s now time to convert those bad boys into a flat mbox format that can be easily imported into the Mozilla Thunderbird e-mail client or parsed with handy tools.

 

First change into the directory you copied the files into.

 

chimp ~# cd /mnt/evidence/e-mail/dbx

 

Now make a directory for your decoded .dbx files.

 

chimp dbx# mkdir ../decoded

 

After doing this it's time to convert the files into our mbox format. We accomplish this with a little for loop in our /mnt/evidence/e-mail/dbx directory.

 

chimp dbx# for X in *.dbx; do /$pathto/readdbx -f "$X" -o /$pathof/forensic/directory/"$X.$$"; done

 

Make sure to substitute the path to the tool and the path to your forensic directory in the above.  The '.$$' appends the process ID of the current shell to the file name (not strictly needed, but I put it there to identify the decoded files). Now you should have the decoded files in your forensic directory. If readdbx or readpst reported errors while decoding, check whether the decoded files are empty. If they are, double check whether the original files are empty as well.
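
The check for empty decoded files is easy to script. A minimal sketch follows; check_decoded is a hypothetical helper name, and the directory you pass in is whatever forensic directory you used above.

```shell
# check_decoded DIR: list zero-length files under DIR.
# Returns non-zero if any are found.
check_decoded() {
    dir=$1
    empties=$(find "$dir" -type f -size 0 -print)
    if [ -n "$empties" ]; then
        echo "Empty decoded files (compare against the originals):"
        printf '%s\n' "$empties"
        return 1
    fi
    echo "No empty files under $dir"
}
```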

 

The procedure for decoding .pst files is similar to the above. The only real change we need to make is to put the output file option before the .pst file, as is shown below.

 

chimp dbx# for X in *.pst; do /$pathto/readpst -o /$pathto/forensicdir/"$X.$$" "$X"; done

 

Sweet! Now we are all decoded and ready to move onto other tools.

 

5.   Viewing decoded .dbx/.pst files with Thunderbird

 

Okay, you have successfully decoded the .dbx/.pst files that you are interested in, and now you want to view them. So how do we do that? Read on, my friend...

 

This section assumes that you have Mozilla Thunderbird (my e-mail client of choice) installed on your system.  It is beyond the scope of this paper to help you install Thunderbird for your particular system, but it should be incredibly easy.  You should be able to import these decoded files into the e-mail client of your choice (in fact I tested this with Evolution and it works, and obviously the mail client in Mozilla is the same as Thunderbird).

 

A little side note: a good habit to get into is reinstalling a fresh copy of the OS on your forensic machine for every case you work.  This helps ensure that you have no cross-contamination of evidence.  At the very least, do a fresh install of your e-mail client.

 

To view the decoded mail files in Thunderbird we need to do a little prep work. Fire up Thunderbird and create a new email account that is going to be used to track your suspect mail.


Enter a bogus SMTP and POP server, etc., and name the account in a way that will make it easy for you to organize, something like “Suspect Mail”. It is also important to uncheck the “Use Global Inbox” and the “Download Messages Now” options.

 

The account name should show up in Thunderbird with the default complement of sub-folders underneath it.

 

 

Then simply copy the decoded file into your new Thunderbird “Inbox” directory.

 

chimp ~# cp -v /$pathto/decoded/files/inbox /$pathto/new/thunderbird/mail/inbox

 

Now fire up Thunderbird and the files you want to view should appear as a “folder” where you copied them.  If the converted file was non-empty, the folder you copied it into should contain one or more e-mails.
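
To cross-check what Thunderbird displays, you can count the messages in a converted file yourself. In mbox format each message begins with a separator line that starts with 'From ' (with a trailing space, not a colon), so counting those lines gives a rough message count. This is a sketch: count_messages is a hypothetical helper name, and the count will be too high if the converter left a body line starting with 'From ' unescaped.

```shell
# count_messages MBOX...: print "<count><TAB><file>" for each mbox file,
# counting the "From " separator lines that start each message.
count_messages() {
    for mbox in "$@"; do
        printf '%s\t%s\n' "$(grep -c '^From ' "$mbox")" "$mbox"
    done
}
```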

 


 

Something I have found helpful in organizing my converted and imported suspect mail is to go into the Thunderbird directory and make directories that will delineate it as the suspect's Inbox, Deleted, etc. mail.

 

If you are using Evolution you need to select “File” from the menu and then import.  From there select auto for the import format and where you want to import the file.

 

6.   Converting to HTML with MHonArc

 

Once you have your mbox format files, you may want to archive them in an easily searchable format, or strip off attachments in one fell swoop.  One handy way of doing this is to use the MHonArc program from http://www.mhonarc.org/.  This is also a very handy way to archive your *own* old e-mail so you can get your hands on old addresses, attachments, etc. without clogging up your e-mail client with gigabytes of data.  Just remember to back up your mbox format files every time you upgrade a server or something and you should be fine.  I personally have years' worth of my own mbox files backed up this way, and it's very handy.

 

Download and install the package, and read the internal instructions.  In particular, you may choose to write a script to do all the conversion and so on.   My script looks like the following:

 

#!/bin/sh -f
#
./MHonArc-2.6.10/mhonarc yourfile.mbx -add -attachmentdir /path/to/attachments \
 -folrefs -idxfname index.html -main -multipg -outdir /path/to/htmlemail -reverse

 

This script will open up your file ‘yourfile.mbx’ which is your mbox formatted file, and then copy all the attachments to /path/to/attachments and all the e-mails themselves in a threaded format to /path/to/htmlemail.

 

At this point, you can open up either the threaded or date-sorted HTML index files, or you can grep for interesting information using a command such as

 

chimp dbx# grep badstuff /path/to/htmlemail/*

 

to find all e-mails with the word ‘badstuff’ in them. You should be aware of case sensitivity for your particular grep program, and obviously also consider the types of keywords that are likely to match such as p0rn, pr0n, etc.  Finding a simple e-mail address, for example to cut out conversations with a particular person, is a piece of cake.
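
Several spelling variants can be searched in one case-insensitive pass. The following is a sketch using common grep options ('-r' recursive, '-i' ignore case, '-l' list matching file names, '-E' extended regular expressions); search_mail is a hypothetical helper name, and each keyword is treated as a regular expression, so escape characters such as '.' if you mean them literally.

```shell
# search_mail DIR KEYWORD...: list the files under DIR that contain any of
# the keywords, ignoring case.
search_mail() {
    dir=$1
    shift
    # Join the keywords into one alternation such as "p0rn|pr0n".
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}
    grep -r -i -l -E -- "$pattern" "$dir"
}
```

For example, 'search_mail /path/to/htmlemail p0rn pr0n' prints the name of every HTML message that contains either spelling.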

 

7.   Bonus Ideas

 

Here are some bonus ideas and tools.  Suggest some more!

7.1 Converting Eudora e-mail

 

There is a nice script to handle Eudora mail. It is available at the following site: http://www.xs4all.nl/~maryniak/eudora2unix/

 

7.2 Converting UNIX e-mail

 

Hey, you say, I’ve got UNIX e-mail, how do I analyze it?  Well, luckily for you it’s already in mbox format, so you don’t have to convert anything at all.  Just look for mail spool files.  These are sometimes stored in directories such as /var/spool/mail, /var/mail, etc.  You’ll also frequently find mbox format spools in temporary directories.  For example, if you have a bunch of e-mail that couldn’t get delivered (perhaps you were an open mail relay and wanted to see what kind of Viagra or whatnot you were relaying) you may find mbox format files awaiting delivery in /var/spool/mqueue or a similar directory.

 

7.3 Importing mbox into other e-mail clients

 

Say you have an mbox file, and you want to import it into a different e-mail client than Thunderbird and this program doesn’t allow importing, but does work as a POP3 client.  Fortunately, you can easily do this as long as you have a UNIX mail server to do it with.  All you need to do is make an account on the server, copy the mbox file over that user’s e-mail file, usually in /var/mail or /var/spool/mail and then use a POP3 client to download the mail.  All the email will be downloaded to your client as if it were brand new.

 

7.4 Using uudeview to extract attachments

 

Say you are a glutton for punishment, and you really, really want to extract attachments from the ASCII MIME-encoded text in your mbox file.  You can do this.  First cut the e-mail out using a text editor or word processor, starting at the message’s “From ” separator line and ending just before the next one, and save it as plain text.  Then download uudeview.exe from: http://www.fpx.de/fp/Software/UUDeview/.  Then simply run the program on the text file; it will find the MIME-encoded sections, convert them to binary and dump them on the filesystem.  You’ll want to suggest this option if the opposing attorney wants to verify your work, since it is very easy to explain, and makes them work a lot harder.  This is a very handy way to cut naughty pictures out of e-mail so you can insert them into your report.
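
If cutting a message out by hand sounds tedious, awk can do it for you. The sketch below prints the Nth message of an mbox file, relying on the 'From ' separator line (with a trailing space) that starts each message. extract_message is a hypothetical helper name, and a body line that the converter left starting with an unescaped 'From ' would throw the numbering off.

```shell
# extract_message MBOX N: print the Nth message (counting from 1),
# from its "From " separator line up to the line before the next one.
extract_message() {
    mbox=$1
    n=$2
    awk -v n="$n" '/^From / { count++ } count == n { print }' "$mbox"
}
```

For example, 'extract_message inbox 2 > message2.txt' saves the second message as plain text, ready to hand to uudeview. On UNIX the command-line uudeview can reportedly be run as 'uudeview -i -p /path/to/attachments message2.txt' (non-interactive, with an output directory), but check its manual page before relying on those options.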

 

7.5 Carving for .eml and using eml2mbox for conversion

 

Using the techniques described above (and after figuring out the header/footer) you could have foremost carve out .eml files (an extension used for mail by some e-mail clients, including Outlook Express) and use the eml2mbox.rb program available at http://www.broobles.com/eml2mbox/ to convert them to mbox format. This program needs the Ruby interpreter to be on your system.  Ruby should be installed on many Linux distributions by default and is easily obtainable on others (remember emerge?).

 

The website has all the documentation on how to run the script.

 

8.   Summary

 

This article showed you various ways to convert mail files to mbox format and parse them using free and open source tools.  The article should cover the most common forms of Windows based e-mail clients encountered by the forensic examiner, but is only a basic primer. Beyond the scope of this article is web-based e-mail and more advanced types of e-mail such as Novell Groupwise.  We hope the article was informative and helpful to you in your forensic endeavors.

 

The authors welcome all comments and suggestions.

 

 

 

1. It should be noted that some programs, notably readpst, will not compile correctly in the pure AMD64 Gentoo environment. If the program is compiled in a 32-bit chroot environment (or on an x86 machine) and the proper emulation libraries are installed on the AMD64 box, the binaries will function properly. A discussion of 32-bit chroot environments is beyond the scope of this article.