bash script to fix e-mail message dates on OS X Server with dovecot

Recently I moved a Cyrus mailstore on a dying XServe to a Dovecot mailstore on a new Mac mini.

I f**ked up and copied files without the -p option to preserve the file dates.

cp -p     #Doh!
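If you want to see the difference for yourself, here is a minimal sketch on throwaway files. The file names under the temporary directory are made up for the demo. It backdates a file with touch -t, copies it with and without -p, and compares modification times using the shell's -nt and -ot tests.

```shell
#!/bin/bash
# Demo on scratch files only; nothing here touches a real mailstore.
tmp=$(mktemp -d)
echo "hello" > "$tmp/msg"
touch -t 200401151200.00 "$tmp/msg"   # pretend the message arrived in 2004

cp    "$tmp/msg" "$tmp/plain_copy"    # copy gets a fresh modification time
cp -p "$tmp/msg" "$tmp/kept_copy"     # copy keeps the original modification time

# [ a -nt b ] is true when a was modified more recently than b
if [ "$tmp/plain_copy" -nt "$tmp/msg" ]; then
    plain_result="date lost"
else
    plain_result="date kept"
fi

# Neither newer nor older means the two modification times match
if [ "$tmp/kept_copy" -nt "$tmp/msg" ] || [ "$tmp/kept_copy" -ot "$tmp/msg" ]; then
    kept_result="date lost"
else
    kept_result="date kept"
fi

echo "cp: $plain_result / cp -p: $kept_result"
rm -rf "$tmp"
```

The plain copy should report a lost date and the -p copy a kept one, which is exactly the mistake described above.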

Mail clients like OS X’s Mail.app do not read the mail header for a message’s actual received date; they use the file modification date instead. Now I have hundreds of thousands of mail messages going back to 2004 that say they are from Dec 31, 2019 (Yes, I was working New Year’s Eve).

So I had to find a way to read the dates from the messages and set the modification dates on each message file. As a lazy hacker, I looked for somebody that already had the same problem. I found Steve Major’s example. It was easy to follow and helped me see the limitations of BSD on OS X. There were other example scripts for Linux but those wouldn’t have worked for OS X. Mainly, the -d option for the date command had a totally different implementation.

date -d #OS X is not Linux

Like I said, I’ve been running a server since 2004, so I was unsure if all the mail headers would include a “Date” field compliant with RFC2822. They didn’t.

I had to find the lowest common denominator of the date formats and try to get them to conform to RFC2822. So I first wrote a quick script that parsed the files and wrote all the Date fields to a file for examination. I quickly saw that 99% of the messages could be forced into the format I needed with minimal intervention.
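That survey script isn't reproduced in this post, so what follows is only a sketch of how such a pass could look. The function names list_date_headers and date_shape are hypothetical. The first prints the first "Date:" line of every message; the second collapses each line to its "shape" (digits become 9, words become A) so that the handful of distinct formats stands out once you count them.

```shell
#!/bin/bash
# Print the first "Date:" line of every message file in <maildir>/cur.
list_date_headers() {
    local f
    for f in "$1"/cur/*; do
        [ -f "$f" ] || continue          # skip directories and an unmatched glob
        grep -i -m1 '^Date: ' "$f"
    done
}

# Reduce each header to a format "shape": strip the field name,
# turn every digit into 9 and every run of letters into A.
date_shape() {
    sed -e 's/^[Dd][Aa][Tt][Ee]: *//' \
        -e 's/[0-9]/9/g' \
        -e 's/[A-Za-z][A-Za-z]*/A/g'
}

# Usage (the path is a placeholder):
#   list_date_headers /path/to/user/maildir | date_shape | sort | uniq -c | sort -rn
```

Sorting by count puts the common, well-formed shape at the top and leaves the oddballs at the bottom, where they are easy to inspect by hand.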

So I took Steve’s code and adapted it to my collection of messages. Along the way, I figured out that the BSD date command on OS X can reformat a date string as long as it has been sanitized to a known format.

date -j -f "%d %b %Y %T %z" "$datestring" "+%Y%m%d%H%M.%S"
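Here -j tells BSD date not to set the system clock, -f gives the format of the input string, and the last argument is the output format that touch -t expects. As a rough sketch, the call can be wrapped in a function; the name to_touch_stamp is hypothetical, and the GNU date -d branch is only there so the same snippet can be tried on a Linux box.

```shell
#!/bin/bash
# to_touch_stamp "18 Jan 2020 09:30:00 -0600"
# Prints that instant as YYYYMMDDhhmm.SS in the current time zone,
# which is the form (and the zone) that `touch -t` works with.
to_touch_stamp() {
    if date --version >/dev/null 2>&1; then
        # GNU date (Linux): -d parses a free-form date string
        LC_ALL=C date -d "$1" "+%Y%m%d%H%M.%S"
    else
        # BSD date (OS X): -j = do not set the clock, -f = input format
        LC_ALL=C date -j -f "%d %b %Y %T %z" "$1" "+%Y%m%d%H%M.%S"
    fi
}

# Pin the zone inside the subshell so the result is predictable
stamp=$(TZ=UTC; export TZ; to_touch_stamp "18 Jan 2020 09:30:00 -0600")
echo "$stamp"
```

With the zone pinned to UTC, 09:30 at -0600 comes out as 202001181530.00. Without the TZ override the stamp is in local time, which is what touch wants anyway.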

This is where I found out I didn’t have to manually convert the month names (%b) to month numbers (%m), but I had to sanitize the timezone name (%Z) to a UTC offset (%z)… and that the legacy timezone abbreviations we’re all familiar with, like “CST”, were no longer kosher with UNIX. In addition to that, I would have to force the OS X shell to read ALL locales if I wanted to convert legacy timezones automatically to UTC offsets. But because I pre-examined those Date fields, I noticed that only users in the US had their UTC offset missing and only had legacy TZ. All other countries are SO used to handling UTC offsets, it was a given.
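The timezone cleanup can also be tried on its own before letting it loose on a mailstore. Below is a minimal sketch of that mapping as a standalone function. The name legacy_tz_to_offset, the UTC and UT entries, and the -0600 default are assumptions for illustration, not part of the script that follows.

```shell
#!/bin/bash
# legacy_tz_to_offset <zone> [fallback]
# Numeric offsets pass through untouched; US abbreviations are mapped;
# anything unknown falls back to the second argument (assumed -0600 here).
legacy_tz_to_offset() {
    local tz=$1 fallback=${2:--0600}
    tz=${tz#\(}          # drop a leading "(" as in "(CST)"
    tz=${tz%\)}          # drop a trailing ")"
    case $tz in
        [+-][0-9][0-9][0-9][0-9]) echo "$tz" ;;
        GMT|UTC|UT)  echo "+0000" ;;
        EDT)         echo "-0400" ;;
        EST|CDT)     echo "-0500" ;;
        CST|MDT)     echo "-0600" ;;
        MST|PDT)     echo "-0700" ;;
        PST|AKDT)    echo "-0800" ;;
        AKST|HDT)    echo "-0900" ;;
        HST)         echo "-1000" ;;
        *)           echo "$fallback" ;;
    esac
}

legacy_tz_to_offset "(CST)"
legacy_tz_to_offset "+0900"
```

Stripping the parentheses with parameter expansion instead of a fixed character range means four-letter names such as AKDT survive intact.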

So after two weeks of programming in-between grocery shopping and flipping houses, I have finally created:

xdovecotdates.sh

#!/bin/sh
#
# xdovecotdates.sh
# Last update: Jan 18, 2020 America -0600
# Modified for OS X Server 10.1.x~10.13.x by Celia Wessen <celia@celiamania.com> ©2020
#
# Inspired by "fixdovecotdates.sh" written by Steve Major <steve@themajorshome.com> ©2015
#
# This version will run for Mac OS X's version of "date"
# DANGER WILL ROBINSON! MUST BE RUN UNDER sudo
#
# This shell script parses a dovecot mailbox 
# under the given directory, searches through each message for the "Date:"
# entry (the date and time stamped on the message by the sender), and uses touch 
# to rewrite the file's last modified date to that date so it shows up correctly
# for mail-clients that rely on the file's last-modified date.
#
######################################################################
#echo "THE SAFETY IS ON!" ; exit # Don't run while editing
#
# Date string errors will be written to a log so you can dissect your formatting problems
ERR_LOG="/tmp/xdovecotdates_error.log"
# Set your local UTC offset as a default
LOCALOFFSET="-0600"
######################################################################

# Here we set Bash's internal IFS variable to a newline so directory and file names containing spaces are not split into separate words.
IFS=$'\n'
echo "\n IFS is SET \n"

didrun="0"		# Flag - set to 1 if any timestamps were touched. YOU DO NOT TOUCH
haderrors="0"	# Flag - set to 1 if any errors occurred. YOU DO NOT TOUCH

STARTDIR=$1		# User mail directory

# Was the command given an input directory?
if [ "$STARTDIR" = "" ] || [ "$STARTDIR" = "?" ] || [[ "$STARTDIR" = -* ]]; then
	echo "\n Usage: xdovecotdates.sh [UserMailDirectory] to parse. \n"
	exit
fi
# Are there any messages in the input directory?
if [[ ! "$(ls -A "$STARTDIR" 2> /dev/null)" || ! "$(ls -A "$STARTDIR/cur" 2> /dev/null)" ]]; then
	echo "\n There are no messages in the directory: $STARTDIR \n"
	exit
fi

# Batch processing begins!
for f in "$STARTDIR"/cur/*
do
	timestamp="197001010000.00"			# Reset timestamp in case last one had errors
	
	echo "Processing "$f
	# Get the date string
	datestring=`grep -i -m1 "^Date: " "$f"`	# Find FIRST occurrence of "Date: ", case-insensitive
	##echo "Parsed Date String: "$datestring

	# Check if Date String includes RFC2822 Day of Week with a comma (,)
	# Then get the timezone offset from correct field
	# Currently there is no check for Day of Week format non-compliant with RFC2822
	if [[ $datestring == *","* ]]; then
		# RFC2822 Day of Week exists in the string
		offset=`echo $datestring | awk '{print $7}'`
		##echo "Date before Sanitizing: "$datestring
		datestring=`echo $datestring | awk '{print $3" "$4" "$5" "$6}'`
		##echo "Date after Sanitizing: "$datestring
	else
		# RFC2822 Day of Week DOES NOT exist in string
		offset=`echo $datestring | awk '{print $6}'`
		##echo "Date before Sanitizing: "$datestring
		datestring=`echo $datestring | awk '{print $2" "$3" "$4" "$5}'`
		##echo "Date after Sanitizing: "$datestring

	fi
	##echo "Offset Parsed: "$offset

	# If the offset is in legacy TZ format, put in +/- UTC offset format.
	# Currently only compatible with US timezones.
	if [[ "${offset:0:1}" != "+" && "${offset:0:1}" != "-" ]]; then
		##echo "Not a UTC offset. "$offset" may be a Legacy Timezone."
		# Remove parentheses (keeps the first three letters of the zone name)
		if [[ "${offset:0:1}" == "(" ]]; then
			offset=`echo $offset | cut -c2-4`
		fi
		# Probably somebody can make a LEGACY_TZ file in the future
		case $offset in
			GMT)
			offset="+0000";;
			EDT)
			offset="-0400";;
			EST | CDT)
			offset="-0500";;
			CST | MDT)
			offset="-0600";;
			MST | PDT)
			offset="-0700";;
			PST | AKD | AKDT)
			offset="-0800";;
			HDT | AKS | AKST)
			offset="-0900";;
			HST)
			offset="-1000";;
			*)
			offset="$LOCALOFFSET";;
		esac 
	##echo "Offset Corrected: "$offset
	fi

	# If Time is missing Seconds, add 00 seconds
	timestring=`echo $datestring|cut -d " " -f4`
	if [[ "${#timestring}" -le "5" ]]; then
		datestring=$datestring":00"
		##echo "Added 00 seconds: "$datestring
	fi

	# Create a timestamp from sanitized Date String
	# Only touch the file if date could parse the string and printed a timestamp
	if timestamp=`date -j -f "%d %b %Y %T %z" "$datestring $offset" "+%Y%m%d%H%M.%S" 2> /dev/null` && [[ -n "$timestamp" ]]; then
		##echo "Message Timestamp: "$timestamp 
		##echo "test-touch -t "$timestamp $f
		touch -t "$timestamp" "$f"
		didrun="1"
	else
		echo "ERROR: Date string format mismatch - "$datestring	$offset					#Write error to stdout
		##echo "ERROR: Date string format mismatch - $datestring\n$f" >> "$ERR_LOG"	#Write error to error log
		echo "$f" >> "$ERR_LOG"															#Write filenames ONLY to logfile
		haderrors="1"
	fi
done

# Check if any messages dates were changed
if [[ $didrun == "1" ]]; then
	touch "$STARTDIR/cur"
fi

echo "..."
# Remove the index cache so dovecot rebuilds it with the new dates
rm -f "$STARTDIR/dovecot.index.cache"
echo "Cache has been reset."
echo "......"
unset IFS
echo "IFS is UNSET"
if [[ $haderrors == "1" ]]; then
	echo "\n There were some errors. See "$ERR_LOG" \n"
else
	echo "\n There were no errors. \n"
fi
echo "\n Dates should be fixed now... don't sue me.\n"

Posted in Apple, English.

