It’s Monday morning and the spam must flow
During the last month several new spam networks have been flooding my servers. Instead of being just a few IP addresses here and there, they’ve been using huge subnets to send their spam. My old system of banning single IP addresses just wasn’t keeping up.
Last week I started working on a new system that would be better able to track messages as they are handled by my mail servers. I run about a half-dozen mail servers for myself and for clients, and I’ve set them all up to send their log entries to a central server through syslog.
I spent some time making a new system which could re-assemble the multi-line log entries. Sendmail doesn’t log each message on a single line; the entries are almost always split over several lines. Those lines get mingled with other sendmail entries as well as with entries from other processes like imapd and spamassassin. To track log entries I built a system that pushes log lines onto a stack and then tries to determine whether it has all the lines for an entry. If an entry stays on the stack too long it is discarded, so that the stack can’t grow without bound when a line is missed and the entry can never be completed.
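Here is a rough Python sketch of that idea, not my actual code. It assumes each sendmail line carries a queue ID after the process name, and it treats an entry as complete once it has both a from= and a to= part; the class name, the regular expression, and the five-minute timeout are all made up for illustration.

```python
import re
import time

# Assumed shape of a sendmail syslog line: "... sendmail[pid]: QUEUEID: rest"
LINE_RE = re.compile(r"sendmail\[\d+\]: (\w+): (.*)")


class EntryAssembler:
    """Groups log lines by queue ID and drops groups that never complete."""

    def __init__(self, max_age=300.0):
        self.max_age = max_age  # seconds a partial entry may wait
        self.pending = {}       # queue ID -> (first_seen, [line parts])

    def feed(self, line, now=None):
        """Add one log line; return a completed entry as a list, or None."""
        now = time.time() if now is None else now
        self.expire(now)
        match = LINE_RE.search(line)
        if match is None:
            return None  # a line from imapd, spamassassin, and so on
        queue_id, rest = match.groups()
        _, parts = self.pending.setdefault(queue_id, (now, []))
        parts.append(rest)
        # Assumption: an entry is complete once it has a from= and a to= part.
        has_from = any(p.startswith("from=") for p in parts)
        has_to = any(p.startswith("to=") for p in parts)
        if has_from and has_to:
            del self.pending[queue_id]
            return parts
        return None

    def expire(self, now):
        """Discard partial entries that have waited longer than max_age."""
        stale = [q for q, (seen, _) in self.pending.items()
                 if now - seen > self.max_age]
        for q in stale:
            del self.pending[q]
```

Because stale entries are expired on every call to feed(), the pending table stays bounded even when the closing line of an entry never shows up.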
Since I wanted my log watcher to run as a daemon, it was very important that it not run away and fill up all the RAM until it crashed. I spent a lot of time hunting down memory leaks in my program and in my library, but was really struggling to find them all. I was able to get my code down to just two lines that would exhibit the memory leak, but no matter what I tried I couldn’t find where things were leaking from. When I switched development over to a computer with a more recent Linux install so that I could use valgrind, I suddenly found that I didn’t have any memory leaks! Looking at the two computers, the only real difference was the gcc version. It turns out I was battling memory leaks in the GNU Objective-C runtime which hadn’t been fixed until gcc 4.7! No wonder I was having such a hard time finding leaks in my code!
With log entries being assembled, the next step was to start writing everything to a database and then use that data to start blacklisting spammers. Another new thing I wanted to do was to find the subnet that an IP was coming from, so that I could start cutting off entire spam nets instead of only cutting off one server at a time.
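To give an idea of the subnet matching, here is a minimal Python sketch using the standard ipaddress module. It is only an illustration: the function name is made up, the subnets are reserved documentation ranges rather than real spam nets, and a real list would come out of the database.

```python
import ipaddress


def find_subnet(ip, known_subnets):
    """Return the most specific known subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in known_subnets if addr in net]
    # Prefer the longest prefix when subnets overlap.
    return max(matches, key=lambda net: net.prefixlen, default=None)


# Hypothetical blocklist built from documentation-only address ranges.
blocked = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

print(find_subnet("203.0.113.45", blocked))  # the /24 that contains it
print(find_subnet("192.0.2.1", blocked))     # None, not in any blocked subnet
```

Once one address from a subnet has been flagged, every other address in that range matches the same entry, so a single record covers the whole spam net.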
It took a few more days to get writing to the database working, along with scraping subnet data, parsing it, and recording it in the database too. By the time I got the service that bans by IP address working, it was Sunday. The new system seemed to be working great: all week while developing I had been seeing five to ten spams per minute hit my servers, and with the new system in place I was seeing one attempt every five minutes or so.
Monday morning arrived and suddenly the amount of spam went back through the roof. This wasn’t entirely unexpected. For years I’ve noticed that spam drops off on Sunday, but when Monday morning comes it goes crazy. Sundays usually get about a tenth of the spam that the other days of the week get.
Why is there so much less spam on Sunday than during the rest of the week? Do spammers think that since people won’t be reading their mail on Sunday, they should wait until everyone is back at work? Maybe they think that since people will be bored at work they’ll be more likely to click spam? Well, that doesn’t make much sense. It doesn’t matter when the spam is delivered, since it will wait in the user’s mailbox until they check it.
No, obviously what’s going on is that most spam is being sent from hijacked desktop computers. On Monday morning the weekend is over and everyone is turning their computers on. How many of them are home computers, though, and how many are company computers? I’m not sure yet, but almost all of them are in the US. Even though I’ve only had my new program running for a few days, I can see that the US is sending 10 times as much spam as the runner-up, currently Chile. More system administrators need to block outgoing port 25. None of those computers should be able to send out messages directly.
I now have a better system for tracking and logging spam, but my work isn’t done. I need to build some tools which will give me nice reports, and I need to figure out an easy way for my users to flag the things that are still slipping through.