Advanced Ruby IMAP
Previously, I worked through how to get messages from an IMAP server and work with the message headers. Now let’s look at extracting data from those messages. As before, we need to connect to the server, authenticate, and select the INBOX:
#!/usr/bin/env ruby
require 'net/imap'
imap = Net::IMAP.new('mail.example.com', ssl: true)
begin
  imap.authenticate('PLAIN', 'spike', password)
rescue
  abort 'Authentication failed'
end
imap.select('INBOX')
This time we used #select, opening the INBOX read/write so we can make changes. Let’s grab the oldest unread message:
ids = imap.search(["UNSEEN"])
id = ids.first
raw_message = imap.fetch(id,'RFC822').first.attr['RFC822']
(#fetch returns an Array even when we ask for a single message, hence the #first.)
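One catch: if nothing is unread, #search comes back empty, `ids.first` is nil, and the fetch above falls over. Here is a minimal sketch of a guard. The helper name `oldest_unseen_raw` is my own invention, and `imap` is assumed to be anything that answers #search and #fetch the way a Net::IMAP connection does:

```ruby
# Hypothetical helper (not part of Net::IMAP): returns the raw RFC 822 text
# of the oldest unseen message, or nil when there is nothing to read.
def oldest_unseen_raw(imap)
  ids = imap.search(['UNSEEN'])
  return nil if ids.empty?

  # Array() covers a nil result as well as an empty Array.
  fetched = Array(imap.fetch(ids.first, 'RFC822'))
  return nil if fetched.empty?

  fetched.first.attr['RFC822']
end
```

With that in place, `raw_message = oldest_unseen_raw(imap)` is either a String or nil, and the caller can bail out early on nil.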
‘RFC822’ says “request the entire message, in RFC 822 format, as a string.” As I noted in my last post, the Ruby IMAP library returns objects that aren’t very convenient to work with. While it’s possible to extract what we need from them, we’ll get a much cleaner interface if we use the mail gem:
require 'mail'
message = Mail.read_from_string raw_message
The mail gem gives us a much nicer object:
message.subject   # => "Subject"
message.body.to_s # => "This is a test.\n\nBob\n"
And it really shines when there are attachments:
require 'json'
message.multipart?                    # => true
message.parts.map { |p| p.mime_type } # => ["text/plain", "application/json"]
# JSON.parse needs a String, so decode the part's body first.
json = JSON.parse(message.parts[1].body.decoded)
Why would we have a message with a JSON attachment? Because we put it there! Since we have the mail gem:
require 'mail'
require 'json'
report = { bot: 1138, temp: 42, flux: 10 }
mail = Mail.new do
  from 'bot1138@example.com'
  header['X-Bot-ID'] = '1138'
  to 'bot-status-queue@gmail.com'
  subject 'Hourly update from Bot 1138'
  body 'See attached'
  add_file filename: 'status.json', content: report.to_json
end
mail.deliver!
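One detail of the round trip is worth a quick check: the report is built with symbol keys, but JSON only has strings, so the receiving side gets string keys back unless it asks otherwise. A small sketch using nothing but the standard json library (the `report` hash is the one from above, and `payload` is just my name for the text that lands in status.json):

```ruby
require 'json'

report  = { bot: 1138, temp: 42, flux: 10 }
payload = report.to_json # the text that ends up in status.json

# Default parse: keys come back as Strings.
parsed = JSON.parse(payload)
parsed['bot'] # => 1138
parsed[:bot]  # => nil

# symbolize_names: true restores the original symbol keys.
same = JSON.parse(payload, symbolize_names: true) == report # => true
```

Either form is fine; the point is to pick one and use it consistently on the receiving end.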
There is also no shortage of ways to email attachments from the command line using out-of-the-box tools, which might be more appropriate than getting Ruby running on a small device.
So yeah, that’s really the how, not so much the why. The why is that it’s a fairly easy way to build a distributed network of data sources: for example, a large number of small Internet of Things devices.
Yes, the more common approach would be to create an API and have the devices ping it. Out of the box, that may even be easier. However, this is far less infrastructure. An AWS EC2 free tier instance and a free Gmail account are all it takes. You can manage a huge volume of incoming data with no more than a cron job, and SMTP will handle network outages and server load.
One last issue we need to deal with: how do we flag messages we’ve already processed? If we’re very optimistic, the current code would do. When you open a folder read/write (which is what #select does) and you read the body of a message with #fetch (as opposed to just reading the headers or other metadata), the message is automatically marked as “Seen”. So any message we read drops out of the UNSEEN search.
The downside of this approach is that it assumes reading a message is the same as processing it. If our job dies for some reason after reading a message, that message could end up dropped on the floor.
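As an aside, the implicit “Seen” can be avoided entirely. IMAP lets you ask for the message as BODY.PEEK[] instead of RFC822; the server hands back the same text, reported under the key BODY[], and leaves the flags alone. A hedged sketch, with `peek_raw` as a made-up helper name and `imap` assumed to behave like a Net::IMAP connection:

```ruby
# Hypothetical helper: fetch the full message without marking it as Seen.
# BODY.PEEK[] is the non-flag-setting form of a body fetch (RFC 3501);
# the response reports the data under 'BODY[]', not 'BODY.PEEK[]'.
def peek_raw(imap, id)
  fetched = Array(imap.fetch(id, 'BODY.PEEK[]'))
  return nil if fetched.empty?

  fetched.first.attr['BODY[]']
end
```

That only removes the side effect, though. It still leaves us needing some way to record what has been handled.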
A safer approach is to flag messages that have been successfully processed. The simplest way to do so is to flag the message as “Deleted”:
imap.store(id, "+FLAGS", [:Deleted])
And change our search to find all undeleted messages:
ids = imap.search(['NOT','DELETED'])
This works well, and processed messages can easily be purged. However, if you want to keep a record of past messages, you probably want to use a different flag, lest the messages be permanently deleted. In that case, the “Flagged” flag will do the job:
imap.uid_store(uid, "+FLAGS", [:Flagged])
(we’ll get at the UID below) and our search becomes:
ids = imap.search(['NOT','DELETED','NOT','FLAGGED'])
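The important part is the ordering: process first, flag second, so a message that blows up mid-processing stays unflagged and gets picked up on the next run. A minimal sketch of that pattern, assuming a hypothetical helper called `each_pending` and an `imap` object that behaves like a Net::IMAP connection with a mailbox already selected read/write:

```ruby
# Hypothetical helper: hand each pending message to the block and flag it
# only if the block returns without raising. Returns the number flagged.
def each_pending(imap)
  ids = imap.search(['NOT', 'DELETED', 'NOT', 'FLAGGED'])
  return 0 if ids.empty?

  done = 0
  Array(imap.fetch(ids, ['UID', 'RFC822'])).each do |item|
    begin
      yield item.attr['RFC822']
    rescue StandardError => e
      # Leave the message unflagged so the next run retries it.
      warn "processing failed, will retry: #{e.message}"
      next
    end
    imap.uid_store(item.attr['UID'], '+FLAGS', [:Flagged])
    done += 1
  end
  done
end
```

Usage would look like `each_pending(imap) { |raw| handle(Mail.read_from_string(raw)) }`, where `handle` stands in for whatever your job actually does. Whether to retry forever or give up after a few failures is a policy choice this sketch leaves open.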
Putting it all together, our cron job runs something like:
#!/usr/bin/env ruby
require 'net/imap'
require 'mail'
require 'json'
imap = Net::IMAP.new('mail.example.com', ssl: true)
# Credentials come from the environment; a cron job has no terminal to prompt on.
begin
  imap.authenticate('PLAIN', ENV['IMAP_USER'], ENV['IMAP_PASSWORD'])
rescue
  abort 'Authentication failed'
end
imap.select('INBOX')
# You might consider filtering on subject as well.
ids = imap.search(['NOT','DELETED','NOT','FLAGGED'])
exit if ids.empty? # nothing to do; don't send an empty FETCH
imap.fetch(ids,['UID','RFC822']).each do |imap_message|
  message = Mail.read_from_string imap_message.attr['RFC822']
  attachment = message.attachments.detect {|a| a.content_type.start_with? 'application/json'}
  next if attachment.nil?
  data = JSON.parse(attachment.body.decoded) # => {"bot"=>1138, "temp"=>42, "flux"=>10}
  # Do something important with the data!
  uid = imap_message.attr['UID']
  imap.uid_store(uid, "+FLAGS", [:Flagged])
end
Even if this isn’t an approach you’d ever need, I hope it gets you thinking about how you can leverage the technologies all around you.