Ruby on Rails


Over a year ago I wrote the wildly popular Encrypting Lots of Sensitive Data with Ruby (on Rails). At the end I said:

Clearly, this screams for a plugin; watch this space.

Well, it took a while and it turned out to be a gem, but Strongbox has arrived.

First a recap:

You have a web application and you need to encrypt the data you receive from your users. The most common form of encryption is symmetric-key encryption, where one password is used for both encryption and decryption. This works very well, but it means that everyone who enters data needs to know the password and everyone who knows the password can decrypt the data.

Enter public-key cryptography, which uses one password (key) to encrypt and a different key to decrypt. This solves the problem: make the encryption password, the public key, available to your application, and keep the decryption password, the private key, well, private. Users don’t need to know or care how; they fill out a form and the data gets encrypted. One small problem: size. With a 2048 bit key, the most you can practically encrypt using this method is 245 bytes. Good enough for the launch codes, but not so for driving directions to the buried treasure.

No problem. If we have larger data, we simply combine the two. We generate a random password and use it to symmetrically encrypt the data. We then use the public key to encrypt the random password. To get the data back, the private key is used to decrypt the random password, which is in turn used to decrypt the original data.

Got it? Good. Strongbox takes the above three paragraphs and reduces them to this:

class User < ActiveRecord::Base
  encrypt_with_public_key :secret,
                          :key_pair => 'path/to/keypair.pem'
end

>> @user = User.new
>> @user.secret = 'Ssssh'
>> @user.secret  # => "*encrypted*"
>> @user.secret.decrypt 'letmein' # => "Ssssh"

OK, it’s slightly more complex. The column “secret” needs to exist in the database and be type “binary” (more on this in a bit). In addition, because we are using symmetric encryption (the default), we need two binary columns, “secret_key” and “secret_iv”, to store the generated symmetric key and initialization vector (IV) (which you can think of as a second key, but it’s not) once they are encrypted with the public key.

If you are certain that the data you are encrypting won’t be larger than 245 bytes, you can use the following:

class User < ActiveRecord::Base
  encrypt_with_public_key :secret,
                          :key_pair => 'path/to/keypair.pem',
                          :symmetric => :never
  validates_length_of :secret, :maximum => 245
end

This skips the symmetric encryption, is faster, and you only need the binary “secret” column.

You’ll also need to generate a key pair. Be sure to choose a strong pass phrase, as this is the one that will decrypt everything (as always, I suggest using Diceware).


% openssl genrsa -des3 -out key_pair.pem 2048
Generating RSA private key, 2048 bit long modulus
......+++
.+++
e is 65537 (0x10001)
Enter pass phrase for key_pair.pem:
Verifying - Enter pass phrase for key_pair.pem:

If you aren’t going to be decrypting data on a regular basis you might want to deploy just the public key. Extract it:


mv key_pair.pem private.pem
openssl rsa -in private.pem -out public.pem -outform PEM -pubout
Enter pass phrase for private.pem:
writing RSA key

And change your model:

class User < ActiveRecord::Base
  encrypt_with_public_key :secret,
                          :public_key => 'path/to/public.pem',
                          :private_key => 'path/to/private.pem'
end

You could then have a rake/Capistrano task to deploy and remove the private key as needed. Or you could limit its use to a separate, non-public server.

As noted above you want your database column(s) to be binary. If your database does not have a binary type you can add the :base64 option:

class User < ActiveRecord::Base
  encrypt_with_public_key :secret,
                          :key_pair => 'path/to/keypair.pem',
                          :base64 => true
end

This will convert the binary data to text using Base64. You must, must make your column type “text”. Base64 increases the length of the data it encodes to approximately 137% of the original. Type “string” is typically limited to 255 characters, and 245 * 1.37 = 335.65 bytes. If you use a “string” column and encrypt anything greater than 186 bytes your data will be lost.
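To make the arithmetic concrete, here is a small sketch using Ruby’s standard Base64 module. It assumes the encoding is done with Base64.encode64 (which also adds a newline after every 60 characters of output) and that a “string” column holds 255 characters; both are assumptions for illustration, so check your own setup. The 256 byte case is included because anything encrypted directly with a 2048 bit RSA key comes out exactly as long as the key.

```ruby
require 'base64'

# Length of the Base64 text produced for a given number of binary bytes.
# Base64.encode64 inserts a newline after every 60 output characters.
def base64_length(byte_count)
  Base64.encode64('x' * byte_count).length
end

STRING_COLUMN_LIMIT = 255 # assumption: a typical "string" column

[186, 187, 256].each do |bytes|
  length = base64_length(bytes)
  verdict = length <= STRING_COLUMN_LIMIT ? 'fits' : 'too long'
  puts "#{bytes} bytes -> #{length} characters (#{verdict})"
end
# 186 bytes -> 253 characters (fits)
# 187 bytes -> 257 characters (too long)
# 256 bytes -> 350 characters (too long)
```

In other words, a value encrypted directly with the public key never fits in a “string” column once it is Base64 encoded, which is one more reason to use “text”.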

Finally, there are two additional options for tweaking the encryption settings that you are unlikely to need:

“:symmetric_cipher” lets you change the algorithm that’s used for symmetric encryption. The default is 256 bit Advanced Encryption Standard (AES) using Cipher Block Chaining (CBC) (‘aes-256-cbc’ in OpenSSL terms). AES has been approved by the NSA for top secret information, so it’s probably good enough, but Blowfish (‘bf-cbc’) is known to work as well. Other ciphers in CBC mode should also work, but have not been tested by me. (Note that not all ciphers may be supported by your version of OpenSSL; “openssl list-cipher-commands”, or “openssl list -cipher-commands” on newer releases, will provide a list.)

“:padding” allows you to change the method used to pad data encrypted with the public key. Unless you are working with legacy data, you shouldn’t need to change this. The default is “RSA_PKCS1_PADDING”, see the code if you need other options.
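If you do plan to change “:symmetric_cipher”, you can also ask Ruby’s OpenSSL binding whether it recognizes a cipher name before relying on it. This is only a rough sketch: cipher_supported? is a helper made up for this example, not part of Strongbox, and some OpenSSL builds may accept a name here but still refuse to use the cipher later.

```ruby
require 'openssl'

# Hypothetical helper: true if this Ruby/OpenSSL build recognizes the name.
def cipher_supported?(name)
  OpenSSL::Cipher.new(name)
  true
rescue StandardError
  # Unknown or unavailable cipher names raise an error here.
  false
end

%w[aes-256-cbc bf-cbc].each do |name|
  status = cipher_supported?(name) ? 'available' : 'not available'
  puts "#{name}: #{status}"
end
```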

Disclaimer

I am not a security expert. This software uses an off-the-shelf encryption tool, namely OpenSSL, that has been well tested, but that is not a guarantee that this implementation doesn’t have weaknesses. Be sure you understand what Strongbox does, and review it for your application. A few things to keep in mind:

  • Strongbox encrypts the data as it is saved, but no sooner. Be sure to use HTTPS for submitting the forms (and decrypting data!).
  • If an attacker gains entry to your system the encryption should protect your data. However, they might be able to hack your code to intercept new data or, much worse, your private key password. Protect your server.
  • When decrypting make sure your data isn’t cached.

And test, test, test. If there is a problem with how your data is encrypted, there is no getting it back.

One concern I have is garbage collection. If you decrypt something into a variable which then goes out of scope, how long does it hang around in memory? Can you force it out? I haven’t found much information on this; if there are any Ruby GC experts out there, share your knowledge!

I am always open to suggestions and improvement, but, to quote the License:

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY
KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

So there!

Finally, I’d like to give a shout-out to thoughtbot. While this software has existed for many years, the final form of the gem was greatly inspired by Paperclip. It’s a nice example of an approach to adding complex features to an ActiveRecord attribute and how to test them. In addition, the gem sat unpublished for nearly a year because it needed test coverage and my testing was weak. Then I took thoughtbot’s Advanced Ruby on Rails class, which really helped me get my head around testing and TDD, and got this moving again. If you know Rails, but need to improve your processes, I highly recommend this class.

So, you have a column in your database you can’t update after the record is created. Not don’t want to update, but can’t. Specifically, you might have a column that is protected by a trigger, which will cause an error if that column is included in an update. How do you prevent ActiveRecord from trying to update that column?

Prior to Rails 2.0, ActiveRecord always generated an SQL UPDATE statement that included all of the attributes in the model, even if they hadn’t changed.

product = Product.find(:first)
=> #<Product:0x23a7d00 @attributes={"name"=>"Product", "description"=>"Lorem ipsum",  "price"=>"9.99", "sku" => "000001"}>
# Inflation
product.price += 1.00.to_d # You're not using floats for prices, are you?
product.save

=> UPDATE products SET `price` = '10.99', `available` = 1, `description` = 'Lorem ipsum',  `name` = 'Product', `sku` = '000001' WHERE `id` = 1

If “sku” happens to be read-only, the update will fail, and so will your app.

The right way to fix this is to upgrade to Rails 2.x. Starting with 2.0 you can use attr_readonly, which (silently) removes the attribute from the UPDATE statement.

attr_readonly :sku
product = Product.find(:first)
=> #<Product:0x23a7d00 @attributes={"name"=>"Product", "description"=>"Lorem ipsum",  "price"=>"9.99", "sku" => "000001"}>
product.price += 1.00.to_d
product.save

=> UPDATE products SET `price` = '10.99', `available` = 1, `description` = 'Lorem ipsum',  `name` = 'Product' WHERE `id` = 1

And, starting with 2.1, ActiveRecord only updates attributes that have been changed. As long as you don’t change the value of an attribute, it won’t be included in the UPDATE statement.

product = Product.find(:first)
=> #<Product:0x23a7d00 @attributes={"name"=>"Product", "description"=>"Lorem ipsum",  "price"=>"9.99", "sku" => "000001"}>
product.price  += 1.00.to_d
product.save

=> UPDATE `products` SET `price` = '10.99'  WHERE `id` = 1

(Obviously, it’s better to explicitly mark an attribute as read-only than to depend on this behavior.)

But what if you are working with a pre-2.x version of Rails? As I said above, ActiveRecord generates the UPDATE statement based on the attributes in the model. The trick, or should I say ugly hack, is to load the record with only the fields you want to update using :select. This way, when the UPDATE is generated it will only include those attributes that were loaded into the record.

product = Product.find(:first, :select => 'id, name, price')
=> #<Product:0x23a7d00 @attributes={"name"=>"Product", "price"=>"9.99", "id"=>"1"}>
product.price  += 1.00.to_d
product.save

=> UPDATE `products` SET `price` = '10.99',   `name` = 'Product' WHERE `id` = 1

When doing this, you need to include the “id” column (or whatever your primary key is) in the select. Also note that while this will work with find_by_ methods, find_or_initialize_by_ methods do not take the :select option.
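If it helps to see why the hack works, here is a toy stand-in written in plain Ruby. It is not ActiveRecord’s code; ToyRecord and update_sql are names invented for this sketch, and the only behavior it imitates is “build the UPDATE from whatever attributes were loaded”.

```ruby
# Toy stand-in for a pre-2.0 style record (hypothetical, not ActiveRecord):
# the UPDATE statement is built from every attribute that was loaded.
class ToyRecord
  attr_reader :attributes

  def initialize(table, attributes)
    @table = table
    @attributes = attributes
  end

  def update_sql
    # The primary key must have been loaded, or there is no WHERE clause.
    id = @attributes.fetch('id')
    columns = @attributes.reject { |name, _value| name == 'id' }
    assignments = columns.map { |name, value| "`#{name}` = '#{value}'" }
    "UPDATE `#{@table}` SET #{assignments.join(', ')} WHERE `id` = #{id}"
  end
end

# Loaded with every column: the read-only sku ends up in the UPDATE.
full = ToyRecord.new('products', { 'id' => 1, 'name' => 'Product',
                                   'price' => '9.99', 'sku' => '000001' })

# Loaded with only id, name and price (what :select does): no sku.
partial = ToyRecord.new('products', { 'id' => 1, 'name' => 'Product',
                                      'price' => '9.99' })
partial.attributes['price'] = '10.99'

puts full.update_sql
puts partial.update_sql
# UPDATE `products` SET `name` = 'Product', `price` = '9.99', `sku` = '000001' WHERE `id` = 1
# UPDATE `products` SET `name` = 'Product', `price` = '10.99' WHERE `id` = 1
```

The fetch call also shows why the primary key has to be part of the select: without it there is nothing to build the WHERE clause from.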

Yup, it’s ugly, but it does, in fact, work.

Rick Olson’s attachment_fu is a great plugin for attaching files to Rails models. It’s a rewrite of his acts_as_attachment plugin. While it can handle any kind of file data, it is most commonly used for attaching images; as a result, attachment_fu handles automatic resizing of images and creation of thumbnails using RMagick, minimagick, or ImageScience.

For example:

class ProductImage  < ActiveRecord::Base
  belongs_to :product
  has_attachment :content_type => :image,
                 :storage => :file_system,
                 :path_prefix => '/public/images/products/',
                 :resize_to => '300',
                 :thumbnails => {:thumb => '75x75' }

  validates_as_attachment
end

The above will take an image, resize it to 300 pixels wide (automatically adjusting the height to preserve the original image’s aspect ratio) and to 75 by 75 pixels for a thumbnail, and save the resulting images. Combined with a Product model that has_one :image, or has_many :images, and the right form, you can easily manage your product images.

However, an image with both a fixed width and a fixed height, like our thumbnail, can be a problem. If the original and resized images do not have the same aspect ratio, the resized image will be distorted. In this case, if the original is not square, our thumbnail will look squished in whichever dimension was longer originally. This is not a problem for the main image because we let the height be calculated automatically.

Fortunately, there is a simple trick that allows us to override the method attachment_fu uses to resize images and manipulate them ourselves. Add the following to the ProductImage model:

  protected

  # Override image resizing method
  def resize_image(img, size)
    # resize_image takes size in a number of formats; we just want
    # Strings in the form of "crop: WxH"
    if (size.is_a?(String) && size =~ /^crop: (\d*)x(\d*)/i) ||
        (size.is_a?(Array) && size.first.is_a?(String) &&
          size.first =~ /^crop: (\d*)x(\d*)/i)
      img.crop_resized!($1.to_i, $2.to_i)
      # We need to save the resized image in the same way the
      # original does.
      self.temp_path = write_to_temp_file(img.to_blob)
    else
      super # Otherwise let attachment_fu handle it
    end
  end

and change the thumbnail size to:


:thumbnails => {:thumb => 'crop: 75x75' }

Now, if the image size starts with ‘crop: ’, the image will be resized and then cropped to fit. Otherwise, it’s passed on to attachment_fu and handled normally. I’m using the RMagick crop_resized! method, which resizes the image using the smaller dimension and then crops the larger one to fit. If you are using minimagick or ImageScience you may need to fiddle a bit with the code. Obviously, you can extend this approach to manipulate the image any way you see fit. For example, you could automatically put a border on the images:

  def resize_image(img, size)
    # Add a 2x2 red border and pass the image to attachment_fu
    img.border!(2,2,'red')
    super
  end

Or blur them:

  def resize_image(img, size)
    img = img.blur_image
    super # Pass the blurred image to attachment_fu
  end

Or any other weirdness your heart desires. Have fun!
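For the curious, the “resize, then crop” idea behind the thumbnail trick boils down to a little arithmetic. The sketch below is plain Ruby with no image library involved; crop_resized_geometry is a name made up for this example, and the centered crop is my assumption about the sensible default rather than a statement of what any particular library does.

```ruby
# Geometry for "resize to fill, then crop" (hypothetical helper).
# Returns the scaled size and the offsets of a centered crop window.
def crop_resized_geometry(width, height, target_w, target_h)
  # Scale by the larger ratio so BOTH dimensions cover the target.
  scale = [target_w.to_f / width, target_h.to_f / height].max
  scaled_w = (width * scale).round
  scaled_h = (height * scale).round
  {
    :scaled => [scaled_w, scaled_h],
    # Assumption: crop equally from both sides of the longer dimension.
    :offset => [(scaled_w - target_w) / 2, (scaled_h - target_h) / 2]
  }
end

landscape = crop_resized_geometry(400, 300, 75, 75)
portrait  = crop_resized_geometry(300, 400, 75, 75)

p landscape # scaled to 100x75, crop starts 12 pixels in from the left
p portrait  # scaled to 75x100, crop starts 12 pixels down from the top
```

Nothing gets squished because the scale factor is the same in both directions; the extra pixels are simply cut off.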

Previously I wrote about how to use public key encryption to automatically encrypt data using Ruby (and thus Rails). Because this method can encrypt data without a password, it’s very useful for securing information received from a form, without the person entering the form having to do anything special. However, public key encryption has limits: the amount of data you can encrypt with this method is limited by the key size you use, and after a point increasing the key size isn’t a practical option. We solve this problem using a combination of public key encryption and symmetric-key encryption.

First a little theory; and when I say “a little” I mean it. Under the hood, encryption is a headache-inducing branch of mathematics. If you want the literal truth, there is plenty of good reading out there.

Symmetric-key cryptography is what people tend to think of when, and if, they think of encryption. A password is used to encrypt some information, and that same password must be entered to retrieve the information. Under the hood is an algorithm, or cipher, which, simply put, is a mathematical function that transforms the data into something obscure and then back again. For our purpose we need a block cipher.
A block cipher takes a small chunk of data, typically 128 or 256 bits, and encrypts it. There are many, but in this example we’ll use the Advanced Encryption Standard, which is the de facto standard.

(There also exist stream ciphers which work on streams of data, encrypting a phone call for example, but that trade security for speed and can be difficult to use correctly.)

We also need a little glue. Because block ciphers operate on small chunks of data, they need to be applied again and again. However, given the same input data the cipher will always produce the same encrypted output; any redundancies in the input will be exposed as redundancies in the output and make it vulnerable to a number of attacks. To avoid this we use a mode of operation called Cipher Block Chaining. CBC uses data from one block to further obfuscate the data in the next, effectively hiding any redundancies.
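You can see the redundancy problem for yourself with a few lines of Ruby. The sketch below encrypts two identical 16 byte blocks, first with no chaining at all (‘aes-256-ecb’, which applies the cipher to each block independently) and then with CBC. The helper name encrypt_with is made up for this example, and it uses OpenSSL::Cipher.new, the spelling current versions of Ruby expect.

```ruby
require 'openssl'

# Hypothetical helper: encrypt data with the named cipher and given key.
def encrypt_with(cipher_name, key, data)
  cipher = OpenSSL::Cipher.new(cipher_name)
  cipher.encrypt
  cipher.key = key
  # ECB has no IV (iv_len is 0); CBC needs a random one.
  cipher.iv = OpenSSL::Random.random_bytes(cipher.iv_len) if cipher.iv_len > 0
  cipher.update(data) + cipher.final
end

key = OpenSSL::Random.random_bytes(32) # 256 bit key
plain = 'A' * 32                       # two identical 16 byte blocks

ecb = encrypt_with('aes-256-ecb', key, plain)
cbc = encrypt_with('aes-256-cbc', key, plain)

# Compare the first and second 16 byte blocks of each ciphertext.
ecb_repeats = ecb[0, 16] == ecb[16, 16]
cbc_repeats = cbc[0, 16] == cbc[16, 16]

puts "Without chaining, identical blocks repeat: #{ecb_repeats}"
puts "With CBC, identical blocks repeat: #{cbc_repeats}"
```

Without chaining, the repetition in the input shows straight through in the output; with CBC the two blocks come out different.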

While simple and secure, using symmetric-key cryptography can be problematic; everyone needs to know the password to encrypt data, and everyone who has the password (or as the pros say, shared secret key) can decrypt data. This works well if you have a small set of people who need to know the password and a secure way to distribute it (over drinks in a dark corner of a seedy bar is poetic, if not necessarily secure), but in the case of a web site with hundreds or thousands of people entering data, it’s not practical.

Public-key cryptography can be thought of as symmetric-key encryption with two passwords, or keys, called a key pair: one, the public key, that encrypts data and another, the private key, that decrypts it. Because the public key cannot be used to decrypt the data it encrypts, it can be safely given out or installed on a web site, allowing anyone to encrypt data to be sent to the owner of the key. The private key is kept safe and is typically symmetrically encrypted with an additional password.

The solution is actually quite simple. We generate a random password and use that for the symmetric-key encryption. We then encrypt the random password using the public key, and store both the encrypted password, and encrypted data. When we need to get at the data, we use the private key, and its password to decrypt the random password which, in turn, is used to decrypt the data.

Well, it’s almost that simple. In order to randomize the data, the CBC glue requires an initialization vector (IV). This article has a good explanation of why, but for our purposes we can just think of it as a second random password we need to encrypt and save.

OK, enough talk, let’s encrypt some text:

# OpenSSL provides both symmetric and public key encryption
require 'openssl'

# Encrypt with 256 bit AES with CBC
cipher = OpenSSL::Cipher::Cipher.new('aes-256-cbc')
cipher.encrypt # We are encrypting
# The OpenSSL library will generate random keys and IVs
cipher.key = random_key = cipher.random_key
cipher.iv = random_iv = cipher.random_iv

encrypted_data = cipher.update(plain_data) # Encrypt the data.
encrypted_data << cipher.final

At this point we could just save encrypted_data in the database and it would be well protected. So well in fact that we couldn’t get it back. To do that we’re going to need to save the random password and IV.

Generate a key pair. Be sure to choose a good password as this is the one that will decrypt everything.


% openssl genrsa -des3 -out private.pem 2048
Generating RSA private key, 2048 bit long modulus
......+++
.+++
e is 65537 (0x10001)
Enter pass phrase for private.pem:
Verifying - Enter pass phrase for private.pem:

Now extract the public key:


openssl rsa -in private.pem -out public.pem -outform PEM -pubout
Enter pass phrase for private.pem:
writing RSA key

See my previous article for more details on what we’re doing here.

Now we can use the public key to encrypt the random key and IV:

public_key_file = 'public.pem'

public_key = OpenSSL::PKey::RSA.new(File.read(public_key_file))

encrypted_key = public_key.public_encrypt(random_key)
encrypted_iv = public_key.public_encrypt(random_iv)

Now if we store all three pieces, encrypted_key, encrypted_iv, and encrypted_data we have successfully encrypted our original data.

Of course we’ll want to get that data back, and to do so we reverse the process:

require 'openssl'

private_key_file = 'private.pem'

private_key =
   OpenSSL::PKey::RSA.new(File.read(private_key_file), password)

cipher = OpenSSL::Cipher::Cipher.new('aes-256-cbc')
cipher.decrypt
cipher.key = private_key.private_decrypt(encrypted_key)
cipher.iv = private_key.private_decrypt(encrypted_iv)

decrypted_data = cipher.update(encrypted_data)
decrypted_data << cipher.final

password is the password you used when generating the key-pair.
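If you’d like to try the whole round trip without creating any files, here is a self-contained sketch. It makes a few assumptions that differ from the listings above: the key pair is a throwaway generated in memory (so there is no pass phrase), the sample text is invented, and the cipher is created with OpenSSL::Cipher.new, which is what current versions of Ruby expect.

```ruby
require 'openssl'

# Assumption: a throwaway key pair generated in memory, for demo purposes.
key_pair = OpenSSL::PKey::RSA.new(2048)
public_key = key_pair.public_key

# Sample text, well over the 245 byte limit of the public key alone.
plain_data = 'Driving directions to the buried treasure. ' * 20

# Encrypt the data with a random key and IV...
cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
random_key = cipher.random_key
random_iv = cipher.random_iv
encrypted_data = cipher.update(plain_data) + cipher.final

# ...then encrypt the key and IV with the public key.
encrypted_key = public_key.public_encrypt(random_key)
encrypted_iv = public_key.public_encrypt(random_iv)

# Reverse the process using the private half of the key pair.
decipher = OpenSSL::Cipher.new('aes-256-cbc')
decipher.decrypt
decipher.key = key_pair.private_decrypt(encrypted_key)
decipher.iv = key_pair.private_decrypt(encrypted_iv)
decrypted_data = decipher.update(encrypted_data) + decipher.final

puts "Original size:  #{plain_data.bytesize} bytes"
puts "Round trip OK:  #{decrypted_data.bytes == plain_data.bytes}"
```

The comparison is done on bytes because the decrypted string comes back tagged as binary rather than text.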

Now let’s put it all together in an Active Record model:

class Sensitive < ActiveRecord::Base

  attr_accessor :plain_data
  attr_protected :encrypted_data, :encrypted_key, :encrypted_iv
  before_save :encrypt_sensitive

  def decrypt_sensitive(password)
    if self.encrypted_data
      private_key = OpenSSL::PKey::RSA.new(File.read(APP_CONFIG['private_key']),password)
      cipher = OpenSSL::Cipher::Cipher.new('aes-256-cbc')
      cipher.decrypt
      cipher.key = private_key.private_decrypt(self.encrypted_key)
      cipher.iv = private_key.private_decrypt(self.encrypted_iv)

      decrypted_data = cipher.update(self.encrypted_data)
      decrypted_data << cipher.final
    else
      ''
    end
  end

  def clear_sensitive
    self.encrypted_data = self.encrypted_key = self.encrypted_iv = nil
  end

  private

  def encrypt_sensitive
    if !self.plain_data.blank?
      public_key = OpenSSL::PKey::RSA.new(File.read(APP_CONFIG['public_key']))
      cipher = OpenSSL::Cipher::Cipher.new('aes-256-cbc')
      cipher.encrypt
      cipher.key = random_key = cipher.random_key
      cipher.iv = random_iv = cipher.random_iv

      self.encrypted_data = cipher.update(self.plain_data)
      self.encrypted_data << cipher.final

      self.encrypted_key =  public_key.public_encrypt(random_key)
      self.encrypted_iv = public_key.public_encrypt(random_iv)
    end
  end
end

When creating or updating your model you don’t have to do anything; if “plain_data” is present it will be automatically encrypted. When you want to view the plain text you call “@record.decrypt_sensitive(‘passwd’)”; that could be done with a little AJAX that prompts for a password and populates the “plain_data” field. The encrypted data is only updated when “plain_data” is present. This is done so that the record can be updated without decrypting (and re-encrypting) encrypted data (handy in an application where not everyone has access to the sensitive data). To actually clear the encrypted data call “@record.clear_sensitive” and then save.

Setting up the APP_CONFIG hash is left as an exercise for the reader.

Clearly, this screams for a plugin; watch this space.

In Encrypting Sensitive Data with Perl I wrote about how to use public key encryption to automatically and securely encrypt information with Perl. This allows you to encrypt things like credit card numbers, bank routing information, or that winning PowerBall number in an unattended fashion. Typically, you would use this in a situation where a user needs to enter sensitive information into a form, which needs to be stored in a secure manner. We can do this with Ruby (on Rails) as well, and it’s even easier.

First we need to generate a key pair. This creates two keys, a public key which will only be used to encrypt data, and a private key, which will only be used to decrypt data. The private key is protected by a password known only to us. When it comes to choosing strong passwords, I suggest using Diceware. 2048 is the key size in bits. Bigger is better, but also slower; 2048 is considered a good trade-off between speed and encryption strength. We are also limited by this to encrypting at most 2048 bits; more on this below.


% openssl genrsa -des3 -out private.pem 2048
Generating RSA private key, 2048 bit long modulus
......+++
.+++
e is 65537 (0x10001)
Enter pass phrase for private.pem:
Verifying - Enter pass phrase for private.pem:

Then we extract the public key:


openssl rsa -in private.pem -out public.pem -outform PEM -pubout
Enter pass phrase for private.pem:
writing RSA key

Once we have the keys, we can encrypt data using the following:

#!/usr/bin/env ruby

require 'openssl'
require 'base64'

public_key_file = 'public.pem'
string = 'Hello World!'

public_key =
   OpenSSL::PKey::RSA.new(File.read(public_key_file))
encrypted_string =
   Base64.encode64(public_key.public_encrypt(string))

print encrypted_string, "\n"

Simply, public_key_file is the path to the file containing the public key, and string is the string to encrypt. We open the public key and then use public_encrypt to encrypt the string. Because the encrypted string is binary, I have converted it to text using Base64. If you are storing the encrypted string in a database that can hold binary data, you could change:

encrypted_string = Base64.encode64(public_key.public_encrypt(string))

to:

encrypted_string = public_key.public_encrypt(string)

Now that we have encrypted data, we’ll want to be able to get it back.

#!/usr/bin/env ruby

require 'openssl'
require 'base64'

private_key_file = 'private.pem'
password = 'boost facile'

encrypted_string = %Q{
qBF3gjF8iKhDh+g+TOvAzBkJA/1d2lD8RUyz2Ol+s1OpLB5aA3RA7EHm0KGL
XaP3upvJ7I5rN1yO9Qat9kyRQu9OMqAUmFvwUaiW/1NPjxnpmcFn9mhkttP9
qfO6iIfyxErUqKIxHYqavyPmivre9eEcXiBdtIK6NJJKG3WmSfIFgpZ6eBWI
wxlZg+x0fI4L2JsODMGx5Khn7CUt0bTkH6HMHwxEG24NbsmrqtC2zn8Hm/87
UyN5ZCDyJ/mtIHAjzPry6vbVPTF0QCR4lZ7uSt/W7JZ0tNgX7eQQwoPCgbqU
/uwRCwww/c407jw7YEE5Lgpx20/jyLXJwvZHxNEcxA==
}

private_key =
  OpenSSL::PKey::RSA.new(File.read(private_key_file),password)

string =
  private_key.private_decrypt(Base64.decode64(encrypted_string))

print string, "\n"

Here private_key_file is the path to the file containing the private key, password is the pass phrase that protects it, and encrypted_string is the string to decrypt. In a real application you would not want to hard-code the password; rather, you should prompt for it in some way.

Again we are using Base64 to make the encrypted string human readable. If this is not necessary, change:

string = private_key.private_decrypt(Base64.decode64(encrypted_string))

to:

string = private_key.private_decrypt(encrypted_string)

As noted above, you cannot use this method to encrypt anything larger than the key size minus 11 bytes of overhead (padding). In this case we have a 2048 bit key, which gives 256 – 11 = 245 bytes. The temptation is to increase the key size to accommodate more data, but this quickly becomes too slow to be useful. The correct way to accomplish this is to use public key encryption to encrypt a random password, which, in turn, is used to encrypt the data using symmetric-key encryption. I’ll cover this next time.
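The limit is easy to check for yourself. This sketch assumes a throwaway 2048 bit key generated in memory (so no key files or pass phrase are involved) and the default padding; it tries a string that is exactly at the limit and one that is a single byte over.

```ruby
require 'openssl'

# Assumption: a throwaway key pair generated in memory, for demo purposes.
key_pair = OpenSSL::PKey::RSA.new(2048)

key_bytes = key_pair.n.num_bytes # size of the key in bytes
max_plain = key_bytes - 11       # minus the padding overhead

# Exactly at the limit: this encrypts fine.
encrypted = key_pair.public_encrypt('x' * max_plain)

# One byte over the limit: OpenSSL refuses.
too_long_rejected = begin
  key_pair.public_encrypt('x' * (max_plain + 1))
  false
rescue OpenSSL::PKey::PKeyError
  true
end

puts "Key size: #{key_bytes} bytes, largest message: #{max_plain} bytes"
puts "Encrypted size: #{encrypted.bytesize} bytes"
puts "One byte more is rejected: #{too_long_rejected}"
```

Note that the encrypted output is always as long as the key, no matter how short the message was.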