Previously I wrote about how to use public key encryption to automatically encrypt data using Ruby (and thus Rails). Because this method can encrypt data without a password, it’s very useful for securing information received from a form, without the person filling in the form having to do anything special. However, public key encryption has limits: the amount of data you can encrypt with this method is limited by the key size you use, and after a point increasing the key size isn’t a practical option. We solve this problem using a combination of public key encryption and symmetric-key encryption.
First a little theory; and when I say “a little” I mean it. Under the hood, encryption is a headache-inducing branch of mathematics. If you want the literal truth, there is plenty of good reading out there.
Symmetric-key cryptography is what people tend to think of when, and if, they think of encryption. A password is used to encrypt some information, and that same password must be entered to retrieve the information. Under the hood is an algorithm, or cipher, which, simply put, is a mathematical function that transforms the data into something obscure and then back again. For our purposes we need a block cipher. A block cipher takes a small chunk of data, typically 64 or 128 bits, and encrypts it. There are many, but in this example we’ll use the Advanced Encryption Standard (AES), which is the de facto standard and works on 128-bit blocks.
(There also exist stream ciphers, which work on streams of data, encrypting a phone call for example, but they trade security for speed and can be difficult to use correctly.)
We also need a little glue. Because block ciphers operate on small chunks of data, they need to be applied again and again. However, given the same input data (and the same key) the cipher will always produce the same encrypted output; any redundancies in the input will be exposed as redundancies in the output and make it vulnerable to a number of attacks. To avoid this we use a mode of operation called Cipher Block Chaining (CBC). CBC uses data from one block to further obfuscate the data in the next, effectively hiding any redundancies.
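To make the redundancy problem concrete, here is a small sketch using Ruby’s OpenSSL bindings. It encrypts two identical 16-byte blocks, first with the naive block-by-block mode (known as ECB) and then with CBC. The helper name `aes_encrypt` and the sample plaintext are my own, purely for illustration.

```ruby
require 'openssl'

# Encrypt `plaintext` with AES-256 in the given mode ("ecb" or "cbc").
# Hypothetical helper, not part of any library.
def aes_encrypt(mode, key, iv, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-#{mode}")
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv if mode == 'cbc'   # ECB does not use an IV
  cipher.update(plaintext) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)   # 256-bit key
iv  = OpenSSL::Random.random_bytes(16)   # 128-bit IV
plaintext = 'A' * 32                     # two identical 16-byte blocks

ecb = aes_encrypt('ecb', key, iv, plaintext)
cbc = aes_encrypt('cbc', key, iv, plaintext)

# Compare the first two 16-byte ciphertext blocks in each mode.
ecb_repeats = ecb[0, 16] == ecb[16, 16]
cbc_repeats = cbc[0, 16] == cbc[16, 16]

puts "ECB blocks repeat: #{ecb_repeats}"
puts "CBC blocks repeat: #{cbc_repeats}"
```

In ECB the repeated input block shows up as a repeated output block; with CBC the chaining hides it.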
While simple and secure, using symmetric-key cryptography can be problematic; everyone needs to know the password to encrypt data, and everyone who has the password (or, as the pros say, shared secret key) can decrypt data. This works well if you have a small set of people who need to know the password and a secure way to distribute it (over drinks in a dark corner of a seedy bar is poetic, if not necessarily secure), but in the case of a web site with hundreds or thousands of people entering data, it’s not practical.
Public-key cryptography can be thought of as symmetric-key encryption with two passwords, or keys, called a key pair: one, the public key, encrypts data and the other, the private key, decrypts it. Because the public key cannot be used to decrypt the data it encrypts, it can be safely given out or installed on a web site, allowing anyone to encrypt data to be sent to the owner of the key. The private key is kept safe and is typically itself encrypted with a symmetric cipher and an additional password.
The solution is actually quite simple. We generate a random password and use that for the symmetric-key encryption. We then encrypt the random password using the public key, and store both the encrypted password and the encrypted data. When we need to get at the data, we use the private key and its password to decrypt the random password which, in turn, is used to decrypt the data.
Well, it’s almost that simple. In order to randomize the data, the CBC glue requires an initialization vector (IV). This article has a good explanation of why, but for our purposes we can just think of it as a second random password we need to encrypt and save.
OK, enough talk, let’s encrypt some text:
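A minimal sketch of this step with Ruby’s OpenSSL bindings, assuming AES-256 in CBC mode. The sample text in `plain_data` and the variable names `random_key` and `random_iv` are placeholders of mine; `encrypted_data` matches the name used in the text.

```ruby
require 'openssl'

plain_data = 'Some sensitive text'   # placeholder input

cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt                       # switch the cipher into encryption mode

# random_key / random_iv generate random values AND set them on the cipher.
random_key = cipher.random_key       # 32 bytes (256 bits)
random_iv  = cipher.random_iv        # 16 bytes (128 bits)

# update processes the data; final flushes and pads the last block.
encrypted_data = cipher.update(plain_data) + cipher.final
```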
At this point we could just save encrypted_data in the database and it would be well protected. So well, in fact, that we couldn’t get it back. To do that we’re going to need to save the random password and IV.
Generate a key pair. Be sure to choose a good password as this is the one that will decrypt everything.
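One way to do this without leaving Ruby is sketched below. This is an assumption on my part about tooling: the `openssl` command-line tool can do the same job (`openssl genrsa -des3 -out private.pem 2048`). The password shown is a placeholder, and the PEM is kept in a string here; in practice you would write it to a file and guard it.

```ruby
require 'openssl'

password = 'correct horse battery staple'   # placeholder; pick your own

# Generate a 2048-bit RSA key pair.
key_pair = OpenSSL::PKey::RSA.new(2048)

# Export the private key as PEM text, encrypted under the password.
pem_cipher  = OpenSSL::Cipher.new('aes-256-cbc')
private_pem = key_pair.to_pem(pem_cipher, password)

puts private_pem.lines.first
```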
Now extract the public key:
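Again as a Ruby sketch rather than a definitive recipe (the command-line equivalent is `openssl rsa -in private.pem -out public.pem -outform PEM -pubout`). The freshly generated `key_pair` stands in for your own key pair so that the example runs by itself.

```ruby
require 'openssl'

key_pair = OpenSSL::PKey::RSA.new(2048)   # stand-in for your own key pair

# The public half only, serialised as PEM text. This is safe to deploy
# on the web server; it cannot decrypt anything.
public_pem = key_pair.public_key.to_pem

puts public_pem.lines.first
```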
See my previous article for more details on what we’re doing here.
Now we can use the public key to encrypt the random key and IV:
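A sketch of this step, with throwaway stand-ins for the public key and the random key and IV so the block runs without any files. The names `encrypted_key` and `encrypted_iv` follow the text; everything else is illustrative.

```ruby
require 'openssl'

# Stand-ins: a public key loaded from PEM, plus a random AES key and IV.
public_pem = OpenSSL::PKey::RSA.new(2048).public_key.to_pem
public_key = OpenSSL::PKey::RSA.new(public_pem)

cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
random_key = cipher.random_key
random_iv  = cipher.random_iv

# Both values are far smaller than the RSA size limit, so they fit easily.
encrypted_key = public_key.public_encrypt(random_key)
encrypted_iv  = public_key.public_encrypt(random_iv)
```

With a 2048-bit key, each encrypted value comes out as 256 bytes of binary data regardless of the input size.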
Now if we store all three pieces (encrypted_key, encrypted_iv, and encrypted_data), we have successfully encrypted our original data.
Of course we’ll want to get that data back, and to do so we reverse the process:
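The reverse direction might look like the sketch below. The first half only exists to make the example self-contained (it builds a key pair and the three encrypted pieces); the part after the second comment banner is the actual decryption. The password and sample text are placeholders.

```ruby
require 'openssl'

password = 'correct horse battery staple'   # placeholder

# --- Setup: key pair and the three encrypted pieces ---
key_pair    = OpenSSL::PKey::RSA.new(2048)
private_pem = key_pair.to_pem(OpenSSL::Cipher.new('aes-256-cbc'), password)
public_key  = key_pair.public_key

cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
encrypted_key  = public_key.public_encrypt(cipher.random_key)
encrypted_iv   = public_key.public_encrypt(cipher.random_iv)
encrypted_data = cipher.update('Some sensitive text') + cipher.final

# --- Decryption: reverse the process ---
# Unlock the private key with its password.
private_key = OpenSSL::PKey::RSA.new(private_pem, password)

decipher = OpenSSL::Cipher.new('aes-256-cbc')
decipher.decrypt
# Recover the random key and IV with the private key...
decipher.key = private_key.private_decrypt(encrypted_key)
decipher.iv  = private_key.private_decrypt(encrypted_iv)
# ...then use them to decrypt the data itself.
decrypted_data = decipher.update(encrypted_data) + decipher.final

puts decrypted_data
```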
password is the password you used when generating the key-pair.
Now let’s put it all together in an Active Record model:
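What follows is a sketch of the shape such a model could take, written as a plain Ruby class so it runs without Rails. It is not the article’s model: the class name `SensitiveRecord` and the constructor are hypothetical, the keys are passed in directly instead of coming from APP_CONFIG, and in a real Active Record model the three `encrypted_*` attributes would be columns with `encrypt_sensitive` hooked in as a `before_save` callback.

```ruby
require 'openssl'

# Hypothetical stand-in for an Active Record model.
class SensitiveRecord
  attr_accessor :plain_data, :encrypted_data, :encrypted_key, :encrypted_iv

  def initialize(public_pem, private_pem)
    @public_pem  = public_pem    # assumption: PEM strings handed in directly
    @private_pem = private_pem
  end

  # Encrypt plain_data when present; otherwise leave stored values untouched,
  # so a record can be updated without decrypting and re-encrypting.
  def encrypt_sensitive
    return if plain_data.nil? || plain_data.empty?
    public_key = OpenSSL::PKey::RSA.new(@public_pem)
    cipher = OpenSSL::Cipher.new('aes-256-cbc')
    cipher.encrypt
    self.encrypted_key  = public_key.public_encrypt(cipher.random_key)
    self.encrypted_iv   = public_key.public_encrypt(cipher.random_iv)
    self.encrypted_data = cipher.update(plain_data) + cipher.final
  end

  # Decrypt with the private key password and populate plain_data.
  def decrypt_sensitive(password)
    return nil if encrypted_data.nil?
    private_key = OpenSSL::PKey::RSA.new(@private_pem, password)
    cipher = OpenSSL::Cipher.new('aes-256-cbc')
    cipher.decrypt
    cipher.key = private_key.private_decrypt(encrypted_key)
    cipher.iv  = private_key.private_decrypt(encrypted_iv)
    self.plain_data = cipher.update(encrypted_data) + cipher.final
  end

  # Explicitly wipe the encrypted values (follow with a save in Rails).
  def clear_sensitive
    self.encrypted_data = self.encrypted_key = self.encrypted_iv = nil
  end
end

# Usage with a throwaway key pair and placeholder password.
password = 'correct horse battery staple'
key_pair = OpenSSL::PKey::RSA.new(2048)
record = SensitiveRecord.new(
  key_pair.public_key.to_pem,
  key_pair.to_pem(OpenSSL::Cipher.new('aes-256-cbc'), password)
)
record.plain_data = '4111 1111 1111 1111'
record.encrypt_sensitive
record.plain_data = nil                    # simulate reloading the record
puts record.decrypt_sensitive(password)
```

Storing the raw binary values assumes binary columns; Base64-encoding them first is an easy variation if your columns are text.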
When creating or updating your model you don’t have to do anything; if “plain_data” is present it will be automatically encrypted. When you want to view the plain text you call “@record.decrypt_sensitive(‘passwd’)”; that could be done with a little AJAX that prompts for a password and populates the “plain_data” field. The encrypted data is only updated when “plain_data” is present. This is done so that the record can be updated without decrypting (and re-encrypting) encrypted data (handy in an application where not everyone has access to the sensitive data). To actually clear the encrypted data call “@record.clear_sensitive” and then save.
Setting up the APP_CONFIG hash is left as an exercise for the reader.
Clearly, this screams for a plugin; watch this space.
In Encrypting Sensitive Data with Perl I wrote about how to use public key encryption to automatically and securely encrypt information with Perl. This allows you to encrypt things like credit card numbers, bank routing information, or that winning PowerBall number in an unattended fashion. Typically, you would use this in a situation where a user needs to enter sensitive information into a form, and that information needs to be stored in a secure manner. We can do this with Ruby (on Rails) as well, and it’s even easier.
First we need to generate a key pair. This creates two keys: a public key, which will only be used to encrypt data, and a private key, which will only be used to decrypt data. The private key is protected by a password known only to us. When it comes to choosing strong passwords, I suggest using Diceware. 2048 is the key size in bits. Bigger is better, but also slower; 2048 is considered a good trade-off between speed and encryption strength. The key size also limits us to encrypting at most 2048 bits (slightly less in practice); more on this below.
Then we extract the public key:
Once we have the keys, we can encrypt data using the following:
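A minimal sketch of such a function. The name `encrypt_public` is my own; the parameter names follow the text. The demo underneath writes a throwaway public key to a temporary file purely so the example runs by itself.

```ruby
require 'openssl'
require 'base64'
require 'tempfile'

# Encrypt `string` with the public key stored at `public_key_file`
# and return the result as Base64 text.
def encrypt_public(public_key_file, string)
  public_key = OpenSSL::PKey::RSA.new(File.read(public_key_file))
  encrypted  = public_key.public_encrypt(string)   # raw binary
  Base64.encode64(encrypted)                        # text-safe form
end

# Demo setup: a throwaway key pair, public half written to a temp file.
key_pair = OpenSSL::PKey::RSA.new(2048)
key_file = Tempfile.new('public.pem')
key_file.write(key_pair.public_key.to_pem)
key_file.close   # flush to disk; the file stays until it is unlinked

encrypted_string = encrypt_public(key_file.path, 'Hello World!')
puts encrypted_string
```

Because of the random padding, encrypting the same string twice gives different output each time.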
Simply, public_key_file is the path to the file containing the public key, and string is the string to encrypt. We open the public key and then use public_encrypt to encrypt the string. Because the encrypted string is binary I have converted it to text using Base64. If you are storing the encrypted string in a database that can hold binary data, you could change:
Now that we have encrypted data, we’ll want to be able to get it back.
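A matching sketch for decryption. Again the function name `decrypt_private` is my own and the parameter names follow the text; the password is hard-coded only to keep the demo short, and the key pair and temporary file are throwaway stand-ins.

```ruby
require 'openssl'
require 'base64'
require 'tempfile'

# Decrypt a Base64-encoded `encrypted_string` using the password-protected
# private key stored at `private_key_file`.
def decrypt_private(private_key_file, password, encrypted_string)
  private_key = OpenSSL::PKey::RSA.new(File.read(private_key_file), password)
  private_key.private_decrypt(Base64.decode64(encrypted_string))
end

# Demo setup: a throwaway, password-protected private key in a temp file.
password = 'correct horse battery staple'   # placeholder
key_pair = OpenSSL::PKey::RSA.new(2048)
key_file = Tempfile.new('private.pem')
key_file.write(key_pair.to_pem(OpenSSL::Cipher.new('aes-256-cbc'), password))
key_file.close

encrypted_string = Base64.encode64(key_pair.public_encrypt('Hello World!'))
decrypted_string = decrypt_private(key_file.path, password, encrypted_string)
puts decrypted_string
```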
Here private_key_file is the path to the file containing the private key, password is the password protecting it, and encrypted_string is the string to decrypt. In a real application you would not want to hard-code the password; rather, you should prompt for it in some way.
Again we are using Base64 to make the encrypted string human readable. If this is not necessary, change:
As noted above, you cannot use this method to encrypt anything larger than the key size minus 11 bytes of overhead (padding). In this case we have a 2048-bit key, which gives 256 - 11 = 245 bytes. The temptation is to increase the key size to accommodate more data, but this quickly becomes too slow to be useful. The correct way to accomplish this is to use public key encryption to encrypt a random password which, in turn, is used to encrypt the data using symmetric-key encryption. I’ll cover this next time.
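The limit is easy to check for yourself. This sketch assumes the default padding used by `public_encrypt` (PKCS#1 v1.5, which accounts for the 11 bytes) and a throwaway 2048-bit key.

```ruby
require 'openssl'

key_pair  = OpenSSL::PKey::RSA.new(2048)
max_bytes = 2048 / 8 - 11   # 256-byte key minus 11 bytes of padding

# Exactly at the limit: this works.
fits = key_pair.public_encrypt('x' * max_bytes)

# One byte over the limit: OpenSSL refuses.
too_long_rejected =
  begin
    key_pair.public_encrypt('x' * (max_bytes + 1))
    false
  rescue OpenSSL::OpenSSLError
    true
  end

puts "max plaintext bytes: #{max_bytes}"
puts "one byte more rejected: #{too_long_rejected}"
```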
It’s not uncommon to have information submitted through a web form that you need to save, but don’t want to have lying around in plain text: credit card numbers, bank routing information, missile launch codes, and so on. The trick is to do this in an unattended fashion; you don’t want to have the person submitting the form do anything special such as supply a password. Enter public key encryption.
In public key encryption there are two passwords, or keys: one which is used to encrypt information and one which is used to decrypt (there are additional ways to use the key pair, but that’s a topic for another day). Since the encryption key cannot be used to decrypt sensitive data, it can safely be made public. So in the case of a web form, we can make the public key available to our CGI, while protecting the private key for our use only.
The first step is to generate a key pair and password protect the private key. For this we’ll use OpenSSL which comes pre-installed on just about every Unix-like system (including OS X). OpenSSL provides a wide range of cryptographic functions including an implementation of the RSA public key encryption algorithm.
First we generate the private key. “2048” is the size of the key in bits, and, in this simple example, it controls the maximum number of bits we can encrypt. For more security, at the cost of more processor overhead, you can increase the size, but you shouldn’t use a smaller number. We’ll need a password; I like to use Diceware, but you can generate it any way you like.
Then we extract the public key:
Now a bit of code to encrypt a string using the public key:
The function encryptPublic takes the path of the public key and a string to encrypt and returns the encrypted string. Because the encrypted string is binary it’s converted to text using Base64 to make it easier to handle. This is certainly not necessary, and if you were storing the string in a database that has a binary type you could change the last line of the function to:
The code requires the CPAN module “Crypt::OpenSSL::RSA” which is a wrapper around the OpenSSL libraries.
Now to decrypt:
Here we have the function decryptPrivate, which takes the path of the private key file, the private key password, and the (Base64 encoded) encrypted string and returns the decrypted string. The process is a bit more complex than encryption and, due to the limitations of “Crypt::OpenSSL::RSA”, we have to use an additional CPAN module, “Convert::PEM”.
“Crypt::OpenSSL::RSA” lacks the ability to unlock (decrypt) the private key. Its decrypt function expects to receive an already decrypted copy of the key. Fortunately for us, “Convert::PEM” can decrypt the private key and return it in a format we can use.
As with encryption, you do not need to use Base64 encoded strings. Simply replace the line:
On key size: as I said above, the amount of data you can encrypt this way is limited by the key size minus 11 bytes of overhead (padding); here a 2048-bit key gives us 256 - 11 = 245 bytes. You could handle larger data by increasing the key size, but that would entail a potentially large performance hit and is not how key pairs are meant to be used. Instead you would generate a random password, use it to encrypt the data using symmetric-key encryption such as Triple DES or Blowfish, then use the public key to encrypt and store the random password.
One last note: there is another Perl module, “Crypt::RSA”, which is a pure Perl implementation of RSA public key encryption. On the plus side, it doesn’t require OpenSSL to be installed and it has a much more complete API, including better key handling. On the minus side, while fast for Perl, it’s considerably slower than OpenSSL and cannot take advantage of encryption hardware, something that OpenSSL does automatically.