Encryption of Graphics Files

Cryptography is the technology of keeping information secret. In this context, we define secret as "being protected from unauthorized access and attack." Although you may not think of your graphics files or their contents as ever being under attack, you may want to keep the information contained in these files from being copied or viewed by unauthorized people or computers. If copies of the files are freely available, the only way to keep the files secret is to encrypt them.

Cryptography may seem to be a black art requiring extremely complex mathematics and access to supercomputers. This may be the case for professional cryptanalysts (codebreakers). But for ordinary people who need to protect data, cryptography can be a strong, often simple to use, and sometimes freely available tool.

This section doesn't try to explain cryptography, nor the details of particular cryptosystems. (Refer to the books in the "For Further Information" section for basic information in this area.) Instead, we will look at why you, as an author, archiver, transporter, or user of graphics files, may need to encrypt graphics files.

First, let's look at the general problem of protecting graphics files.

Protecting Graphics Files

Many businesses and organizations, such as those associated with medical and document imaging, worry about users (most often programmers) finding ways to alter the contents of their graphics and image files. Such alterations might be used to change an X-ray photograph, discover a debit card number, or forge a handwritten signature on the bitmap of a check. In addition to the human consequences, such actions may result in lawsuits against the company whose process or equipment originally created the graphics file--not a pleasant prospect for any corporation.

How can you protect against alterations of this kind?

Physical protection

The initial defense against unauthorized alterations is to deny physical access to your graphics files. How would you do this? When a new image is saved, store it in a file that is physically and directly inaccessible to the user. When an image is needed, decrypt (and possibly decompress) it, then display or print it, but never make the file itself directly accessible.

Unfortunately, this approach is not feasible for operations requiring that the image data be transmitted along unsecured channels, or possibly stored on an unsecured system. In such cases, we must instead find a way to make the contents of our graphics files secure while allowing the files to be stored on any computer system, even one that is not secure.

Proprietary file formats

Files can be difficult, if not impossible, to read if you do not know the format in which the file's data is stored. If we use a well-published graphics file format to store our data, a programmer could trivially write a program that makes unauthorized use of our data. So, it would seem that creating a proprietary graphics file format, one whose internals are kept secret, would go a long way toward protecting our data. The problem with this method is that the skill and determination of the people who want to view and alter your graphics files may be much greater than your own.

Even general information can be enough on which to base an attack. For example, somebody who knows that your file is a bitmap will also know that the vast majority of bitmap files contain a fixed-size header followed by the image data. All bitmap headers contain typical information, such as the size of the bitmap and the number of bits per pixel. Inventing a header from scratch (or worse, basing your new format on an already well-known format) and following it with data encoded using a conventional data compression scheme (or worse, no compression at all) will not slow down a determined file format cracker for very long. The cracking of Kodak's Photo CD format, for example, came about through the exploitation of a very few bits of general information.

We can assume that the most useful contents of even an unknown graphics file header can be deduced. What about the image data itself? Bitmap data is usually stored as pixels that are 1, 8, or 24 bits in size. The pixel values are typically indexed using a color table, or stored directly using a three-color model, such as RGB or CMY. Only a few formats store bitmap data otherwise, so discovering the format of the bitmap may be just as easy as deducing the contents of the header.

Proprietary compression methods

What about using compression as a means of obscuring data? Certainly a bitmap can be compressed, and compression does obscure the apparent format of the bitmap. But once the starting byte of the bitmap is identified, it is a simple matter to try any of a dozen or so common algorithms to decompress the data. It may even be possible to identify the type of compression method used by looking at a hex dump of a section of the bitmap that is a single color. And, although the scan lines of the bitmap may not be stored in sequential order (such as in interlaced GIF files), that will not prevent the method of compression from being discovered.
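The trial-and-error attack described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "proprietary" file layout (a 16-byte header followed by compressed pixels) is invented for the example, and the toolbox holds only the three decompressors that ship with the Python standard library, where a real cracker would also try LZW, RLE variants, CCITT, and so on.

```python
import bz2
import gzip
import zlib

# Candidate decompressors. Each one rejects data that does not begin
# with its own recognizable header, which is what makes guessing cheap.
CANDIDATES = {
    "zlib": zlib.decompress,
    "gzip": gzip.decompress,
    "bz2": bz2.decompress,
}

def guess_compression(blob, max_offset=64):
    """Try every candidate at every starting offset up to max_offset.

    Returns (method, offset, data) for the first combination that
    decompresses cleanly, or None if nothing works.
    """
    for offset in range(min(max_offset, len(blob))):
        for name, decompress in CANDIDATES.items():
            try:
                data = decompress(blob[offset:])
            except Exception:
                continue  # wrong method or wrong starting byte
            if data:
                return name, offset, data
    return None

# Hypothetical secret format: 16 bytes of undocumented header followed
# by the compressed bitmap (here, 4096 pixels of a single color).
pixels = bytes([200]) * 4096
secret_file = b"MYFORMAT\x00\x01\x00\x40\x00\x40\x08\x00" + zlib.compress(pixels)

result = guess_compression(secret_file)
if result is not None:
    method, offset, data = result
    print(method, "data starts at byte", offset, "and expands to", len(data), "bytes")
```

The search needs no knowledge of the header at all; it simply slides forward until some decompressor stops complaining.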

One way to get around this is to use a compression method that a file format cracker won't have in her toolbox. Unfortunately, data compression algorithms are designed only to make your data physically smaller, and not to secure your data from prying eyes (or in-circuit emulators). Of course, a few unpublished, proprietary methods of compression, such as fractal compression, are also considered forms of data encryption because the details are kept secret. These methods derive their security from their complexity, however, not from the use of robust algorithms. It's only a matter of time before they are cracked. Also, note that these are extremely complex encoding algorithms; if you don't know how to decompress the data, you certainly can't use it in a practical way.

It is doubtful that the average business--or even the above-average programmer--has the resources to invent a method of compression that is radically different from, yet just as good as, JPEG, LZW, or CCITT Group 4. You can alter an existing, published compression algorithm to "break" it in an undocumented way, thereby rendering it undecodable by conventional means. However, doing so risks decreasing the efficiency of the compression method and leaving your files larger in size, and possibly slower to decompress, than they otherwise would be. You also risk giving yourself a false sense of security.

In summary, proprietary file formats and bizarre compression methods are not secure methods of protecting your data. We recommend that you consider encryption instead.

Why Encrypt Graphics Files?

There are a number of possible reasons why you might want to encrypt your graphics files:

To hide the contents of the file from unauthorized users. Most people who are interested in file encryption will answer this way. Suppose that you have some drawings or images that you want to keep most people from seeing. Simply hiding the files does not provide enough security for your purposes. Or, you may want a second line of defense in case somebody does discover your hidden files. Often, keeping your graphics files inaccessible by locking them up isn't an option anyway, because the files must be distributed.
To prevent the file from being illegally copied. Encryption schemes can't actually prevent files from being copied, but they can help ensure that unauthorized copies of files are useless to the person who copies them. Files distributed on a CD-ROM are often encrypted; they are useless until a key is used to decrypt them. The key is usually given to the user when he registers the product. Encrypted files transmitted via modem are still vulnerable to wiretapping, but without the key they are useless to the wiretapper. (Of course, remember that if the key is discovered as well, the files may be decrypted.)
To disguise the type of file. This case is a bit esoteric, but it's possible that you may not want others to know what kind of data the file contains. Obscuring not only the file's data, but also the type of data the file contains, adds a small bit of additional difficulty to cracking the file via certain cryptanalytic attacks.
To detect whether a file is corrupt. Corruption of encrypted files is easily detected when the files are decrypted. Any alteration in the encrypted data, even to the extent of changing only a single byte, causes the decryption process to fail. However, this method can only indicate that the file is corrupt, not where in the file the corruption occurred. If error detection is all you need, digital signatures or simple checksums work better.
Corruption within a file can also be detected when a file is decompressed. Most graphics files, however, are written in a format that supports only the compression of bitmap data. Corrupt data in uncompressed parts of the file, such as the color tables and the header, would go undetected. Many formats (e.g., BMP) routinely store bitmap data uncompressed; others (e.g., most vector formats) do not support compression at all.
While you may opt for a simpler method of data error detection in your graphics files (such as storing your files using an archiving utility that supports error detection, as we discuss earlier in this chapter in the section called "Corruption of Graphics Files"), you might also find encryption helpful for this purpose, especially if you are not at liberty to alter the file's contents.
To prevent the contents of the file from being altered. Encryption cannot actually prevent a file from being modified; only the security of a file storage system can do that. But, as we've discussed, encrypted files that have been changed are easily detected. Encrypting your files and publicizing this fact may also provide a deterrent that keeps unauthorized people from attempting to modify your graphics files, or even attempting to locate your hidden files in the first place. Of course, advertising such a fact may present a clear challenge to certain people. (After all, if your program can decode your data, then such people may have a shot at cracking your encryption by examining the internals of your software.)
To identify the person or program that created the file. Encryption can also be used to create a digital signature or "fingerprint" associated with a particular file. A digital signature not only verifies the person or program that created the file, but can also include a time stamp of when the signature was created. Most encryption systems allow you to generate a digital signature without actually encrypting the data.
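As a point of comparison for the corruption-detection reason above, here is a minimal sketch of the "simple checksum" alternative. It assumes the checksum is stored somewhere alongside the file; the sample data and the helper names (checksum, is_intact) are made up for the illustration. Note that a CRC catches accidental damage only; someone who alters the file deliberately can simply recompute it.

```python
import zlib

def checksum(data):
    """CRC-32 of the whole file, covering the header and color table
    as well as the bitmap data."""
    return zlib.crc32(data) & 0xFFFFFFFF

def is_intact(data, expected_crc):
    """True if the data still matches the checksum recorded earlier."""
    return checksum(data) == expected_crc

# Stand-in for the bytes of a graphics file.
original = bytes(range(256)) * 16
stored_crc = checksum(original)

# Simulate corruption: flip a single bit somewhere in the file.
damaged = bytearray(original)
damaged[1000] ^= 0x01
damaged = bytes(damaged)

print("original intact:", is_intact(original, stored_crc))
print("damaged intact: ", is_intact(damaged, stored_crc))
```

Like decryption failure, the checksum reports only that something changed, not where.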

Pros and Cons of Cryptography

Before we look in any more detail at what cryptography is, let's discuss what it is not.

Misconceptions about cryptography

Cryptography is not a form of data compression. Many cryptographic systems do use data compression algorithms to compress your data before they encrypt it. However, this step is often performed not only to reduce the physical size of the data, but also to remove redundancies in the data that might make the file easier to crack. This is the reason why many encrypted files are physically smaller than the original unencrypted files.

Cryptography is not copy protection. Encrypted files may be copied from CD-ROMs, floppies, and hard disks just as easily as any other files can be. Copy protection schemes usually physically alter the media the files are stored on, or format the media in some non-standard way (such as Microsoft's DMF disk format). And while some copy protection schemes may make use of file encryption, it is not the encryption itself that prevents the files from being copied. In this case, encryption can only ensure that if a file is copied, that file will be unusable.

Benefits of cryptography

What are the benefits of using cryptography?

Cryptography is specifically designed to allow you to get at your data while preventing unauthorized people from doing the same.
Encryption allows you to work with standard graphics files as your unencrypted data. This eases the burden on the authorized users of the file. It also releases you from the foolish and quixotic task of inventing yet another proprietary file format. Inventing a proprietary format will add nothing to the security of your encrypted file anyway.
The encrypted files need not be hidden or copy-protected to be secure. If you choose, the file may be freely available for copying and access.
Encryption algorithms are much harder to crack than either proprietary file formats or data compression algorithms. It is also less likely that unauthorized people will even attempt to obtain your files if they know that they are encrypted.
Your files are secure for as long as the encryption algorithm itself remains unbroken and your decryption keys remain undiscovered.

These are all very good reasons to use cryptographic technology to secure your data. Are there any reasons not to use encryption? What problems could there be with encrypting your graphics files? To answer this we have to briefly look at two basic systems of cryptography called private key cryptography and public key cryptography.

Private and public key cryptography

If you have ever used a secret password to log into a computer system or network, withdraw money from an automatic teller machine, or gain entrance to a secret meeting, you have probably used a private key cryptographic system.

Private key systems use a single password or pass phrase to both encrypt and decrypt a piece of information. Both the person encrypting the information and all of the people authorized to decrypt the information must have the key. If an unauthorized person acquires both the encrypted information and the key, and if they know the encryption algorithm that is being used, they can easily access your information.
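The single-key idea can be shown with a deliberately simple toy cipher. This is a sketch for illustration only and is not secure: it is not the algorithm used by PGP or any other real product, and the function names and pass phrases are invented. It stretches a pass phrase into a stream of bytes with SHA-256 and XORs that stream with the data, so the same pass phrase both encrypts and decrypts.

```python
import hashlib

def keystream(pass_phrase, length):
    """Stretch a pass phrase into `length` bytes by hashing the phrase
    together with a running counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(
            pass_phrase.encode("utf-8") + counter.to_bytes(8, "big")
        ).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])

def toy_crypt(data, pass_phrase):
    """XOR the data with the keystream. Running it a second time with
    the same pass phrase undoes the first pass."""
    stream = keystream(pass_phrase, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

image = b"GIF89a" + bytes(100)          # stand-in for a graphics file
ciphertext = toy_crypt(image, "correct horse")
recovered = toy_crypt(ciphertext, "correct horse")
wrong = toy_crypt(ciphertext, "wrong guess")

print("same pass phrase recovers the file:", recovered == image)
print("different pass phrase recovers it: ", wrong == image)
```

Everyone who needs to read the file must hold the same pass phrase, which is exactly the distribution problem discussed next.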

Public key systems use an algorithm to generate two mathematically related keys. Data encrypted with one key (the public key) can only be decrypted with the other key (the secret key) and the secret key's pass phrase. The data is secure as long as no one else has both a copy of your secret key and its pass phrase (and as long as you don't leave decrypted copies of your files around).

Public key systems have some tremendous advantages over private key systems. Suppose you transmit some private-key-encrypted files to a friend across the country. Your friend can't decrypt these files until you also give her the same password you used to encrypt them. But, how do you securely tell her what the password is? If the password is intercepted along with your files, your data will be in unauthorized hands.

If, on the other hand, you use a public key system, you only need your friend's freely available public key to encrypt the files. Once your friend receives the files, she uses her own secret key and her secret key's pass phrase to decrypt them. No secret pass phrase need be sent with the files. Moreover, your friend's public key (and indeed your own) need not be hidden. In fact, you want as many people to have your public key as possible so they will be able to send encrypted files to you!
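The exchange with your friend can be sketched with textbook RSA and a toy key pair. Everything here is an assumption made for illustration: the primes are absurdly small, each byte is encrypted on its own, and no padding is used, so this shows only the relationship between the two keys and is in no way how PGP itself encrypts a file.

```python
# Your friend's toy key pair, built from two tiny primes. Real keys
# are built from primes hundreds of digits long.
p, q = 61, 53
n = p * q                           # modulus, shared by both keys
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # secret exponent (needs Python 3.8+)

public_key = (e, n)    # published freely
secret_key = (d, n)    # never leaves your friend's machine

def encrypt(data, key):
    """Encrypt each byte separately with the given key."""
    exponent, modulus = key
    return [pow(byte, exponent, modulus) for byte in data]

def decrypt(blocks, key):
    """Reverse encrypt() using the other key of the pair."""
    exponent, modulus = key
    return bytes(pow(block, exponent, modulus) for block in blocks)

# You need only the public key to encrypt.
blocks = encrypt(b"GIF89a", public_key)

# Only the holder of the secret key can get the data back.
recovered = decrypt(blocks, secret_key)
print(recovered)
```

Nothing secret travels with the encrypted data; the only key you ever handled was the public one.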

Risks of cryptography

Cryptography is not without its risks and problems:

The more people who know your password, or pass phrase, and have access to your secret keys, the less secure your data is. This is even more true if your encrypted data is easily available and the method of encryption is widely known.
Even if your keys and pass phrases are secure, your chosen method of encryption may not be. In theory, nearly every data encryption algorithm can be defeated by at least one method, if only an exhaustive search of all possible keys, although it may take hundreds of computers and many years to fully exploit that method.
So too, any given implementation of a cryptographic system may not be ideally secure. Even the most theoretically robust of encryption methods, if implemented using bad design and with buggy code, can be insecure.
Encryption always imposes a performance penalty. Public key systems are especially slow to encrypt and decrypt data. You may decide that your application can't afford the extra overhead to use a particular method of encryption.
Almost all forms of encryption systems are patented. Even those that are freely available still require a license for commercial use.
Many import and export restrictions also exist that could make it more difficult to distribute and sell your software.[1]

[1] (Simson Garfinkel's PGP: Pretty Good Privacy, referenced in "For Further Information" below, contains a complete discussion of patent and export issues.)

Assuming that you can live with these shortcomings of encryption systems (or perhaps you simply want to play around with some of the technology), how can you actually encrypt your files? One tool that's easy to get and use is PGP (Pretty Good Privacy).

Using PGP to Encrypt Graphics Files

PGP is a robust public key system, invented by Phil Zimmermann, which is used to securely encrypt and decrypt files. PGP is also a customizable software tool that is capable of creating and managing public and private keys, creating digital signatures, and being integrated into software programs, such as text and graphics file editors and email applications. PGP is also freely available for noncommercial use.

How would you use PGP to encrypt a graphics file? This depends upon whether you wish the resulting encrypted data to be stored as binary or ASCII data. Using the following command, PGP will prompt you for a pass phrase, encrypt the file using conventional (single-key) encryption, and store the encrypted data to a file as binary data:

    unix% pgp -c private.gif

This command causes PGP to read the file private.gif and to create a new file called private.pgp. The contents of private.pgp are a spew of binary data that is an encrypted representation of the data stored in private.gif. If you look at private.pgp in a text editor, you will see unintelligible binary data.

If you prefer that the encrypted data be stored in ASCII character format, PGP can produce the encrypted file using ASCII character data:

    unix% pgp -ca private.gif

PGP now creates a file called private.asc. This file contains an encrypted version of private.gif using only ASCII characters. If you look at private.asc in a text editor, you will see lines of printable characters enclosed between a "-----BEGIN PGP MESSAGE-----" line and an "-----END PGP MESSAGE-----" line.

Instead of binary gibberish, you will see your encrypted file encoded as ASCII data using the radix-64 binary-to-ASCII encoding algorithm. Using this algorithm, an encrypted file stored in ASCII format is roughly one-third larger than the same file encrypted and stored using the binary format. This is because radix-64 encodes every three bytes of binary data as four ASCII characters. This expansion is familiar to anyone who has used the uuencode program native to UNIX.
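The three-bytes-to-four-characters arithmetic is easy to check with Python's standard base64 module, which implements the same radix-64 mapping. This sketch measures only the encoding itself; the helper name radix64_length is invented, and PGP's ASCII files are slightly larger again because of the added line breaks, header lines, and checksum.

```python
import base64

def radix64_length(n_bytes):
    """Characters needed to encode n_bytes: four for every group of
    three bytes, with a final partial group padded out to four."""
    return 4 * ((n_bytes + 2) // 3)

binary = bytes(range(256)) * 3            # 768 bytes of sample data
ascii_form = base64.b64encode(binary)

print(len(binary), "bytes become", len(ascii_form), "characters")
print("growth factor:", len(ascii_form) / len(binary))

# The first three bytes of a GIF file become four printable characters.
print(base64.b64encode(b"GIF"))
```

For the 768-byte sample the encoded form is 1024 characters, an increase of exactly one-third.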

What does this mean to you? You can store your graphics files as binary or ASCII data. The contents of the files will be hidden. In fact, anyone looking at the encrypted data won't be able to tell what kind of files they are. If the files are altered or corrupted in any way, they will not properly decode. And without the correct pass phrase (or, for files encrypted with a public key, the matching secret key and its pass phrase), the file's contents cannot be successfully decrypted.

This way, you can keep people from discovering a file's contents, detect whether an encrypted file has been changed, and even hide the type of data a file contains. But how do you use PGP to verify who created the file? PGP does this by using a digital signature.

A digital signature is built from a numerical value created using an algorithm known as a message digest function. This function reads your file data as input and generates a value (the message digest) that is, for all practical purposes, unique to the data in the file. Changing so much as one bit in the file will cause the digest, and therefore the digital signature, to be different.

Digital signatures are typically used to authenticate the sender of an email message. A message digest is calculated from the message data, encrypted with the secret key of the sender to form the signature, and appended to the message. The message receiver then obtains the public key of the sender, decrypts the signature to recover the digest, and calculates the message digest of the received message. If the calculated digest matches the one recovered from the signature, the receiver may safely assume that the message was sent by the owner of the public key and has not been altered since it was signed.
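The sign-and-verify sequence can be sketched as follows. This is a toy under loudly stated assumptions: SHA-256 stands in for whatever digest function a given PGP version uses, the key pair is textbook RSA built from two small Mersenne primes, the digest is reduced to fit that undersized key, and no padding is applied. The function names are invented for the illustration.

```python
import hashlib

# Toy signing key pair (textbook RSA; far too small for real use).
p, q = 2**61 - 1, 2**89 - 1         # two Mersenne primes
n = p * q
e = 65537                           # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # secret exponent (needs Python 3.8+)

def digest_value(data):
    """Message digest of the data as an integer, reduced modulo n so
    that the toy key can handle it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data, secret_exponent):
    """Encrypt the digest with the signer's secret key."""
    return pow(digest_value(data), secret_exponent, n)

def verify(data, signature, public_exponent):
    """Decrypt the signature with the public key and compare the
    result with a freshly calculated digest."""
    return pow(signature, public_exponent, n) == digest_value(data)

image = b"GIF89a" + bytes(range(256))   # stand-in for a graphics file
signature = sign(image, d)

# Flip a single bit in a copy of the file.
tampered = bytearray(image)
tampered[10] ^= 0x01
tampered = bytes(tampered)

print("original verifies:", verify(image, signature, e))
print("tampered verifies:", verify(tampered, signature, e))
```

Only the holder of the secret exponent can produce a signature that the public exponent will accept, and the file itself is never encrypted in the process.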

You can take any graphics file and sign it using PGP. The signature may be used to verify the creator (human or computer) of the file, and the file's contents need not be encrypted. Here's how you would do it using PGP:

    unix% pgp -s private.gif

This command signs the file private.gif using your secret key and pass phrase. The encrypted signature is appended to the data and stored in the file private.pgp. The data of the file is not encrypted. If you want the file's data to be encrypted, use the following command instead:

    unix% pgp -se private.gif

This command encrypts the file data, signs the file contents, and also places the data in the file private.pgp. The signature is a binary value by default. If you want an ASCII signature, specify:

    unix% pgp -sea private.gif

This command creates a file called private.asc, which contains the encrypted data and an appended ASCII signature. If the input file contains text rather than binary data, you add yet another flag:

    unix% pgp -seat private.dxf

NOTE:
PGP is a self-contained software program that has been ported to many different computers and operating systems. PGP is typically used from the command line. A software application using PGP version 2.x could also use PGP as a separate executable program. PGP version 3.0 promises an API library of tools that may be directly linked into software applications.
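As a rough sketch of the separate-executable approach, an application could assemble one of the command lines shown in this section and start PGP as a child process. The function names below are hypothetical, the sketch assumes a program named pgp on the search path, and it uses only the option letters that appear in the examples above. PGP still prompts for pass phrases on the terminal.

```python
import subprocess

def pgp_command(path, conventional=False, sign=False, encrypt=False,
                ascii_armor=False, text=False):
    """Build the argument list for one of the pgp command lines shown
    in the text, for example ["pgp", "-sea", "private.gif"]."""
    if not (conventional or sign or encrypt):
        raise ValueError("choose conventional, sign, or encrypt")
    flags = ""
    if conventional:
        flags += "c"
    if sign:
        flags += "s"
    if encrypt:
        flags += "e"
    if ascii_armor:
        flags += "a"
    if text:
        flags += "t"
    return ["pgp", "-" + flags, path]

def run_pgp(path, **options):
    """Run pgp as a child process and raise an error if it fails."""
    return subprocess.run(pgp_command(path, **options), check=True)

# Show the command that would be run, without actually running it.
print(" ".join(pgp_command("private.gif", sign=True, encrypt=True,
                           ascii_armor=True)))
```

Keeping the command builder separate from the call that launches the process makes the former easy to check without PGP installed.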

See the PGP reference in the section below for complete information about this program and all its options.

For Further Information About Encryption

The Internet abounds with information about data encryption. A Web search on the keywords "encryption" and "cryptography" will turn up hundreds of hits.

Two major sources of information and references are the USENET newsgroups alt.security.pgp and sci.crypt and their associated FAQs.

In addition, see the following references for information about encryption:

Garfinkel, Simson, PGP: Pretty Good Privacy, O'Reilly & Associates, Sebastopol, CA, 1995.
Schneier, Bruce, Applied Cryptography: Protocols, Algorithms, and Source Code in C, John Wiley & Sons, New York, NY, 1994.
Ferguson, Niels, and Schneier, Bruce, Practical Cryptography, Wiley Publishing, Indianapolis, IN, 2003.
Schneier, Bruce, "Untangling Public Key Cryptography," Dr. Dobb's Journal, May 1992, pp. 16-28.
Schneier, Bruce, "The IDEA Encryption Algorithm," Dr. Dobb's Journal, December 1993, pp. 50-56.
Stevens, A., "Hacks, Spooks and Data Encryption," Dr. Dobb's Journal, September 1990, pp. 127-134, 147-149.