This chapter provides a good deal of somewhat loosely connected information about graphics files, file formats, and file format specifications. We discuss the issues you'll confront when you attempt to read and write graphics files (including examples of code you can use in your own programs). We also describe the use of test files, the corruption and encryption of graphics files, the potential for virus infection in these files, and the issues involved in developing your own file format and in writing the specification for that format, including ways of copyrighting and otherwise protecting your files and file formats.
Contents:
Reading Graphics Data
Writing Graphics Data
Test Files
Corruption of Graphics Files
Encryption of Graphics Files
Viruses in Graphics Files
Designing Your Own Format
Writing a File Format Specification
Trademarks, Patents, and Copyrights
A graphics format file reader is responsible for opening and reading a file, determining its validity, and interpreting the information contained within it. A reader may take, as its input, source image data either from a file or from the data stream of an input device, such as a scanner or frame buffer.
There are two types of reader designs in common use. The first is the filter. A filter reads a data source one character at a time and collects that data for as long as it is available from the input device or file. The second type is the scanner (also called a parser). Scanners are able to randomly access across the entire data source. Unlike filters, which cannot back up or skip ahead to read information, scanners can read and reread data anywhere within the file. The main difference between filters and scanners is the amount of memory they use. Although filters are limited in the manner in which they read data, they require only a small amount of memory in which to perform their function. Scanners, on the other hand, are not limited in how they read data, but as a tradeoff may require a large amount of memory or disk space in which to store data.
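The difference between the two designs can be sketched with the standard C stdio functions. This is only an illustration under simple assumptions: the function names CountBytesAsFilter() and ReadByteAt() are hypothetical, and the scanner-style function assumes a seekable source such as a disk file.

```c
#include <stdio.h>

/*
** Filter style: consume the stream strictly front to back, one
** character at a time, keeping only a running count in memory.
*/
long CountBytesAsFilter(FILE *fp)
{
    long count = 0;

    while (fgetc(fp) != EOF)
        count++;
    return count;
}

/*
** Scanner style: jump to any offset and read the byte stored there.
** Returns the byte value (0-255), or EOF if the seek or read fails.
** This only works on seekable sources such as disk files.
*/
int ReadByteAt(FILE *fp, long offset)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return EOF;
    return fgetc(fp);
}
```

The filter never needs more than one character of storage, but it cannot revisit data it has already consumed. The scanner can reread any byte, at the cost of requiring a source that supports seeking (or a complete copy of the data in memory).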
Because most image files are quite large, make sure that your readers are highly optimized to read file information as quickly as possible. Graphics and imaging applications are often harshly judged by users based on the amount of time it takes them to read and display an image file. One curiosity of user interface lore states that an application that renders an image on an output device one scan line at a time will be perceived as slower than an application that waits to paint the screen until the entire image has been assembled, even though in reality both applications may take the same amount of time.
Binary Versus ASCII Readers

Binary image readers must read binary data written in 1-, 2-, and 4-byte word sizes and in one of several different byte-ordering schemes. Bitmap data should be read in large chunks and buffered in memory for faster performance, as opposed to reading one pixel or scan line at a time.
ASCII format image readers require highly optimized string reader and parser functions capable of quickly finding and extracting pertinent information from a string of characters and converting ASCII strings into numerical values.
The type of code you use to implement a reader will vary greatly, depending upon how data is stored in a graphics file. For example, PCX files contain only little-endian binary data; Encapsulated PostScript files contain both binary and ASCII data; TIFF files contain binary data that may be stored in either the big- or little-endian byte order; and AutoCAD DXF files contain only ASCII data.
Many graphics files that contain only ASCII data may be parsed one character at a time. Usually, a loop and a series of rather elaborate nested case statements are used to read through the file and identify the various tokens of keywords and values. The design and implementation of such a text parser is not fraught with too many perils.
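As a rough illustration of the kind of string handling involved, the following sketch splits one line of text into a keyword and a numeric value. The "KEYWORD value" layout and the function name ParseKeywordValue() are assumptions made for this example only; they are not the syntax of DXF or of any other real format, and a full parser would wrap something like this in the loop and case statements described above.

```c
#include <ctype.h>
#include <stdlib.h>

/*
** Split a line of the form "KEYWORD value" into its keyword and
** its numeric value. Returns 1 on success and 0 if the line does
** not match. The line layout is hypothetical.
*/
int ParseKeywordValue(const char *line, char *keyword, size_t size,
                      long *value)
{
    size_t len = 0;
    char *end;

    while (isspace((unsigned char) *line))      /* Skip leading blanks */
        line++;
    while (*line && !isspace((unsigned char) *line))
    {
        if (len + 1 >= size)                    /* Keyword too long */
            return 0;
        keyword[len++] = *line++;
    }
    if (len == 0)                               /* No keyword at all */
        return 0;
    keyword[len] = '\0';

    *value = strtol(line, &end, 10);            /* ASCII to number */
    return end != line;                         /* Were digits found? */
}
```

Given the line "  WIDTH 640", the function stores "WIDTH" in keyword and 640 in value. A line containing only "HEIGHT" is rejected because no number follows the keyword.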
Where you can find some real gotchas is in working with graphics files containing binary data, such as the contents of most bitmap files. A few words of advice are in order, so that when you begin to write your own graphics file readers and writers you don't run into too many problems (and inadvertently damage your fists with your keyboard!).
When you read most bitmap files, you'll find that the header is the first chunk of data stored in the file. Headers store attributes of the graphics data that may change from file to file, such as the height and width of the image and the number of colors it contains. If a format always stored images of the same size, type, and number of colors, a header wouldn't be necessary. The values for that format would simply be hard-coded into the reader.
As it is, most bitmap file formats have headers, and your reader must know the internal structure of the header of each format it is to read. A program that reads a single bitmap format may be able to get away with seeking to a known offset location and reading only a few fields of data. However, more sophisticated formats containing many fields of information require that you read the entire header.
Using the C language, you might be tempted to read in a header from a file using the following code:
    typedef struct _Header
    {
        DWORD FileId;
        BYTE  Type;
        WORD  Height;
        WORD  Width;
        WORD  Depth;
        CHAR  FileName[81];
        DWORD Flags;
        BYTE  Filler[32];
    } HEADER;

    HEADER header;
    FILE *fp = fopen("MYFORMAT.FOR", "rb");

    if (fp)
        fread(&header, sizeof(HEADER), 1, fp);
Here we see a typical bitmap file format header defined as a C language structure. The fields of the header contain information on the size, color, type of image, attribute flags, and name of the image file itself. The fields in the header range from one to 81 bytes in size, and the entire structure is padded out to a total length of 128 bytes.
The first potential gotcha may occur even before you read the file. It lies waiting for you in the fopen() function. If you don't indicate that you are opening the graphics file for reading as a binary file (by specifying the "rb" in the second argument of the fopen() parameter list), you may find that extra carriage returns and/or linefeeds appear in your data in memory that are not in the graphics file. This is because fopen() opens files in text mode by default.
In C++, you need to OR the ios::binary value into the mode argument of the fstream or ifstream constructor:
fstream *fs = new fstream ("MYFORMAT.FOR", ios::in|ios::binary);
After you have opened the graphics file successfully, the next step is to read the header. The code we choose to read the header in this example is the fread() function, which is most commonly used for reading chunks of data from a file stream. Using fread(), you can read the entire header with a single function call. A good idea, except that in using fread() you are likely to encounter problems. You guessed it, the second gotcha!
A common problem you may encounter when reading data into a structure is that of the boundary alignment of elements within the structure. On most machines, it is usually more efficient to align each structure element to begin on a 2-, 4-, 8-, or 16-byte boundary. Because aligning structure elements is the job of the compiler, and not the programmer, the effects of such alignment are not always obvious.
In our header, we added the Filler field by hand so that the structure ends on a 128-byte boundary. Compilers do much the same thing on their own: they add invisible padding to structures so that each element begins on the desired word boundary.
The padding takes the form of invisible elements that are inserted between the visible elements the programmer defines in the structure. Although this invisible padding is not directly accessible, it is as much a part of the structure as any visible element in the structure. For example, the following structure will be five, six, or eight bytes in size if it is compiled using a 1-, 2-, or 4-byte word alignment:
    typedef struct _test
    {
        BYTE  A;    /* One byte   */
        DWORD B;    /* Four bytes */
    } TEST;
With 1-byte alignment, there is no padding, and the structure is five bytes in size, with element B beginning on an odd-byte boundary. With 2-byte alignment, one byte of padding is inserted between elements A and B to allow element B to begin on the next even-byte boundary. With 4-byte alignment, three bytes of padding are inserted between A and B, allowing element B to begin on the next even-word boundary.
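One way to see the padding your own compiler inserts is the standard offsetof() macro from <stddef.h>. The sketch below makes two assumptions: BYTE and DWORD are defined here in terms of the C99 fixed-width types, and the helper names PaddingBeforeB() and PaddingAfterB() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  BYTE;      /* Assumed definitions for this sketch */
typedef uint32_t DWORD;

typedef struct
{
    BYTE  A;    /* One byte   */
    DWORD B;    /* Four bytes */
} TEST;

/* Number of invisible padding bytes the compiler put between A and B. */
size_t PaddingBeforeB(void)
{
    return offsetof(TEST, B) - sizeof(BYTE);
}

/* Number of invisible padding bytes after B, at the end of TEST. */
size_t PaddingAfterB(void)
{
    return sizeof(TEST) - offsetof(TEST, B) - sizeof(DWORD);
}
```

With 1-byte alignment PaddingBeforeB() returns zero. With the default settings of many 32- and 64-bit compilers it typically returns three, but the exact value depends on the compiler and its flags.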
Determining the Size of a Structure

The sizeof() operator gives the size of a structure, including any padding, and you can print the result at runtime:

    typedef struct _test
    {
        BYTE  A;
        DWORD B;
    } TEST;

    printf("TEST is %u bytes in length\n", (unsigned) sizeof(TEST));

Because most ANSI C compilers don't allow the use of sizeof() in a preprocessor directive such as #if, you can check the length of the structure at compile time by using a slightly more clever piece of code:
    /*
    ** Test if the size of TEST is five bytes or not. If not, the array
    ** type CHECKSIZEOFTEST will be declared to have a negative number
    ** of elements, and a compile-time error will be generated on all
    ** ANSI C compilers. (A size of zero is not used for the failure
    ** case, because some compilers accept zero-length arrays as an
    ** extension.) Note that the use of a typedef causes no memory to
    ** be allocated if the sizeof() test is true. And please, document
    ** all such tests in your code so other programmers will know what
    ** the heck you are attempting to do.
    */
    typedef char CHECKSIZEOFTEST[(sizeof(TEST) == 5) ? 1 : -1];
The gotcha here is that the fread() function will write data into the padding when you expected it to be written to an element. If you used fread() to read five bytes from a file into our 4-byte-aligned TEST structure, you would find that the first byte ended up correctly in element A, but that bytes 2, 3, and 4 were stored in the padding and not in element B as you had expected. Element B will instead store only byte 5, and the last three bytes of B will contain garbage.
There are several steps involved in solving this problem.
First, attempt to design a structure so each field naturally begins on a 2-byte (for 16-bit machines) or 4-byte (for 32-bit machines) boundary. Then, whether the compiler's byte-alignment flag is turned on or off, the layout of the structure does not change.
When defining elements within a structure, you also want to avoid using the INT data type. An INT is two bytes on some machines and four bytes on others. If you use INTs, you'll find that the size of a structure will change between 16- and 32-bit machines even though the compiler is not performing any word alignment on the structure. Always use SHORT to define a 2-byte integer element and LONG to specify a four-byte integer element, or use WORD and DWORD to specify their unsigned equivalents. Be aware that a LONG is itself eight bytes on many 64-bit systems; where the C99 header <stdint.h> is available, define these types in terms of its fixed-width types (int16_t, int32_t, uint16_t, and uint32_t).
When you read an image file header, you typically don't have the luxury of being the designer of the file's header structure. Your structure must exactly match the format of the header in the graphics file. If the designer of the graphics file format didn't think of aligning the fields of the header, then you're out of luck.
Second, compile the source code module that contains the structure with a flag indicating that structure elements should not be aligned (/Zp1 for Microsoft C++ and -a1 for Borland C++). Optionally, you can put the #pragma directive for this compiler option around the structure; the result is that only the structure is affected by the alignment restriction and not the rest of the module.
This, however, is not a terribly good solution. As we have noted, by aligning all structure fields on a 1-byte boundary, the CPU will access the structure data in memory less efficiently. If you are reading and writing the structure only once or twice, as is the case with many file format readers, you may not care how quickly the header data is read.
You must also make sure that whenever anybody compiles your source code, they use the 1-byte structure alignment compiler flag. Depending on which machine is executing your code, failure to use this flag may cause problems reading image files. Naming conventions may also differ for #pragma directives between compiler vendors; on some compilers, the byte-alignment #pragma directives might not be supported at all.
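As an illustration of limiting the alignment change to a single structure, the sketch below uses the #pragma pack(push, 1) and #pragma pack(pop) spelling, which Microsoft, GCC, and Clang compilers accept. Treat that spelling as an assumption about your toolchain: other compilers may spell the directive differently or not support it at all. The type name PACKEDTEST is hypothetical, and BYTE and DWORD are defined here in terms of the C99 fixed-width types.

```c
#include <stdint.h>

typedef uint8_t  BYTE;      /* Assumed definitions for this sketch */
typedef uint32_t DWORD;

#pragma pack(push, 1)       /* Save current setting, then pack to 1 byte */
typedef struct
{
    BYTE  A;    /* One byte                       */
    DWORD B;    /* Four bytes, now with no padding before it */
} PACKEDTEST;
#pragma pack(pop)           /* Restore the previous alignment setting */

/*
** Compile-time check that the packing request was honored. If the
** structure is not exactly five bytes, the array size is negative
** and the compiler reports an error.
*/
typedef char CHECKPACKEDTEST[(sizeof(PACKEDTEST) == 5) ? 1 : -1];
```

Pairing the pragma with a compile-time size check means that a compiler which silently ignores the directive produces a build error rather than a reader that misreads every file.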
Finally, we must face the third insidious gotcha: the native byte order of the CPU. If you attempt to fread() a graphics file header containing data written in the little-endian byte order on a big-endian machine (or big-endian data in a file on a little-endian machine), you will get nothing but byte-twiddled garbage. The fread() function cannot perform the byte-conversion operations necessary to read the data correctly, because it simply copies bytes from the file into memory in the order in which they are stored; the CPU then interprets those bytes using its own native byte order.
At this point, if you are thinking, "But, I'm not going to read in each header field separately!" you are in for a rather rude change of your implementation paradigm!
Reading each field of a graphics file header into the elements of a structure, and performing the necessary byte-order conversions, is how it's done. If you are worried about efficiency, just remember that a header is usually read from a file and into memory only once, and you are typically reading less than 512 bytes of data--in fact, typically much less than that. We doubt if the performance meter in your source code profiler will show much of a drop.
So, how do we read in the header fields one element at a time? We could go back to our old friend fread():
    HEADER header;

    /* Read the fields in the order in which they are stored in the file */
    fread(&header.FileId,   sizeof(header.FileId),   1, fp);
    fread(&header.Type,     sizeof(header.Type),     1, fp);
    fread(&header.Height,   sizeof(header.Height),   1, fp);
    fread(&header.Width,    sizeof(header.Width),    1, fp);
    fread(&header.Depth,    sizeof(header.Depth),    1, fp);
    fread(&header.FileName, sizeof(header.FileName), 1, fp);
    fread(&header.Flags,    sizeof(header.Flags),    1, fp);
    fread(&header.Filler,   sizeof(header.Filler),   1, fp);
While this code reads in the header data and stores it in the structure correctly (regardless of any alignment padding), fread() still reads the data in the native byte order of the machine on which it is executing. This is fine if you are reading big-endian data on a big-endian machine, or little-endian data on a little-endian machine, but not if the byte order of the machine is different from the byte order of the data being read. It seems that what we need is a filter that can convert data to a different byte order.
If you have ever written code that diddled the byte order of data, then you have probably written a set of SwapBytes functions to exchange the position of bytes within a word of data. Your functions probably looked something like this:
    /*
    ** Swap the bytes within a 16-bit WORD.
    */
    WORD SwapTwoBytes(WORD w)
    {
        register WORD tmp;

        tmp = (w & 0x00FF);
        tmp = ((w & 0xFF00) >> 0x08) | (tmp << 0x08);
        return(tmp);
    }

    /*
    ** Swap the bytes within a 32-bit DWORD.
    */
    DWORD SwapFourBytes(DWORD w)
    {
        register DWORD tmp;

        tmp = (w & 0x000000FF);
        tmp = ((w & 0x0000FF00) >> 0x08) | (tmp << 0x08);
        tmp = ((w & 0x00FF0000) >> 0x10) | (tmp << 0x08);
        tmp = ((w & 0xFF000000) >> 0x18) | (tmp << 0x08);
        return(tmp);
    }
Because words come in two sizes, you need two functions: SwapTwoBytes() and SwapFourBytes()--for those of you in the C++ world, you'll just write two overloaded functions, or a function template, called SwapBytes(). Of course you can swap signed values just as easily by writing two more functions that substitute the data types SHORT and LONG for WORD and DWORD.
Using our SwapBytes functions, we can now read in the header as follows:
    HEADER header;

    /* Read the fields in file order, swapping the multi-byte values */
    fread(&header.FileId, sizeof(header.FileId), 1, fp);
    header.FileId = SwapFourBytes(header.FileId);
    fread(&header.Type, sizeof(header.Type), 1, fp);
    fread(&header.Height, sizeof(header.Height), 1, fp);
    header.Height = SwapTwoBytes(header.Height);
    fread(&header.Width, sizeof(header.Width), 1, fp);
    header.Width = SwapTwoBytes(header.Width);
    fread(&header.Depth, sizeof(header.Depth), 1, fp);
    header.Depth = SwapTwoBytes(header.Depth);
    fread(&header.FileName, sizeof(header.FileName), 1, fp);
    fread(&header.Flags, sizeof(header.Flags), 1, fp);
    header.Flags = SwapFourBytes(header.Flags);
    fread(&header.Filler, sizeof(header.Filler), 1, fp);
We can read in the data using fread() and can swap the bytes of the WORD and DWORD-sized fields using our SwapBytes functions. This is great if the byte order of the data doesn't match the byte order of the CPU, but what if it does? Do we need two separate header-reading functions, one with the SwapBytes functions and one without, to ensure that our code will work on most machines? And, how do we tell at runtime what the byte order of a machine is? Take a look at this example:
    #define LSB_FIRST 0
    #define MSB_FIRST 1

    /*
    ** Check the byte-order of the CPU.
    */
    int CheckByteOrder(void)
    {
        SHORT w = 0x0001;
        CHAR *b = (CHAR *) &w;

        return(b[0] ? LSB_FIRST : MSB_FIRST);
    }
The function CheckByteOrder() returns the value LSB_FIRST if the machine is little-endian (the little end comes first) and MSB_FIRST if the machine is big-endian (the big end comes first). This function will work correctly on all big- and little-endian machines. Its return value is undefined for middle-endian machines (like the PDP-11).
Let's assume that the data format of our graphics file is little-endian. We can check the byte order of the machine executing our code and can call the appropriate reader function, as follows:
    int byteorder = CheckByteOrder();

    if (byteorder == LSB_FIRST)
        ReadHeaderAsLittleEndian();
    else
        ReadHeaderAsBigEndian();
The function ReadHeaderAsLittleEndian() would contain only the fread() functions, and ReadHeaderAsBigEndian() would contain the fread() and SwapBytes() functions.
But this is not very elegant. What we really need is a replacement for both the fread() and SwapBytes functions that can read WORDs and DWORDs from a data file, making sure that the returned data is in the byte order we specify. Consider the following functions:
    /*
    ** Get a 16-bit word in either big- or little-endian byte order.
    */
    WORD GetWord(char byteorder, FILE *fp)
    {
        register WORD w;

        if (byteorder == MSB_FIRST)
        {
            w = (WORD) (fgetc(fp) & 0xFF);
            w = ((WORD) (fgetc(fp) & 0xFF)) | (w << 0x08);
        }
        else    /* LSB_FIRST */
        {
            w = (WORD) (fgetc(fp) & 0xFF);
            w |= ((WORD) (fgetc(fp) & 0xFF) << 0x08);
        }
        return(w);
    }

    /*
    ** Get a 32-bit word in either big- or little-endian byte order.
    */
    DWORD GetDword(char byteorder, FILE *fp)
    {
        register DWORD w;

        if (byteorder == MSB_FIRST)
        {
            w = (DWORD) (fgetc(fp) & 0xFF);
            w = ((DWORD) (fgetc(fp) & 0xFF)) | (w << 0x08);
            w = ((DWORD) (fgetc(fp) & 0xFF)) | (w << 0x08);
            w = ((DWORD) (fgetc(fp) & 0xFF)) | (w << 0x08);
        }
        else    /* LSB_FIRST */
        {
            /* Assign the first byte; w has no value before this point */
            w = (DWORD) (fgetc(fp) & 0xFF);
            w |= (((DWORD) (fgetc(fp) & 0xFF)) << 0x08);
            w |= (((DWORD) (fgetc(fp) & 0xFF)) << 0x10);
            w |= (((DWORD) (fgetc(fp) & 0xFF)) << 0x18);
        }
        return(w);
    }
The GetWord() and GetDword() functions will read a word of data from a file stream in either byte order (specified in their first argument). Valid values are LSB_FIRST and MSB_FIRST.
Now, let's look at what reading a header is like using the GetWord() and GetDword() functions. Notice that we now read in the single-byte field Type using fgetc() and that fread() is still the best way to read in blocks of byte-aligned data:
    HEADER header;
    int byteorder = LSB_FIRST;  /* Byte order of the data in the file, */
                                /* not the byte order of the CPU       */

    /* Read the fields in the order in which they are stored in the file */
    header.FileId = GetDword(byteorder, fp);
    header.Type   = (BYTE) fgetc(fp);
    header.Height = GetWord(byteorder, fp);
    header.Width  = GetWord(byteorder, fp);
    header.Depth  = GetWord(byteorder, fp);
    fread(&header.FileName, sizeof(header.FileName), 1, fp);
    header.Flags  = GetDword(byteorder, fp);
    fread(&header.Filler, sizeof(header.Filler), 1, fp);
All we need to do now is to pass the byte order of the data being read to the GetWord() and GetDword() functions. The data is then read correctly from the file stream regardless of the native byte order of the machine on which the functions are executing.
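The same byte-at-a-time assembly also works on data that is already in memory, for example a header read into a byte array with a single fread(). The following is a minimal sketch of that variation, not part of the code above: the names GetWordFromBuffer() and GetDwordFromBuffer() are hypothetical, and the C99 fixed-width types stand in for WORD and DWORD.

```c
#include <stdint.h>

#define LSB_FIRST 0
#define MSB_FIRST 1

/* Assemble a 16-bit value from two bytes stored in the given order. */
uint16_t GetWordFromBuffer(int byteorder, const unsigned char *p)
{
    if (byteorder == MSB_FIRST)
        return (uint16_t) ((p[0] << 8) | p[1]);
    return (uint16_t) ((p[1] << 8) | p[0]);
}

/* Assemble a 32-bit value from four bytes stored in the given order. */
uint32_t GetDwordFromBuffer(int byteorder, const unsigned char *p)
{
    if (byteorder == MSB_FIRST)
        return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
               ((uint32_t) p[2] <<  8) |  (uint32_t) p[3];
    return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16) |
           ((uint32_t) p[1] <<  8) |  (uint32_t) p[0];
}
```

Given the bytes 0x12 0x34 0x56 0x78, GetDwordFromBuffer() returns 0x12345678 for MSB_FIRST and 0x78563412 for LSB_FIRST. Because the values are built with shifts rather than by reinterpreting memory, the result does not depend on the byte order of the machine running the code.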
The techniques we've explored for reading a graphics file header can also be used for reading other data structures in a graphics file, such as color maps, page tables, scan-line tables, tags, footers, and even pixel values themselves.
In most cases, you will not find any surprises when you read image data from a graphics file. Compressed image data is normally byte-aligned and is simply read one byte at a time from the file and into memory before it is decompressed. Uncompressed image data is often stored only as bytes, even when the pixels are two, three, or four bytes in size.
You will also usually use fread() to read a block of compressed data into a memory buffer that is typically 8K to 32K in size. The compressed data is then read from memory a single byte at a time and decompressed, and the raw data is written either to video memory for display or to a bitmap array for processing and analysis.
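The buffering pattern described here might look like the following sketch. The 16K buffer size is simply one value from the range mentioned above, and the function name SumBytesBuffered() is hypothetical; it adds up the bytes only as a stand-in for a real decompressor, which would consume them in the same inner loop.

```c
#include <stdio.h>

#define BUFFER_SIZE 16384   /* 16K, an arbitrary choice for this sketch */

/*
** Read a stream in BUFFER_SIZE chunks and process the buffered data
** one byte at a time. Here the "processing" just sums the bytes; a
** real reader would feed them to its decompressor instead.
*/
unsigned long SumBytesBuffered(FILE *fp)
{
    static unsigned char buffer[BUFFER_SIZE];
    unsigned long sum = 0;
    size_t count, i;

    /* fread() returns the number of bytes actually read; 0 ends the loop */
    while ((count = fread(buffer, 1, sizeof(buffer), fp)) > 0)
        for (i = 0; i < count; i++)     /* One byte at a time from memory */
            sum += buffer[i];
    return sum;
}
```

Using the return value of fread() as the loop bound handles the final, partially filled buffer without any special case.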
Many bitmap file formats specify that scan lines (or tiles) of 1-bit image data should be padded out to the next byte boundary. This means that if the width of an image is not a multiple of eight, then you probably have a few extra zeroed bits tacked onto the end of each scan line (or the end and/or bottom of each tile). For example, a 1-bit image with a width of 28 pixels will contain 28 bits of scan-line data followed by four bits of padding, creating a scan line 32 bits in length. The padding allows the next scan line to begin on a byte boundary, rather than in the middle of a byte.
You must determine whether the uncompressed image data contains scan-line padding. The file format specification will tell you if padding exists. Usually, the padding is loaded into display memory with the image data, but the size of the display window (the part of display memory actually visible on the screen) must be adjusted so that the padding data is not displayed.
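The arithmetic for byte-boundary padding is short enough to sketch directly. The function names BytesPerScanLine() and PaddingBits() are hypothetical, and the code assumes padding to the next byte boundary only; some formats pad scan lines to larger boundaries, so check the specification of the format you are reading.

```c
/* Bytes needed to hold one scan line, rounded up to a byte boundary. */
unsigned long BytesPerScanLine(unsigned long width, unsigned bitsPerPixel)
{
    return (width * bitsPerPixel + 7) / 8;
}

/* Padding bits appended to each scan line to reach that boundary. */
unsigned PaddingBits(unsigned long width, unsigned bitsPerPixel)
{
    return (unsigned) (BytesPerScanLine(width, bitsPerPixel) * 8
                       - width * bitsPerPixel);
}
```

For the 28-pixel, 1-bit image in the example above, BytesPerScanLine(28, 1) returns 4 and PaddingBits(28, 1) returns 4, matching the 32-bit scan line described in the text.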
An excellent article concerning the solution to byte-order conversion problems appeared in the now-extinct magazine, C Gazette. It was written by a fellow named James D. Murray!
Murray, James D., "Which Endian is Up?" C Gazette, Summer, 1990.
Copyright © 1996, 1994 O'Reilly & Associates, Inc. All Rights Reserved.