Elements of a Graphics File

As mentioned in the Preface, different file format specifications use different terminology. It's possible that not a single term has a common meaning across all of the file formats mentioned in this book. This is certainly true for terms referring to the way data is stored in a file: terms such as field, tag, block, and packet. A specification will sometimes even provide a definition for one of these terms and then abandon it in favor of a more descriptive one, such as chunk, sequence, or record.

For purposes of discussion in this book, we will consider a graphics file to be composed of a sequence of data and data structures, called file elements or data elements. These are divided into three categories: fields, tags, and streams.

Fields

A field is a data structure of fixed size in a graphics file. A fixed field has not only a fixed size but also a fixed position within the file. The location of a field is communicated by specifying either an absolute offset from a landmark in the file, such as the file's beginning or end, or a relative offset from some other data. The size of a field either is stated in the format specification or can be inferred from other information.
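To make the idea of locating a field by offset concrete, here is a minimal Python sketch. The file layout, the field names, and the helper read_field are all invented for illustration and do not correspond to any real format. It reads two fields at absolute offsets from the start of the file and one at an offset measured from the end.

```python
import io
import struct

# Hypothetical layout, invented for this example (not a real format):
#   offset 0       : 4-byte identifier
#   offset 4       : 2-byte width, little-endian unsigned
#   offset 6       : 2-byte height, little-endian unsigned
#   last 4 bytes   : 4-byte trailer value, located relative to the file's end

def read_field(f, offset, fmt, whence=io.SEEK_SET):
    """Seek to a known offset and unpack one fixed-size field."""
    size = struct.calcsize(fmt)   # the field's size follows from its format
    f.seek(offset, whence)        # jump directly; no intervening data is read
    data = f.read(size)
    if len(data) != size:
        raise ValueError("file too short for the requested field")
    return struct.unpack(fmt, data)[0]

# Build a small in-memory "file" that follows the hypothetical layout.
sample = io.BytesIO(
    b"DEMO"
    + struct.pack("<HH", 640, 480)
    + b"\x00" * 8                  # unrelated data between the fields
    + struct.pack("<I", 0xCAFE)
)

width = read_field(sample, 4, "<H")                    # absolute, from the start
height = read_field(sample, 6, "<H")                   # absolute, from the start
trailer = read_field(sample, -4, "<I", io.SEEK_END)    # measured from the end
```

Because each position is known in advance, the order of the three reads does not matter; the trailer could just as well be read first.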

Streams

Fields and tags are an aid to random access; they're designed to help a program quickly access a data item known in advance. Once a position in a file is known, a program can access the position directly without having to read intervening data. A file that organizes data as a stream, on the other hand, lacks the structure of one organized into fields and tags and must be read sequentially. For our purposes, we will consider a stream to be made up of packets, which can vary in size, are sub-elements of the stream, and are meaningful to the program reading the file. Although the beginning and end of the stream may be known and specified, the location of packets other than the first usually is not, at least prior to the time of reading.
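The sequential nature of a stream can be sketched in a few lines of Python. The packet layout assumed here (a one-byte length followed by that many bytes of payload, with a zero length marking the end of the stream) is a hypothetical convention chosen only to keep the example short, and read_packets is our own illustrative name.

```python
import io

def read_packets(stream):
    """Yield packet payloads one at a time, in the order they appear."""
    while True:
        header = stream.read(1)
        if not header:
            raise ValueError("stream ended without a terminator")
        length = header[0]
        if length == 0:            # assumed end-of-stream marker
            return
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated packet")
        yield payload

# Two packets of different sizes, followed by the terminator.
data = bytes([3]) + b"abc" + bytes([1]) + b"z" + bytes([0])
packets = list(read_packets(io.BytesIO(data)))
```

Note that the position of the second packet cannot be computed until the length of the first has been read, which is why the reader has no choice but to walk the stream from its beginning.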

Combinations of Data Elements

You can imagine, then, pure fixed field files, pure tag files, and pure stream files, made up entirely of data organized into fixed fields, tags, and streams, respectively. Only rarely, however, does a file contain data elements of a single type; in most cases it is a combination of two or more. The TIFF and TGA formats, for example, use both tags and fixed fields. GIF format files, on the other hand, use both fixed fields and streams.
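As a rough illustration of such a combination, the following Python sketch reads a file that begins with fixed fields and continues with a stream of packets. The layout is hypothetical (it is loosely in the spirit of formats that pair a fixed header with streamed data, but it is not the layout of GIF or any other real format), and read_hybrid is an invented name.

```python
import io
import struct

# Hypothetical fixed header: 4-byte identifier, 2-byte width, 2-byte height.
HEADER_FMT = "<4sHH"
HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 8 bytes

def read_hybrid(f):
    """Read the fixed fields by position, then the packets in sequence."""
    f.seek(0)
    raw = f.read(HEADER_SIZE)
    if len(raw) != HEADER_SIZE:
        raise ValueError("file too short for the header")
    ident, width, height = struct.unpack(HEADER_FMT, raw)

    # The stream starts right after the header. Each packet is assumed to be
    # a 1-byte length plus payload; a zero length ends the stream.
    packets = []
    while True:
        n = f.read(1)
        if not n or n[0] == 0:
            break
        payload = f.read(n[0])
        if len(payload) != n[0]:
            raise ValueError("truncated packet")
        packets.append(payload)
    return ident, width, height, packets

blob = (
    struct.pack(HEADER_FMT, b"DEMO", 320, 200)
    + bytes([2]) + b"hi"
    + bytes([4]) + b"data"
    + bytes([0])
)
ident, width, height, packets = read_hybrid(io.BytesIO(blob))
```

The header can be decoded with a single read because its size and position are fixed, while the packets that follow must still be consumed one after another.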

Fixed-field data is usually faster and easier to read than tag or stream data. Files composed primarily of fixed-field data, however, are less flexible in situations in which data needs to be added to or deleted from an existing file. Formats containing fixed fields are seldom easily upgraded. Stream data generally requires less memory to read and buffer than field or tag data. Files composed primarily of stream data, however, cannot be accessed randomly, and thus cannot be used to find or sub-sample data quickly. These considerations are discussed further in Chapters 3, 4, and 5.