NCSA HDF Specifications
DRAFT

January 1993


University of Illinois at Urbana--Champaign


                          Introduction


Overview

The Hierarchical Data Format (HDF) was designed to make the sharing
of scientific data between different people, different projects,
and different types of computers easy and self-describing. An
extensible header, along with carefully crafted internal layers,
provides a system that can grow along with the software that NCSA
develops. This chapter provides a brief overview of HDF
capabilities and design.


Why HDF?

A fundamental requirement of scientific data management is the
ability to access as much information in as many ways and as
quickly and easily as possible. To make this possible, there needs
to be a data storage and retrieval system that facilitates these
capabilities. Specific needs of such a system include the
following.

*  Support for scientific data and metadata. Scientific data is
   characterized by a variety of different data types and
   representations, data sets (including images) that can be
   extremely large and complex, and the need to attach accompanying
   attributes, parameters, notebooks, and other metadata.

*  Support for a range of hardware platforms. Data can originate
   on one machine, only to be used later on many different
   machines. Scientists must be able to access data and metadata
   on as many hardware platforms as possible.

*  Support for a range of software tools. Scientists need a variety
   of software tools and utilities for easily searching, analyzing,
   archiving, and transporting the data and metadata. These tools
   range from a library of routines for reading and writing data
   and metadata to small utilities that simply display an image on
   a console, to full-blown database retrieval systems that provide
   multiple views of thousands of sets of data and metadata.

*  Rapid data transfer. Both the size and the dispersion of
   scientific data sets require that mechanisms must exist to get
   the data from place to place rapidly.

*  Extendibility. As new types of information are generated and new
   kinds of science are done, a means must be provided to support
   them.


What is HDF?

The structure of HDF. HDF is a self-describing extensible file
format based on the use of tagged objects that have standard
meanings. The idea is to store both a known format description and
the data in the same file. HDF tags describe the format of the data
in the sense that each tag is assigned a specific meaning--one tag
is assigned to "Color Palette," another is assigned to "Raster
Image," and so on (see Figure 1). A program that has been written
to understand a certain list of tag types can scan the file for
those tag types and process the data. This program also can ignore
any data that is beyond its scope.

The set of available data objects encompasses both primary and
secondary data (metadata). Most HDF objects are machine- and
medium-independent, physical representations of data and metadata.

HDF Tags. HDF is designed with the assumption that we cannot know
a priori what types of data objects will be needed in the future,
nor can we know how scientists will want to view their data. As new
science is done, new types of data objects are needed, and new tags
must be created. In order to avoid unnecessary proliferation of
tags, and to ensure that all tags are available to potential users
who need to share data, a portable public domain library is
available that interprets all public tags. The library contains
user interfaces designed to provide views of the data that are most
natural for users. As we learn more about the way scientists need
to view their data, we can add user interfaces that reflect data
models consistent with those views.

Types of data and structures. HDF currently supports the most
common types of data and metadata that scientists use, including
multidimensional gridded data, 2D and 3D raster images, polygonal
mesh data, multivariate datasets, sparse matrices, finite-element
data, splines, non-Cartesian coordinate data, and text. In the
future there will almost certainly be a need to incorporate new
types of data, such as voice and video, some of which might
actually be stored on other media than the central file itself. In
this sense, it may be desirable to employ the concept of a "virtual
file", which functions like a file, but doesn't fit our normal
notion of a file as a monolithic sequence of bits stored entirely
on a disk or tape somewhere.

HDF also makes it possible for the user to include annotations,
titles, and specific descriptions of the data in the file, so that
files can be archived with human-readable information about the
data and its origins.

One collection of HDF tags supports a hierarchical grouping
structure called vset that allows scientists to organize data
objects within HDF files to fit their views of how the objects go
together, much as a person in an office or laboratory organizes
information in folders, drawers, journal boxes, and on their
desktops.

Figure 0.1 Raster Image Sets in an HDF File

Backward and forward compatibility. An important goal of HDF is to
maximize backward and forward compatibility among its interfaces.
This is not always achievable, because changes sometimes have to
be made to the way data is organized in order to enhance
performance, to correct errors, or for other reasons. However,
whenever possible, HDF files should not become out of date. For
example, suppose a site falls far behind in the HDF standard, so
its users can only work with the portions of the specification that
are three years old. Users at this site might produce files with
their old HDF software, then read them with newer software designed
to work with more advanced data files. The newer software should
still be able to read the old files.

Conversely, if the site receives files that contain objects that
its HDF software does not understand, it should still be able to
list the types of data in the file, and it should still be able to
access all of the older types of data objects that it understands,
despite the fact that the older types of data objects are mixed in
with new kinds of data. In addition, if the more advanced site uses
the text annotation facilities of HDF effectively, the files will
arrive

Appendix A, "NCSA HDF Tags," presents a list of brief descriptions
of the tags assigned at NCSA for general use.

Appendix B, "Header Files," includes the general header files used
in compiling all HDF libraries.


Form of Presentation

The material in this manual is presented as text or as screen
displays.


Text

In explaining various features and commands, this manual often
presents a word within a paragraph in italics to indicate that the
word is defined within the paragraph.

Portions of this manual refer to other portions of the manual where
the other portions explain related topics. These cross references
usually mention the title of sections or chapters enclosed in
quotation marks, for example: See Chapter 1, "The Basic Structure of
HDF Files."


Screen Displays

Screen displays in this manual are presented in Courier type.

long process of redesigning the lower layers of HDF began. As of
this writing, in Summer 1992, we are about to release the first
version of HDF that incorporates the new lower layers of HDF.


Use of This Manual

This manual is designed for software developers who are designing
applications or routines for use with HDF files and for users who
need detailed information about HDF. Users who are interested in
using HDF to store or manipulate their data do not normally need
the kind of detail presented in this manual. They should instead
consult a user manual, such as "HDF Calling Interfaces and
Utilities," "HDF Vset", or perhaps a manual having to do with
software that uses HDF.


Manual Contents

The manual is organized into the following chapters:

Chapter 1, "The Basic Structure of HDF Files," introduces and
describes the components and organization of Hierarchical Data
Format files.

Chapter 2, "HDF Software Overview," describes the organization of
the software layers that make up the basic HDF library.

Chapter 3, "The NCSA HDF General Purpose Interface," describes the
HDF modules that make up the general purpose HDF routines,
sometimes referred to as the lower layer of HDF.

Chapter 4, "Sets and Groups," explains the role of sets and groups
in an HDF file. It contains descriptions of raster image sets,
scientific datasets, and Vsets. Vsets are covered in more detail
in another chapter.

Chapter 5, "Annotations," explains how annotations are currently
organized in HDF files.

Chapter 6, "Number Conversion," describes the HDF module that is
used for number conversion.

Chapter 7, "Vsets," describes the structure and functioning of the
Vset module.

Chapter 8, "Portability," describes techniques and conventions used
in the HDF code to achieve portability.

Chapter 9, "HDF Conventions," presents guidelines regarding the use
of HDF that are not discussed elsewhere.


                        Table of Contents


Introduction
     Overview
     Why HDF
     What Is HDF
     Some History
     Use of This Manual

Chapter 1 The Basic Structure of HDF Files
     Chapter Overview
     File Header
     Data Object
     Physical Organization of HDF Files
     Sample HDF File

Chapter 2 Software Overview
     Chapter Overview
     Software Layers
     Organization of HDF Software
     Some HDF Conventions

Chapter 3 The NCSA HDF General Purpose Interface
     Chapter Overview
     Introduction
     Overview of the interface
     Function Specifications

Chapter 4 Sets and Groups
     Chapter Overview
     Sets
     Groups
     Raster Image Sets
     Scientific Datasets
     Vsets and Vdatas
     Appendix: The Raster-8 Set

Chapter 5 Annotations
     Chapter Overview
     Types of Annotations
     File Annotations
     Object Annotations
     Getting Reference Numbers for Object Annotations

Chapter 6 Tag Specifications
     Overview
     The HDF Tag Space
     Physical Storage Methods
     Specifications for Supported Tags

Chapter 7 Making HDF Portable
     Chapter Overview
     The HDF Environment
     Organization of Source Files
     Passing Strings Between FORTRAN and C
     Function Return Values between FORTRAN and C
     Differences in Acceptable Routine Names
     ANSI C vs. Old C
     Type Differences
     Access to Library Functions


Figures and Tables

Figure 0.1   Raster Image Sets in an HDF File
Figure 1.1   Three Data Objects
Figure 1.2   A Data Descriptor
Figure 1.3   Model of a Data Descriptor Block
Figure 1.4   Sample Data Descriptor Block
Figure 1.5   Physical Representation of Data Objects
Figure 2.1   HDF software layers
Figure 4.1   Physical organization of Sample RIG Groupings
Figure 5.1   Three SDS Tags with Their Ref Numbers
Figure 5.2   Displayed Example of SDS, Ref #, and Annotation
Figure 6.1   Description Record for a Linked Block Element
Figure 6.2   A Linked Block Table
Figure 6.3   A Data Block
Figure 6.4   Description Record for an External Element
Figure 7.1   Illustration of the sequence of actions involved when
               a FORTRAN call includes a string as a parameter

Table 1.1    Parts of a Data Descriptor
Table 1.2    Summary of the Relationships among Parts of an HDF
               File
Table 1.3    Sample Data Objects in an HDF File
Table 2.1    HDF 3.2 source code modules
Table 4.1    Tags for Raster Image Sets
Table 4.2    Additional tags for Raster Image Sets
Table 4.3    Required tags for SDG
Table 4.4    Optional Tags for SDG
Table 4.5    Required tags for NDG
Table 4.6    Optional Tags for NDG
Table 4.7    Required Tags for NDG structure that is compatible
               with SDG structure
Table 4.8    Tags for Raster-8 Sets
Table 5.1    HDF Annotation tags
Table 6.1    Number Type Values
Table 6.2    Possible Machine Types
Table 6.3    Possible Tag Types in an RIG
Table 6.4    Color Format String Values
Table 6.5    Possible Tag Types in an NDG
Table 6.6    Possible calibrated data types
Table 6.7    Possible Tag Types in an SDG
Table 6.9    Scientific Data Dimension Record Fields


Chapter 1 The Basic Structure of HDF Files

Chapter Overview
File Header
Data Object
     Data Descriptor
     DD Blocks
     Data Element
Naming and Assigning Tags
Physical Organization of HDF Files
Sample HDF File


Chapter Overview

This chapter introduces and describes the components and
organization of Hierarchical Data Format (HDF) files.


File Header

The first component of an HDF file is the file header (FH), which
takes up the first four bytes in an HDF file. The file header is
a signature that indicates that the file is an HDF file.
Specifically, it is a 32-bit magic number with the hexadecimal
value 0e031301, that is, the four bytes 0e, 03, 13, and 01 (the
control characters ^N, ^C, ^S, and ^A) in that order.

NOTE: HDF assumes big-endian order in reading and writing files.
On some machines the order of bytes in the file header might be
swapped when the header is written to an HDF file, causing the
four header bytes to be written in little-endian order. To maintain
portability of HDF files when developing software for such
machines, you should counteract this byte-swapping by making sure
the bytes are read and written in the exact order shown.
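
One way to follow this advice is to compare the header one byte at
a time instead of reading it as a native 32-bit integer. The sketch
below assumes the first bytes of the file are already in memory; the
names HDF_MAGIC and is_hdf_header are hypothetical and are not part
of the HDF library.

```c
#include <stddef.h>
#include <string.h>

/* The four file header bytes in file order (hexadecimal 0e031301). */
static const unsigned char HDF_MAGIC[4] = { 0x0e, 0x03, 0x13, 0x01 };

/* Hypothetical helper: returns 1 if buf (len bytes long) begins with
 * the HDF file header, 0 otherwise. The comparison is done byte by
 * byte, so the byte order of the host machine does not matter. */
int is_hdf_header(const unsigned char *buf, size_t len)
{
    return len >= 4 && memcmp(buf, HDF_MAGIC, 4) == 0;
}
```

Because no multi-byte integer is ever formed, the same check behaves
identically on big-endian and little-endian machines.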


Data Object

The basic building block in an HDF file is the data object, which
contains both data and information about the data. A data object
has two parts: a 12-byte data descriptor (DD) and a data element.
Figure 1.1 shows three examples of data objects.

As the names imply, the data descriptor gives information about the
data, and the data element is the data itself. In other words, all
data in an HDF file has attached to it information about itself.
In this sense, HDF files are examples of self-describing files.

ED. NOTE:  Figures are not available in this plain text version
of the specification.

Figure 1.1 Three Data Objects


Data Descriptor (DD)

A data descriptor (DD) has four fields: a 16-bit tag, a 16-bit
reference number, a 32-bit data offset, and a 32-bit data length.
These parts of a DD are depicted in Figure 1.2 and are briefly
described in Table 1.1. Explanations of each part appear in the
paragraphs following Table 1.1.

Figure 1.2 A Data Descriptor

Table 1.1  Parts of a Data Descriptor

Part                Description

tag                 designates the type of data in a data element
reference number    uniquely distinguishes corresponding data
                    element from others with the same tag
data identifier     tag/ref; uniquely identifies data element
offset              byte offset of corresponding data element
length              length of data element


Tag

A tag is the part of a data descriptor that tells what kind of data
is contained in the corresponding data element. A tag is actually
a 16-bit unsigned integer between 1 and 65535, but every tag is
also usually given a name that programs can refer to instead of the
number. If a DD has no corresponding data element, the value of its
tag is DFTAG_NULL, indicating that no data is present. A tag may
never be zero.

Tags are assigned by NCSA as part of the specification of HDF. The
following ranges are to be used to guide tag assignment:

00001 - 32767  reserved for NCSA use
32768 - 64999  user-definable
65000 - 65535  reserved for expansion of the format

Appendix A contains full specifications for all currently supported
NCSA HDF tags. Appendix B, "Assigned Tag Numbers," contains the
current number assignments. See the section "Some HDF Conventions"
in the chapter "Software Overview" for more information on
allocating tags.
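
The ranges above can be expressed as a small classification routine.
This is only a sketch of the rule as stated here; the enum and
function names are hypothetical and do not come from the HDF
library.

```c
#include <stdint.h>

/* Hypothetical labels for the tag ranges listed in the text. */
enum tag_class { TAG_INVALID, TAG_NCSA, TAG_USER, TAG_EXPANSION };

/* Classifies a 16-bit tag according to the assignment ranges. */
enum tag_class classify_tag(uint16_t tag)
{
    if (tag == 0)     return TAG_INVALID;   /* a tag may never be zero */
    if (tag <= 32767) return TAG_NCSA;      /* 1 - 32767             */
    if (tag <= 64999) return TAG_USER;      /* 32768 - 64999         */
    return TAG_EXPANSION;                   /* 65000 - 65535         */
}
```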


Reference Number

For each occurrence of a tag in an HDF file, a unique reference
number is stored with the tag in the data descriptor. Reference
numbers are 16-bit unsigned integers.

Reference numbers are not necessarily assigned consecutively, so
you cannot assume that the actual value of a reference number has
any meaning beyond providing a way of distinguishing among objects
with the same tag.


Data Identifier

The combination of a tag and its reference number uniquely
identifies the corresponding data object in the file. For this
reason, the tag/ref combination is sometimes referred to as a data
identifier.


Data Offset and Length

The data offset reflects the byte position of the corresponding
data element from the start of the file. The length gives the
number of bytes occupied by the data element. Offset and length are
both 32-bit unsigned integers.
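
As an illustration of the four fields, the following sketch decodes
one 12-byte DD from raw file bytes, assuming the big-endian order
that HDF uses in files. The struct and function names are
hypothetical; the HDF library has its own internal representation.

```c
#include <stdint.h>

/* Hypothetical in-memory form of one data descriptor. */
struct dd {
    uint16_t tag;     /* kind of data in the data element       */
    uint16_t ref;     /* distinguishes objects with the same tag */
    uint32_t offset;  /* byte position of the data element       */
    uint32_t length;  /* number of bytes in the data element     */
};

/* Read big-endian integers one byte at a time (host independent). */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decodes the 12 bytes of a DD: tag, ref, offset, length. */
struct dd decode_dd(const unsigned char raw[12])
{
    struct dd d;
    d.tag    = be16(raw);
    d.ref    = be16(raw + 2);
    d.offset = be32(raw + 4);
    d.length = be32(raw + 8);
    return d;
}
```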


DD Blocks

Data descriptors are stored physically in a linked list of blocks
called data descriptor blocks, or DD blocks. The individual
components of a data descriptor block are depicted in Figure 1.3.
All of the DDs in a DD block are assumed to contain significant
data unless they have a tag that is equal to DFTAG_NULL (no data).

In addition to its DDs, each data descriptor block has a data
descriptor header (DDH). The DDH has two fields--a block size field
and a next block field. The block size field is a 16-bit unsigned
integer that indicates the number of DDs in the following DD block.
The next block field is a 32-bit unsigned integer giving the offset
of the next DD block, if there is one. The last DDH in the list
contains a 0 in its next block field.
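
The linked list of DD blocks can be followed with a simple loop. The
sketch below works on a complete file image held in memory and
counts the DD slots in all blocks, starting at the block that
follows the 4-byte file header. The function name and the error
convention (-1) are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t rd32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns the total number of DD slots in all
 * DD blocks of the image, or -1 if a block does not fit in the image
 * or the list appears to loop. */
long count_dd_slots(const unsigned char *img, size_t len)
{
    size_t pos = 4;       /* first DD block follows the file header */
    size_t hops = 0;
    long total = 0;

    for (;;) {
        uint16_t ndd;
        uint32_t next;

        if (pos > len || len - pos < 6)
            return -1;                    /* no room for a DDH      */
        ndd  = rd16(img + pos);           /* block size field       */
        next = rd32(img + pos + 2);       /* next block field       */
        if ((len - pos - 6) / 12 < ndd)
            return -1;                    /* DDs run past the image */
        total += ndd;
        if (next == 0)
            return total;                 /* last block in the list */
        if (++hops > len)
            return -1;                    /* guard against a cycle  */
        pos = next;
    }
}
```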

Figure 1.3 Model of a Data Descriptor Block


Data Element

A data element is the raw data part of a data object. Its basic
data type is determined by its tag, but other interpretive
information may be required before it can be processed properly.

Each data element is stored as a set of contiguous bytes starting
at the offset given in the corresponding DD (see Figure 1.4).(1)

Figure 1.4 Sample Data Descriptor Block


Physical Organization of HDF Files

Physically, the file header, DD blocks, and data elements are
organized as follows. The file header is followed by the first DD
block, which is followed by data elements and, if necessary, more
DD blocks. These relationships are summarized in Table 1.2.

There are no rules governing the distribution of DD blocks and data
elements within a file, except that the first DD block must follow
immediately after the file header. The pointers in the DD headers
connect the DD blocks in a linked list, and the offsets in the
individual DDs connect the DDs to the data elements. Beyond this
basic structure there is no assumed order among the objects in an
HDF file.

Table 1.2 Summary of the Relationships among Parts of an HDF File

Part           Constituents
HDF File       FH, DD-block, data, DD-block, data, DD-block,
                data ...
FH             0x0e031301 (32-bit magic number)
DD-block       DDH, DD, DD, DD ...
DDH            number-of-DDs (16 bits), offset-to-next-DD-block (32
                 bits)
DD             tag (16 bits), ref (16 bits), offset (32 bits),
                 length (32 bits)


(1) Some HDF software provides the capability of storing objects
as a series of linked blocks or external elements, but this occurs
at a higher level. At the lowest level each object with a tag/ref
is stored contiguously.

Sample HDF File

Consider an HDF file that contains two 400-by-600 8-bit raster
images. Typically, such a file might contain the objects described
in Table 1.3.

Table 1.3  Sample Data Objects in an HDF File

Tag  Ref  Data
FID   1   file identifier: user-assigned title for file
FD    1   file descriptor: user-assigned block of text
          describing overall file contents
IP8   1   Image palette (768 bytes)
ID8   1   x and y dimensions of the 2D arrays that contain
          the raster images (4 bytes)
RI8   1   first 2D array of raster image pixel data (x*y bytes)
RI8   2   second 2D array of pixel data (also x*y bytes)

Assuming, for example, that the size of a DD block is 10 DDs, the
physical organization of the contents of the file might be
described as shown in Figure 1.5.

Figure 1.5 Physical Representation of Data Objects

Offset Contents

     0    FH
     4    DDH       (10       0) 
    10    DD        (FID      1         130       4)
    22    DD        (FD       1         134       41)
    34    DD        (IP8      1         175       768)
    46    DD        (ID8      1         943       4)
    58    DD        (RI8      1         947       240000)
    70    DD        (RI8      2         240947    240000) 
    82    DD        (empty)
    94    DD        (empty)
   106    DD        (empty)
   118    DD        (empty)
   130    "sw3"
   134    "solar wind simulation: third try. 8/8/88"
   175    <data for the image palette>
   943    <data for the image dimensions>: 400, 600
   947    <data for the first raster image>
240947    <data for the second raster image>

In this instance, the file contains two raster images. The two
images have the same dimensions and are to be used with the same
palette. So, the same data objects for the palette (IP8) and
dimension record (ID8) can be used with both images.
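
The offsets in Figure 1.5 follow directly from the sizes given in
this chapter: a 4-byte file header, a 6-byte DD header, and 12 bytes
per DD. The sketch below reproduces that arithmetic for elements
stored back to back, as in the sample file; the function and macro
names are hypothetical.

```c
#include <stddef.h>

#define FH_SIZE  4UL   /* file header: 32-bit magic number         */
#define DDH_SIZE 6UL   /* DDH: 16-bit count + 32-bit next offset   */
#define DD_SIZE  12UL  /* DD: tag, ref, offset, length             */

/* Offset of the first byte after a DD block of ndd descriptors that
 * immediately follows the file header. */
unsigned long first_data_offset(unsigned long ndd)
{
    return FH_SIZE + DDH_SIZE + ndd * DD_SIZE;
}

/* Fills offsets[i] for n elements of the given lengths, laid out
 * contiguously in order beginning at start. */
void layout_elements(unsigned long start, const unsigned long *lengths,
                     size_t n, unsigned long *offsets)
{
    unsigned long pos = start;
    size_t i;

    for (i = 0; i < n; i++) {
        offsets[i] = pos;
        pos += lengths[i];
    }
}
```

With a block of 10 DDs, the first data element starts at
4 + 6 + 10 * 12 = 130. Element lengths of 4, 41, 768, 4, 240000, and
240000 bytes then give the offsets 130, 134, 175, 943, 947, and
240947 shown in the figure.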

Chapter 2      HDF Software Overview

               Chapter Overview
               Introduction
               Software Layers
               Organization of HDF Software 
                    Versions and Release Numbers
                    ANSI C and Portability
                    Modules and Interfaces
                    Header Files
                    The HDF Test Suite and Examples
               Some HDF Conventions
                    Naming and Assigning Tags
                    Using Reference Numbers to Organize Data Objects
                    Multiple References and File Compaction


Chapter Overview

This chapter contains a description of how HDF software is
organized. It also contains some guidelines on writing HDF
software.


HDF Software Layers

HDF-based software comes in four basic forms: an HDF interface
library, user programs that store and retrieve data in HDF files,
HDF command-line utilities, and HDF-based software tools.

The HDF interface library has two types of interfaces: (1) sets of
general purpose routines that form the basis of all higher-level
HDF development, and (2) application interfaces that support higher
level views of data.

User programs access HDF files via calls to the HDF library. User
programs are attached to the HDF library when they are compiled and
linked.

The HDF command-line utilities are a group of programs that are
distributed with the HDF library. The functionality of the
command-line utilities ranges from general purpose, such as listing
the contents of an HDF file, to special purpose, such as converting
data between different HDF data types (e.g., raster images to
scientific data sets). In general, the utilities perform data
management tasks.

In contrast, HDF-based software tools usually perform data analysis
tasks and have polished interactive user interfaces. They include
the NCSA Visualization Tool Suite and commercial software packages
that use HDF.

HDF software is implemented in layers, as illustrated in Figure
2.1. At the lowest level are the general purpose modules, which
perform basic I/O. At the next level are interfaces that reflect
commonly used objects such as 8-bit raster images (RIS8) and
multidimensional arrays (SDS). At the top layer are users'
programs, utilities, and software tools such as the NCSA
visualization software.

Figure 2.1 HDF software layers

The general purpose interfaces are described in detail in this
document. Descriptions of the applications interfaces and
command-line utilities can be found in the manual "HDF Calling
Interfaces and Utilities." Each HDF-based software tool should have
its own manual.

Since the NCSA user community writes programs primarily in C and
Fortran, all of the HDF application interfaces developed at NCSA
are callable from both C and Fortran programs. Since the general
purpose interface is primarily for program development, not for
applications, it provides C routines only.


Organization of Software

Versions and Release Numbers

Since HDF is under continual development, new releases are
periodically made available. An HDF version number looks like
"3.2r1", which means that it is major version 3, minor version 2,
release 1. The three parts of a version number have different
meanings:

* A new major version number implies that there is some fundamental
  difference between this code and code with earlier major version
  numbers. When a new major version is made available, HDF users
  and developers are strongly encouraged to obtain the new source
  code and documentation. There will likely be added functionality
  in successive major versions of the library and possibly some
  deletion of obsolete code, so some user code may have to be
  modified to use the new library.

* The meaning of a new minor version number is somewhat less well
  defined. It essentially means that there is some appreciable
  difference in the new code which was not deemed drastic enough
  to warrant a new major version, but is more substantial than a
  new release number would indicate.

* A new release number implies some bug fixes or other small
  modifications have been made to the code. Using a new release of
  the same version of the library will not usually require
  modification of existing user code.
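
A program that needs to compare versions can split a string of the
form "3.2r1" into its three parts. The sketch below uses the
standard C function sscanf; the struct and function names are
hypothetical, and the HDF library may expose version information in
a different way.

```c
#include <stdio.h>

/* Hypothetical holder for the three parts of a version number. */
struct hdf_version {
    int major_v;    /* major version, e.g. 3 */
    int minor_v;    /* minor version, e.g. 2 */
    int release_v;  /* release number, e.g. 1 */
};

/* Parses "<major>.<minor>r<release>". Returns 1 on success and 0 if
 * the string does not have that form or has trailing characters. */
int parse_hdf_version(const char *s, struct hdf_version *out)
{
    int consumed = 0;

    if (sscanf(s, "%d.%dr%d%n", &out->major_v, &out->minor_v,
               &out->release_v, &consumed) != 3)
        return 0;
    return s[consumed] == '\0';
}
```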


ANSI C and Portability

In order to provide for easy porting of HDF to new platforms, all
versions of the HDF source code from version 3.2 on will be written
in ANSI standard C, with special provisions made for non-ANSI
compilers. For more information about porting HDF and writing
portable HDF-based code, refer to the chapter "Making HDF
Portable."


Modules and Interfaces

The HDF distribution contains many source files or modules which
can be grouped into families according to their root name. For
example, dfp.c, dfpf.c and dfpff.f all share the root name "dfp"
and, therefore, all belong to the "dfp" family. In general, each
family of source modules represents one HDF applications interface.
Thus, the "dfp" family together represent the HDF Palette
Interface. There are a few exceptions to this rule which will be
discussed later in this section.

For each interface, there is necessarily one file that contains the
C code that provides the basic functionality of that interface. But
some interfaces may have one or two additional code modules that
provide Fortran callability for the interface. So there are three
possible family sizes:

1 file:
  Modules of this sort are generally not calling interfaces
  themselves, but rather provide useful support functions for
  actual calling interfaces. Since they are not meant to be called
  by any routine outside the HDF library itself, they do not need
  to be callable from Fortran programs. An example of such a module
  is hblocks.c.

2 files:
  Although there are currently no examples of this situation, it
  is conceivable (and desirable) that some future interface may
  need only one extra source module to provide Fortran
  compatibility. If this were to happen, there would only be two
  source modules for the interface. For instance, dfnew.c and
  dfnewf.c would make up the "New Interface."

3 files:
  Most current implementations of Fortran-callable HDF interfaces
  require the passing of character string arguments to some of
  their functions. Due to differences in the way C and Fortran
  represent strings, the passing of strings requires that there be
  a small amount of special purpose Fortran code written for each
  function that takes a string argument.

  For this reason, most Fortran-callable HDF interfaces consist of
  three source modules:

     (1) the primary C module,
     (2) a Fortran-callable C module, and
     (3) a Fortran module.

  For example, dfsd.c, dfsdf.c and dfsdff.f make up the Scientific
  Data Set Interface. dfsd.c contains the basic functionality of
  the interface, dfsdf.c provides the major part of Fortran
  callability, and dfsdff.f contains the special purpose Fortran
  code that allows the passing of character string arguments.
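
The naming pattern for a three-file family can be summarized in a
few lines of C. This sketch only builds the file names from a root
name, following the dfsd example above; the function name and the
fixed buffer size are assumptions made for illustration.

```c
#include <stdio.h>

#define FAMILY_NAME_MAX 32  /* assumed buffer size for this sketch */

/* Builds the three module names of a Fortran-callable family:
 * root.c (primary C module), rootf.c (Fortran-callable C module),
 * and rootff.f (Fortran module). */
void family_names(const char *root,
                  char c_mod[FAMILY_NAME_MAX],
                  char fc_mod[FAMILY_NAME_MAX],
                  char f_mod[FAMILY_NAME_MAX])
{
    snprintf(c_mod,  FAMILY_NAME_MAX, "%s.c",   root);
    snprintf(fc_mod, FAMILY_NAME_MAX, "%sf.c",  root);
    snprintf(f_mod,  FAMILY_NAME_MAX, "%sff.f", root);
}
```

For the root name "dfsd" this yields dfsd.c, dfsdf.c, and dfsdff.f.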


Header Files

In addition to the source code modules discussed above, some
interfaces also have C header files associated with them that are
meant to be included by C applications programmers with the
"#include" preprocessor directive. They contain some useful
constants and data structures for interaction with the interface
from C programs. The header files can be identified by the same
name as the root name for the rest of the family with the ".h"
extension added. For example, dfsd.h is the header file for the
Scientific Data Set Interface.

Of particular importance among the header files are hdf.h and
hdfi.h. hdf.h is the C header file that must be included by any
program that calls the HDF library. It contains all the symbolic
constants and public data structures that are needed to use HDF.
hdfi.h contains specific portability information about each
platform on which HDF is supported. It is automatically included
in programs when hdf.h is included, so programmers need not
explicitly include it. For more information on hdfi.h and other
portability issues, refer to the chapter "Making HDF Portable."

Table 2.1 shows all of the source code modules and header files
grouped into families for HDF 3.2.

Table 2.1 HDF 3.2 source code modules

general     general     grouping    utilities   Vsets       Old
headers     purpose     (non-Vset)                          general
                                                            purpose

hdf.h       hfile.c     dfgroup.c   dfutil.c    vg.c        dfstubs.c
hdfi.h      hfilef.c    dfgroup.h   dfutilf.c   vgf.c       dff.c
hproto.h    hfileff.f               dfutilff.f  vgff.f      dfff.f
dfivms.h    hkit.c                  dfutil.h    vfp.c       df.h
            hblocks.c                           vgi.h       dfi.h
            hextelt.c                           vio.c       dfstubs.h
            herr.c                              vconv.c
            herrf.c                             vparse.c
            hfile.h                             vrw.c
            herr.h                              vsfld.c
                                                vg.h
                                                vproto.h

8/24-bit    general     palettes    scientific  annotations special
raster      raster                  data sets               FORTRAN

dfr8.c      dfgr.c      dfp.c       dfsd.c      dfan.c      constants.f
dfr8f.c     dfgr.h      dfpf.c      dfsdf.c     dfanf.c     functions.f
dfr8ff.f    dfcomp.c    dfpff.f     dfsdff.f    dfanff.f
df24.c      dfimcomp.c              dfsd.h      dfan.h
df24f.c     dfrig.h
df24ff.f


The HDF Test Suite and Examples

In addition to the source code for the HDF library, versions 3.2
and higher will include a suite of test programs. There are
at least two test programs for most interfaces: one for the C
version and one for the Fortran-callable version. Some interfaces
have more than two test programs to test special features of that
interface and some have only one test program, since they only
provide C-callability.

Every effort will be made to ensure that the test programs provide
a thorough and accurate assessment of the health of the HDF
library. Although it is hoped that the test suite will greatly
improve the reliability of HDF code, it is almost inevitable that
some parts of the code will be untested. Therefore, no guarantees
can be made on the basis of test suite performance.

There is also a set of example programs to help users write HDF
programs. They illustrate some of the common ways in which users
program with HDF.


Some HDF Conventions

The specification of HDF described in the previous chapter is not
sufficient to guarantee its success. It is also important for users
to adhere to certain conventions in using HDF. Guidelines in the
use of HDF are implicit in many discussions in other sections of
this document, and others are presented in the manual "HDF Calling
Interfaces and Utilities." Guidelines not covered elsewhere are
introduced in this section.


Naming and Assigning Tags

Tags that are to be made available to a general population of HDF
users should be assigned and controlled by NCSA. Tags of this type
are given numbers in the range 1-32,767. If you have an application
that fits this criterion, contact NCSA at the address listed on the
README page at the beginning of this manual and specify the tags
you would like. For each tag, your specifications should include
a suggested name, information about the type and structure of the
data that the tag will refer to, and information about how the tag
will be used. Your specifications should be similar to those
contained in Appendix A. NCSA will assign you a set of tags for
your application and include your tag descriptions in its
documentation.

Tags in the range 32,768-64,999 are user-definable. That is, you
can assign them for any private application. Of course, if you use
tags in this range you need to be aware that they may conflict with
other people's private tags.
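
The two tag ranges described above can be checked mechanically.
The following is a minimal sketch and is not part of the HDF
library; the names tag_class and classify_tag are hypothetical.
Tags that fall outside both ranges are simply reported as "other",
since their use is not covered in this section.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not part of the HDF library. */
typedef enum {
    TAG_OTHER,          /* outside both ranges described in this section */
    TAG_NCSA_ASSIGNED,  /* 1-32,767: assigned and controlled by NCSA     */
    TAG_USER_DEFINABLE  /* 32,768-64,999: free for private applications  */
} tag_class;

/* Classify a 16-bit tag according to the ranges given in the text. */
tag_class classify_tag(uint16_t tag)
{
    if (tag >= 1 && tag <= 32767)
        return TAG_NCSA_ASSIGNED;
    if (tag >= 32768 && tag <= 64999)
        return TAG_USER_DEFINABLE;
    return TAG_OTHER;
}
```

An application that defines private tags could call classify_tag
on each of them at start-up to confirm that none strays into the
range controlled by NCSA.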


Using Reference Numbers to Organize Data Objects

The HDF library itself uses reference numbers solely for the
purpose of distinguishing between different objects with the same
tag. While application programmers may find it convenient to impart
some meaning to reference numbers, they should be forewarned that
the HDF library will be ignorant of any such meaning. In other
words, any meaning attached to reference numbers exists only at the
application program or software tool level.

Some users have used reference numbers to indicate how objects
should be grouped by considering all objects with the same
reference number to be part of the same group. This practice is not
recommended. Instead, if object grouping is desired it is
recommended that you use either the simple grouping procedures used
by the SDS, RIS8, and RIS24 applications (supported by the routines
in dfgroup.c), or the more general (and more complex) Vset
structures.

Another possible use of reference numbers is for keyed access to
HDF objects. An HDF data identifier (tag/ref) provides a unique
identifier for any HDF object within a file, and hence could be
used as a primary key for that object. One could keep a table of
data identifiers as a way of providing random access to HDF
objects.
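
A table of data identifiers of the kind just described might look
like the following sketch. It is an application-level illustration
only: the types data_id and table_entry and the function table_find
are hypothetical, and the HDF library itself knows nothing about
such a table.

```c
#include <stddef.h>
#include <stdint.h>

/* A data identifier: the tag/ref pair that is unique within one file. */
typedef struct {
    uint16_t tag;
    uint16_t ref;
} data_id;

/* Hypothetical application-level table entry keyed by data identifier. */
typedef struct {
    data_id     key;
    const char *label;   /* whatever the application associates with it */
} table_entry;

/* Linear search; returns the matching entry, or NULL if the key is absent. */
const table_entry *table_find(const table_entry *table, size_t n,
                              uint16_t tag, uint16_t ref)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].key.tag == tag && table[i].key.ref == ref)
            return &table[i];
    }
    return NULL;
}
```

A linear search is enough for a sketch; an application holding many
identifiers would more likely sort the table or hash the tag/ref
pair.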

Reference numbers might also be used to impose an ordering on HDF
objects. Once again, because the assignment scheme for reference
numbers in HDF files does not guarantee any order, caution is
advised in this use of reference numbers.


Multiple References

Multiple references to a single data element are quite common in
HDF. The general purpose routine Hdupdd generates a new reference
to data that is already pointed to by another DD. If Hdupdd is used
several times, there could be several DDs that point to the same
data element.

It is important to note that when a multiply-referenced data
element is deleted or moved, the various DDs that previously
pointed to the data element are not automatically deleted or
adjusted to point to the data element in its new location.
Consequently, each DD to be deleted or moved should be checked for
multiple references and handled as the programmer sees fit.
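
One way to check for multiple references is to count the DDs that
point at the same data. The sketch below works on a simplified,
in-memory model of a DD list; the structure dd_model and the
function count_references are hypothetical and do not reflect the
library's internal layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a data descriptor (DD); not the library's layout. */
typedef struct {
    uint16_t tag;
    uint16_t ref;
    int32_t  offset;   /* where the data element starts in the file */
    int32_t  length;   /* size of the data element in bytes         */
} dd_model;

/* Count the DDs in the list that point to the same data element as
 * list[which], including list[which] itself.  A result greater than 1
 * means the element is multiply referenced. */
size_t count_references(const dd_model *list, size_t n, size_t which)
{
    size_t i, count = 0;

    assert(list != NULL && which < n);
    for (i = 0; i < n; i++) {
        if (list[i].offset == list[which].offset &&
            list[i].length == list[which].length)
            count++;
    }
    return count;
}
```

If the count is greater than 1, the programmer can decide whether
to update or remove the other DDs before the data element is moved
or deleted.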


Chapter 3      The NCSA HDF General Purpose Interface

               Chapter Overview
               Introduction
               Overview of the Interface
               Function Specifications
                    Opening and Closing Files
                    Finding Tags, Refs, and Element Lengths
                    Reading and Writing Entire Data Elements
                    Reading and Writing Part of a Data Element
                    Manipulating Data Descriptors (DDs)
                    Creating Special Data Elements
                    Development Routines
                    Error Reporting


Chapter Overview

This chapter contains a detailed description of the routines that
make up the general purpose HDF interface.

Introduction

NCSA supports interfaces for HDF users--both high level interfaces
to support certain application areas, such as image processing, and
low level general purpose interfaces for performing basic
operations on HDF files. These interfaces are written in C only, but
most of their functions can also be called from Fortran.

The routines in the general purpose interface enable you to build
and manipulate HDF objects of any type, including those of your own
invention. All HDF applications developed at NCSA use these
routines as their basic building blocks.

The routines described in this chapter represent a second set of
general purpose routines. All HDF applications prior to HDF 3.2
(released in June 1992) used an earlier set of general purpose
routines. These low level general purpose routines have been
changed to allow for better functionality. Old routines will still
be emulated but at a cost of reduced functionality. Users are
strongly advised to use the new interface.

The new lower layer, first used with HDF Version 3.2, incorporates
the following improvements over its predecessor:

* More consistent data and function types.

* An error handling module that supports more meaningful and
  extensive reporting of errors.

* Simplification of key lower level functions.

* Simplified techniques for facilitating portability.

* Support for alternate forms of physical storage, such as linked
  blocks storage, and storage of the data portion of an object in
  an external file.

* A version tag indicating which version of the HDF library last
  changed an HDF file.

* Support for simultaneous access to multiple files.

* Support for simultaneous access to multiple objects within a
  single file.

The previous lower layer is called the "DF layer", because all
routines began with the letters "DF", as in "DFopen" and "DFclose."
The new layer is called the "H layer" because all routines begin
with the letter "H" (Hopen, Hclose, Hwrite, etc.). The source
modules that implement these changes can be found in files that
begin with the letter "h".

Also, the number of basic source modules has changed, and now
includes:

hfile.c          basic I/O
herr.c           error-handling
hkit.c           general purpose routines
hblocks.c        to support linked block physical storage
hextelt.c        to support external storage of HDF data

Overview of the Interface

Following is a listing of the public functions that can be found
in the general purpose interface. This section provides
specifications and descriptions of these routines.

Opening and Closing HDF Files

These calls are used to open and close HDF files.

Hopen            Provides an access path to an HDF file. It also
                 reads into memory all of the DD blocks in the
                 file.

Hclose           Closes the access path to a file.

Locating Elements for Access and Getting Information

These routines make it possible to locate elements or find out
other information. Except for Hendaccess, they initialize the
element that they locate and return an access id that is used in
later references to the data element. Calls to them can include
wild cards so that one can search for unknown tags and refs.

Hstartread       Locates an existing data element with matching
                 tag/ref and returns an access id for reading it.

Hnextread        Continues the search with the same access id.

Hstartwrite      Allows writing to the object with the supplied
                 tag/ref. If the object exists, the object will be
                 modified, otherwise it is created.

Hendaccess       Disposes of access id for tag/ref.

Hinquire         Returns access information about a data element.

Hishdf           Determines whether a file is an HDF file.

Hnumber          Returns the number of occurrences of a specified
                 data identifier (tag/ref) in a file.

Hgetlibversion   Returns version information for the current HDF
                 library.

Hgetfileversion  Returns version information for an HDF file.


Reading and Writing Entire Data Elements

There are two sets of routines for reading and writing data
elements. The set of routines described here is used to store and
retrieve entire data elements. A second set of routines, described
in the next section, may be used if you wish to access only part
of a data element at a time.

Hputelement      Adds or replaces elements in a file.

Hgetelement      Obtains the data referred to by the tag/ref
                 combination that is passed to it.


Reading and Writing Part of a Data Element

The second set of routines for reading and writing data elements
makes it possible to read or write all or part of a data element,
in contrast to the routines described above which can only read or
write an entire element. One of the access routines Hstartread or
Hstartwrite must be called before calling these routines.

Hwrite           Appends data to a data element. It starts at the
                 last position left by a Hwrite or Hseek command,
                 writes up to a specified number of bytes, then
                 leaves the access pointer at the end of the data
                 written.

Hread            Reads a portion of a data element. It starts at
                 the last position left by a Hread or Hseek command
                 and reads any data that remains in the element up
                 to a specified number of bytes.

Hseek            Sets the access pointer to an offset within a data
                 element. The next time Hread or Hwrite is called,
                 the access occurs from the new position. The
                 location to seek to can be specified as an offset
                 from the current location or from the start of the
                 element.


Manipulating Data Descriptors (DDs)

These routines perform operations on DDs without doing anything
with the data to which the DDs refer.

Hdupdd           Is used to generate new references to data that
                 is already referenced from somewhere else.

Hdeldd           Deletes a tag/ref from the list of DDs.

Hnewref          Returns the next available reference number for
                 the HDF file.


Creating Special Data Elements

HDF 3.2 introduces two alternate methods of physical storage for
HDF objects. Previously, all of the objects in an HDF "file" had
to be in the same file and any given object had to be contiguous.
This last requirement caused many problems, especially with regard
to appending to existing objects. Objects needed to be deleted and
rewritten to the end of the file in order to append to them.

The two new storage methods are "linked blocks" and "external
elements". Linked blocks allow elements in a single HDF file to be
non-contiguous. External elements allow a single HDF object to be
stored in an external file. It is not currently possible to have
a single object (such as a very large data set) stored in multiple
files. Nor is it possible to have multiple objects stored in an
"external" file.

Special data elements can be accessed with the same routines as for
normal data elements once they are created. These routines create
special data elements.

HLcreate         Creates a new linked block special data element.

HXcreate         Creates a new external file special data element.

Both of these routines have two modes of operation. For example,
calling HLcreate with a tag and ref which do not exist in a file
will create a new element with the given tag and ref that will be
stored as linked blocks. On the other hand, if the tag/ref pair
already existed in the file, the referenced object is "promoted"
to being stored as linked blocks. All data which had been stored
in the object before the promotion is retained. HXcreate behaves
similarly.

Development Routines

The HDF library provides a number of "developer" level routines
that are meant to simplify the task of writing HDF applications.
Most of these routines mirror basic C library functions which are,
unfortunately, not always completely portable in their library
form.

HDgettagname     Return a pointer to a text string describing a
                 given tag.

HDgetspace       Allocate space.

HDfreespace      Free space.

HDstrncpy        Copy a string from one location to another up to
                 a given number of characters.


Error Reporting

The HDF library now provides a much more robust error reporting
scheme. Previously, only a single error value could be returned to
the user. There is now the notion of an error stack. This allows
for more of the context to be known when trying to decipher a
problem.

HEprint          Print out all of the errors on the error stack to
                 a specified file.

HEclear          Clear the error stack.

HERROR           Macro to report an error. This will push the error
                 type, file name, line number and name of the
                 function reporting the error.

HEreport         Add a text string to the description of the most
                 recently reported error. Only a single text string
                 may be supplied per error.

The only problem with the error module is that standard C does not
have any way for the code inside a function to know the name of the
function. Therefore, in order to use the macro HERROR to report
errors, there must exist a variable FUNC which points to a string
containing the name of the reporting function.
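
The FUNC convention can be illustrated with a small, self-contained
model of an error stack. This is a sketch only: err_push,
err_print, MY_ERROR, and read_block are hypothetical names, and the
real HERROR macro and error stack may be organized differently.

```c
#include <stdio.h>

#define ERR_STACK_MAX 10

/* One record on the sketch's error stack. */
typedef struct {
    int         code;
    const char *function;
    const char *file;
    int         line;
} err_record;

static err_record err_stack[ERR_STACK_MAX];
static int        err_top = 0;

/* Push one record; further errors are dropped once the stack is full. */
static void err_push(int code, const char *function,
                     const char *file, int line)
{
    if (err_top < ERR_STACK_MAX) {
        err_stack[err_top].code     = code;
        err_stack[err_top].function = function;
        err_stack[err_top].file     = file;
        err_stack[err_top].line     = line;
        err_top++;
    }
}

/* Relies on a variable named FUNC being in scope where it is used. */
#define MY_ERROR(code) err_push((code), FUNC, __FILE__, __LINE__)

/* Print every record, most recent first, to the given stream. */
void err_print(FILE *stream)
{
    int i;

    for (i = err_top - 1; i >= 0; i--)
        fprintf(stream, "error %d in %s (%s:%d)\n", err_stack[i].code,
                err_stack[i].function, err_stack[i].file,
                err_stack[i].line);
}

/* Example routine: declares FUNC, then reports through the macro. */
int read_block(int nbytes)
{
    const char *FUNC = "read_block";

    if (nbytes < 0) {
        MY_ERROR(1);   /* 1 is an arbitrary error code for illustration */
        return -1;
    }
    return 0;
}
```

A routine that uses the macro without declaring FUNC will not
compile, which makes the omission easy to catch.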

Other

Hsync            Synchronize stored version of HDF file with image
                 in memory.


Function Specifications

Opening and Closing Files

Hopen

int32 Hopen(char *path, int access, int16 ndds)

     path   IN:  Name of file to be opened
     access IN:  One of DFACC_READ, DFACC_RDWR, DFACC_CREATE, or
                 the other access codes listed below
     ndds   IN:  Number of dds in a block if this file needs to be
                 created

     Purpose: Provides an access path to an HDF file. It also reads
     into primary memory all of the DD blocks in the file.

     Returns: On success returns file id, on failure returns FAIL.

     Description: Opens an HDF file.

     Interpretations of access:
           HDF provides several constants for use as access
           privilege codes. Below is a list of these codes and
           their meanings. It is important to note that these
           constants are NOT bitflags and should NOT be or'd
           together to combine access modes. Doing so may cause odd
           behavior and, in some cases, loss of data.

           Recommended:
             DFACC_READ:    Open for read only. If file does not
                            exist, error.
             DFACC_RDWR:    Open for read/write. If file does not
                            exist, create it.
             DFACC_CREATE:  Force creation. If file exists, delete
                            it, then open a new file for
                            read/write. (in the spirit of UNIX
                            "clobber")

           Others:
             DFACC_ALL:     Same as DFACC_RDWR.
             DFACC_WRITE:   Same as DFACC_RDWR.

On successful exit,
* File_rec members are filled in.
* File is opened with the relevant permission.
* Information about dd's are set up in memory.

For a new file, in addition,
* The file headers and initial information are set up.
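
The effect of the three recommended access codes can be summarized
as a decision on the access code and on whether the file already
exists. The sketch below models only that decision; the
enumerations access_mode and open_action and the function plan_open
are hypothetical stand-ins and do not use the library's DFACC_
constants or open any file.

```c
/* Hypothetical stand-ins for the three recommended access codes. */
typedef enum { MODE_READ, MODE_RDWR, MODE_CREATE } access_mode;

/* What an open call would do, according to the list above. */
typedef enum {
    ACT_ERROR,           /* the open fails                        */
    ACT_OPEN_READONLY,   /* existing file opened for reading      */
    ACT_OPEN_READWRITE,  /* existing file opened for read/write   */
    ACT_CREATE_NEW,      /* no file existed, so a new one is made */
    ACT_REPLACE          /* existing file deleted, new one made   */
} open_action;

/* Decide the action from the access mode and whether the file exists. */
open_action plan_open(access_mode mode, int file_exists)
{
    switch (mode) {
    case MODE_READ:
        return file_exists ? ACT_OPEN_READONLY : ACT_ERROR;
    case MODE_RDWR:
        return file_exists ? ACT_OPEN_READWRITE : ACT_CREATE_NEW;
    case MODE_CREATE:
        return file_exists ? ACT_REPLACE : ACT_CREATE_NEW;
    }
    return ACT_ERROR;   /* unknown mode */
}
```

Each mode is handled as a single value rather than as a combination
of flags, in keeping with the warning that the access codes should
not be or'd together.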

Hclose

intn Hclose(int32 id)

     id            IN: the file id of the file to be closed

     Purpose:      Closes the access path to the file.

     Returns:      SUCCEED (0) if successful and FAIL (-1) if
                   failed.

     Description:  Id is first validated. If valid, the function
                   closes the access path to the file.

                   If there are still access elements attached to
                   the file, the error DFE_OPENAID is returned and
                   the file is not closed.

                   This is a fairly common error when developing
                   new interfaces. See the discussion of Hendaccess
                   below for hints on how to debug this problem.

Locating Elements for Access and Getting Information

Hstartread

int32 Hstartread(int fileid, int tag, int ref)

     fileid        IN: id of file to attach access element to
     tag           IN: tag to search for
     ref           IN: ref to search for

     Purpose:      Locate an existing data element with matching
                   tag/ref and return a descriptor for reading it.

     Returns:      Id of access element if successful, otherwise
                   FAIL (-1).

     Description:  Searches the DD's for a particular tag/ref
                   combination. Wildcards can be used for tag or
                   ref (DFTAG_WILDCARD, DFREF_WILDCARD) and they
                   match any values. Searching on wildcards begins
                   from the beginning of the DD list. If the search
                   is successful, the access element is positioned
                   to the start of that tag/ref, otherwise it is
                   an error. An access element is created and
                   attached to the file.

Hnextread

intn Hnextread(int32 access_id, int16 tag, int16 ref, int origin)

     access_id     IN: Id of a READ access elt
     tag           IN: the tag to search for
     ref           IN: ref to search for
     origin        IN: from where to start searching

     Purpose:      Locate and position a read access id on next
                   occurrence of tag/ref.

     Returns:      SUCCEED (0) if successful and FAIL (-1)
                   otherwise.

     Description:  Searches for the "next" DD that fits the
                   tag/ref. Wildcards apply. If origin is DF_START,
                   search from start of DD list, if origin is
                   DF_CURRENT, search from current position.
                   Searching from the end of the file via DF_END
                   is not yet implemented.

                   If the search is successful, then the access
                   element is positioned at the start of that
                   tag/ref, otherwise, the access_id is not
                   modified.
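
The search rules for Hstartread and Hnextread can be modeled on a
plain array of tag/ref pairs. The sketch below is not library code.
The wildcard value ANY, the origin names FROM_START and
FROM_CURRENT, and the function find_dd are hypothetical, and the
sketch assumes that a search from the current position begins with
the DD just after the current one.

```c
#include <stddef.h>
#include <stdint.h>

#define ANY 0   /* assumed wildcard value for this sketch */

typedef struct {
    uint16_t tag;
    uint16_t ref;
} dd_key;

typedef enum { FROM_START, FROM_CURRENT } search_origin;

/* Return the index of the next DD that matches tag/ref, or -1 if none
 * does.  With FROM_START the search begins at index 0; with
 * FROM_CURRENT it begins just after index `current`. */
long find_dd(const dd_key *list, size_t n, uint16_t tag, uint16_t ref,
             search_origin origin, long current)
{
    size_t i = (origin == FROM_START) ? 0 : (size_t)(current + 1);

    for (; i < n; i++) {
        int tag_ok = (tag == ANY || list[i].tag == tag);
        int ref_ok = (ref == ANY || list[i].ref == ref);

        if (tag_ok && ref_ok)
            return (long)i;
    }
    return -1;
}
```

Calling find_dd once with FROM_START and then repeatedly with
FROM_CURRENT, passing back the index last returned, visits every
matching DD in order, which is the pattern Hstartread followed by
Hnextread is intended to support.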

Hstartwrite

int32 Hstartwrite(int fileid, int tag, int ref, long len)

     fileid        IN: Id of file to write to
     tag           IN: tag to write to
     ref           IN: ref to write to
     len           IN: the length of the data element

     Purpose:      Creates or replaces a data element with matching
                   tag/ref.

     Returns:      Id of access element if successful and FAIL
                   otherwise.

     Description:  Sets up an access element to write out a data
                   element. The DD list of the file is searched
                   first. If the tag/ref is found, the data element
                   is NOT replaced; rather, it is then possible to
                   modify the existing data. If an object with the
                   corresponding tag and ref does not exist, a new
                   one is created.

Hendaccess

int32 Hendaccess(int access_id)

     access_id     IN: id of access element to dispose of

     Purpose:      Disposes of descriptor for tag/ref.

     Returns:      returns SUCCEED (0) if successful, FAIL (-1)
                   otherwise.

     Description:  Used to dispose of an access element. There is
                   only a finite number of access elements allowed
                   to be active at a time. Therefore, it is very
                   important to call Hendaccess whenever you are
                   done using an element.

                   When developing new interfaces, we have found
                   that a fairly common mistake is to not call
                   Hendaccess for all of the elements accessed.
                   When this happens, Hclose will return FAIL, and
                   the dump of the error stack (see HEprint, below)
                   will tell how many access elements are still
                   active.

                   This is a rather difficult problem to debug, as
                   the low level routines of the HDF library have
                   no way of knowing who opened an access element
                   and forgot to release it, or where. It's
                   tedious, but the most effective means we have
                   found to debug this problem is to annotate the
                   locations where the `attached' count of a file
                   record is changed (there are a couple of places
                   in hfile.c and a few in hblocks.c and
                   hextelt.c).
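
The relationship between the attached count and Hclose can be
pictured with a very small model. The following sketch is not the
library's implementation; file_model and the model_ functions are
hypothetical, and the real file record holds much more than a
counter.

```c
/* Simplified file record: an open flag and the attached count only. */
typedef struct {
    int attached;   /* number of access elements still active */
    int is_open;    /* nonzero while the file is open         */
} file_model;

/* Attach one access element to an open file. */
int model_start_access(file_model *f)
{
    if (!f->is_open)
        return -1;
    f->attached++;
    return 0;
}

/* Release one access element. */
int model_end_access(file_model *f)
{
    if (f->attached <= 0)
        return -1;
    f->attached--;
    return 0;
}

/* Close the file; refuse while access elements are still attached. */
int model_close(file_model *f)
{
    if (!f->is_open || f->attached > 0)
        return -1;
    f->is_open = 0;
    return 0;
}
```

Printing the counter wherever it changes in such a model is the
same technique as annotating the places where the count of a file
record is changed.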

Hinquire

intn Hinquire(int access_id, int32 *pfile_id, uint16 *ptag, uint16
       *pref, int32 *plength, int32 *poffset, int32 *pposn, int
       *paccess, int *pspecial)

     access_id     IN: Id of an access elt
     pfile_id     OUT: file id
     ptag         OUT: tag of the element pointed to
     pref         OUT: ref of the element pointed to
     plength      OUT: length of the element pointed to
     poffset      OUT: offset of elt in the file
     pposn        OUT: position pointed to within the data elt
     paccess      OUT: the access type of this access elt
     pspecial     OUT: special code

     Purpose:      Returns access information of a data element.

     Returns:      Returns SUCCEED (0) if the access elt points to
                   some data element,  otherwise FAIL (-1).

     Description:  Inquire statistics of the data element pointed
                   to by access element. If a piece of information
                   is not needed, it is possible to send NULL in
                   for that value. There is a set of convenience
                   macros for calls to Hinquire (HQueryposition,
                   HQuerylength, etc.) defined in hdf.h.

Hishdf

int32 Hishdf(char *path)

     path          IN: name of file

     Purpose:      Determine if a file is an HDF file.

     Returns:      Returns TRUE (non-zero) if file is HDF, FALSE
                   (0) otherwise.

     Description:  The decision of whether a file is an HDF file or
                   not is based solely on the magic number stored
                   in the first four bytes of an HDF file. It is
                   possible that Hishdf will identify a file as an
                   HDF file but Hopen will be unable to open the
                   file (for example if the DD list in the file is
                   corrupted).
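
A magic number test of this kind can be sketched as follows. The
sketch works on a buffer that has already been read from the start
of a file, and it assumes the four-byte HDF magic number 0x0e031301
(the control characters ^N ^C ^S ^A); that value should be checked
against the file format description before it is relied on. The
names HDF_MAGIC and looks_like_hdf are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed HDF magic number: bytes 0x0e 0x03 0x13 0x01 (^N ^C ^S ^A). */
static const unsigned char HDF_MAGIC[4] = { 0x0e, 0x03, 0x13, 0x01 };

/* Return 1 if the first four bytes of buf match the magic number,
 * 0 otherwise (including when fewer than four bytes are available). */
int looks_like_hdf(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < sizeof HDF_MAGIC)
        return 0;
    return memcmp(buf, HDF_MAGIC, sizeof HDF_MAGIC) == 0;
}
```

Because only the first four bytes are examined, a positive result
says nothing about the rest of the file, which is why a later
attempt to open the file can still fail.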

Hnumber

int Hnumber(int32 file_id, uint16 tag)

     file_id       IN: file id
     tag           IN: tag to be counted

     Purpose:      Find the number of occurrences of tag/ref in
                   file.

     Returns:      The number of instances of a tag in a file.

Hgetlibversion

Hgetlibversion--return version info for current HDF library

USAGE

Hgetlibversion(uint32 *majorv, uint32 *minorv, uint32 *release,
char string[])

     majorv       OUT: major version number
     minorv       OUT: minor version number
     release      OUT: release number
     string       OUT: informational text string (80 chars)

     Purpose:     Get version information for current HDF library.

     Returns:     Returns SUCCEED (0).

     Description: Returns the version of the HDF library. The
                  version information is statically compiled
                  into the HDF library, so it is not necessary to
                  have any open files for this function to execute.

Hgetfileversion

Hgetfileversion--return version info for HDF file

USAGE

Hgetfileversion(uint32 file_id, uint32 *majorv, uint32 *minorv,
uint32 *release, char string[])

     file_id       IN: handle of file
     majorv       OUT: major version number
     minorv       OUT: minor version number
     release      OUT: release number
     string       OUT: Informational text string (80 chars)

     Purpose:     Get version information for an HDF file.

     Returns:     Returns SUCCEED (0) if successful and FAIL (-1)
                  if failed.

     Description: Returns the HDF version number stored in the
                  given file. It is still an open question as to
                  what exactly the version number of a file should
                  mean, so we recommend that user code not call
                  this function.

Reading and Writing Entire Data Elements

Hputelement

int Hputelement(int fileid, int tag, int ref, char *data, long
length)

     fileid        IN: Id of file
     tag           IN: tag of data element to put
     ref           IN: ref of data element to put
     data          IN: pointer to buffer
     length        IN: length of data

     Purpose:      Add or replace element in a file.

     Returns:      Returns SUCCEED (0) if successful and FAIL (-1)
                   otherwise.

     Description:  Writes a data element or replaces an existing
                   data element in an HDF file. Uses Hwrite and its
                   associated routines.

Hgetelement

int Hgetelement(int file_id, int tag, int ref, char *data)

     file_id       IN: Id of the file to read from
     tag           IN: tag of data element to read
     ref           IN: ref of data element to read
     data         OUT: buffer to read into

     Purpose:      Obtains the data referred to by the tag/ref
                   combination that is passed to it.

     Returns:      Returns SUCCEED (0) if successful, FAIL (-1)
                   otherwise.

     Description:  Reads in a data element from an HDF file and
                   puts it into the buffer pointed to by data. The
                   space allocated for the buffer is assumed to be
                   large enough.

Reading and Writing Part of a Data Element

Hread

int32 Hread(int access_id, long length, char *data)

     access_id     IN: Id of READ access element
     length        IN: length of segment to read in
     data         OUT: pointer to data array to read to

     Purpose:      Read a portion of a data element.

     Returns:      Returns length of segment actually read in if
                   successful and FAIL otherwise.

     Description:  Read in the next segment in the data element
                   pointed to by the access element. It starts at
                   the last position left by a Hread or Hseek
                   command and reads any data that remains in the
                   element up to a specified number of bytes. If
                   the data element is too short then it only reads
                   to end of the data element.

Hwrite

int32 Hwrite(int access_id, long len, char *data)

     access_id     IN: Id of WRITE access element
     len           IN: length of segment to write
     data          IN: pointer to data to write

     Purpose:      Write next data segment to data element.

     Returns:      Returns length of segment successfully written,
                   FAIL (-1) otherwise.

     Description:  Write the data to data element where the last
                   write or Hseek() stopped. It starts at the last
                   position left by a Hwrite command, writes up to
                   a specified number of bytes, then leaves the
                   write pointer at the end of the element. If the
                   space reserved is less than the length to write,
                   then only as much as can fit is written. It is
                   the responsibility of the user to ensure that
                   no two access elements are writing to the same
                   data element. It is possible, though, to
                   interlace writes to more than one data element
                   in the same file.

Hseek

intn Hseek(int32 access_id, long offset, int origin)

     access_id     IN: Id of access element
     offset        IN: offset to seek to
     origin        IN: position to seek from by offset, 0: from
                       beginning; 1: current position; 2: end of
                       data element

     Purpose:      Set the access pointer to an offset within a
                   data element. The next time Hread or Hwrite is
                   called, the read or write occurs from the new
                   position.

     Returns:      Returns FAIL (-1) if fail, SUCCEED (0)
                   otherwise.

     Description:  Sets the position of an access element in a data
                   element so that the next Hread or Hwrite will
                   start from that position. origin determines the
                   position to which the offset is added. This
                   routine fails if the access element is not
                   associated with any data element or if the
                   position sought is outside of the data element.

                   Seeking from the end of a data element is not
                   currently supported.
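
The read and seek rules described above can be tried out on an
in-memory model of a data element. The sketch below is not library
code: elt_model, the ORIGIN_ constants, model_seek, and model_read
are hypothetical. It assumes that seeking to the position just past
the last byte is allowed, and it rejects seeks from the end because
they are not currently supported.

```c
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for a data element and its access position. */
typedef struct {
    const unsigned char *data;
    long                 length;  /* size of the element in bytes */
    long                 posn;    /* current access position      */
} elt_model;

enum { ORIGIN_START = 0, ORIGIN_CURRENT = 1, ORIGIN_END = 2 };

/* Move the access position; returns 0 on success and -1 on failure. */
int model_seek(elt_model *e, long offset, int origin)
{
    long target;

    if (e == NULL || e->data == NULL)
        return -1;
    if (origin == ORIGIN_START)
        target = offset;
    else if (origin == ORIGIN_CURRENT)
        target = e->posn + offset;
    else
        return -1;               /* includes ORIGIN_END: unsupported */
    if (target < 0 || target > e->length)
        return -1;               /* outside the data element */
    e->posn = target;
    return 0;
}

/* Read up to `length` bytes from the current position.  Returns the
 * number of bytes actually read, or -1 on a bad argument. */
long model_read(elt_model *e, long length, unsigned char *out)
{
    long remaining;

    if (e == NULL || e->data == NULL || out == NULL || length < 0)
        return -1;
    remaining = e->length - e->posn;
    if (length > remaining)
        length = remaining;      /* short element: read to its end */
    memcpy(out, e->data + e->posn, (size_t)length);
    e->posn += length;
    return length;
}
```

A caller should compare the value returned by the read with the
length requested, since a short count is the only sign that the end
of the element was reached.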

Manipulating Data Descriptors

Hdupdd

int Hdupdd(int32 file_id, uint16 tag, uint16 ref, uint16 old_tag,
uint16 old_ref)

     file_id       IN: Id of file
     tag           IN: tag of new data descriptor
     ref           IN: ref of new data descriptor
     old_tag       IN: tag of data descriptor to duplicate
     old_ref       IN: ref of data descriptor to duplicate

     Purpose:      Generate new references to data that is already
                   referenced from somewhere else.

     Returns:      Returns SUCCEED (0) if successful, FAIL (-1)
                   otherwise.

     Description:  Duplicates a data descriptor so that the new
                   tag/ref points to the same data element pointed
                   to by the old tag/ref.

Hdeldd

int Hdeldd(int file_id, int tag, int ref)

     file_id       IN: Id of file
     tag           IN: tag of data descriptor to delete
     ref           IN: ref of data descriptor to delete

     Purpose:      Delete a tag/ref from the list of DDs.

     Returns:      Returns SUCCEED (0) if successful, FAIL (-1)
                   otherwise.

     Description:  Deletes a data descriptor of tag/ref from the
                   dd list of the file. This routine is unsafe and
                   may leave a file in a condition that is not
                   usable by some routines. Use with care.

Hnewref

uint16 Hnewref(int32 file_id)

     file_id       IN: id of file

     Purpose:      Return the next available ref for HDF file.

     Returns:      Returns the ref number if successful, 0 otherwise.

     Description:  Returns a ref number that can be used with any
                   tag to produce a unique tag/ref. Successive
                   calls to Hnewref will generate a strictly
                   increasing sequence until the highest possible
                   ref has been returned, then Hnewref will return
                   unused ref's starting from 1.
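
The allocation order described above can be modeled as follows.
This is a sketch of the stated behavior only, not the library's
code; ref_model, model_newref, and model_freeref are hypothetical
names, and the model assumes that 65,535 is the highest possible
ref.

```c
#include <stdint.h>

#define MAX_REF 65535u   /* assumed highest possible ref */

typedef struct {
    unsigned char used[MAX_REF + 1];  /* used[r] nonzero if ref r is taken */
    uint16_t      last;               /* highest ref handed out so far     */
} ref_model;

/* Return the next available ref, or 0 if every ref is in use. */
uint16_t model_newref(ref_model *m)
{
    uint32_t r;

    if (m->last < MAX_REF) {          /* strictly increasing phase */
        m->last++;
        m->used[m->last] = 1;
        return m->last;
    }
    for (r = 1; r <= MAX_REF; r++) {  /* wrapped: unused refs from 1 */
        if (!m->used[r]) {
            m->used[r] = 1;
            return (uint16_t)r;
        }
    }
    return 0;
}

/* Mark a ref as unused again, for example after its object is deleted. */
void model_freeref(ref_model *m, uint16_t ref)
{
    m->used[ref] = 0;
}
```

The model makes it plain why the order of reference numbers cannot
be relied on: once the sequence wraps, a newer object may receive a
smaller ref than an older one.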

Creating Special Data Elements

HLcreate

int32 HLcreate(int32 file_id, uint16 tag, uint16 ref, int32
block_length, int32 number_blocks)

     file_id       IN: Id of file
     tag           IN: tag of new data descriptor
     ref           IN: ref of new data descriptor
     block_length  IN: length of blocks to be used
     number_blocks IN: number of blocks to use per linked block
                       record

     Purpose:      Create a new linked block special data element.

     Returns:      Access id for special data element if
                   successful, otherwise FAIL (-1).

     Description:  Appending to existing elements has been a
                   problem in HDF in the past, as HDF objects were
                   required to be stored contiguously. When
                   appending, the HDF library forced the user to
                   delete the existing element and move it to the
                   end. With HDF 3.2 we have added the concept of
                   linked blocks, which allow unlimited appending
                   to existing elements without copying over
                   existing data.

                   Initially, a table is set up to accommodate
                   number_blocks linked blocks for this object. Each
                   block has size block_length bytes. If an
                   existing object is being promoted, block_length
                   does not have to be the same size as the
                   original element.

                   This routine can be used to either create an
                   object with the given tag ref as a linked block
                   element, or promote an existing element to be
                   stored with linked blocks. This routine will
                   return an active access id with write permission
                   to the linked block element.
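
To make the idea concrete, the following is a minimal in-memory
model of a linked block element. It is only a sketch: the real
library keeps the block table and the blocks in the HDF file, and
the types lb_record and lb_element and the functions lb_append and
lb_byte_at are hypothetical names introduced for this illustration.
It shows how a table of number_blocks blocks of block_length bytes
lets data be appended without moving what is already stored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One linked block record: a table of number_blocks block slots. */
typedef struct lb_record {
    unsigned char   **blocks;  /* slots stay NULL until first used */
    struct lb_record *next;    /* next record once this one is full */
} lb_record;

typedef struct {
    size_t     block_length;   /* bytes per block */
    size_t     number_blocks;  /* blocks per record */
    size_t     length;         /* bytes stored in the element */
    lb_record *first;
} lb_element;

static lb_record *lb_new_record(size_t number_blocks)
{
    lb_record *r = calloc(1, sizeof *r);

    if (r != NULL) {
        r->blocks = calloc(number_blocks, sizeof *r->blocks);
        if (r->blocks == NULL) { free(r); r = NULL; }
    }
    return r;
}

/* Append len bytes; existing blocks are never moved or copied. */
int lb_append(lb_element *e, const void *data, size_t len)
{
    const unsigned char *src = data;

    while (len > 0) {
        size_t blk  = e->length / e->block_length; /* block index */
        size_t off  = e->length % e->block_length; /* offset in it */
        size_t hops = blk / e->number_blocks;      /* record index */
        size_t slot = blk % e->number_blocks;      /* slot in record */
        size_t n    = e->block_length - off;       /* room in block */
        lb_record **rp = &e->first;

        for (;;) {                 /* find the record, adding as needed */
            if (*rp == NULL &&
                (*rp = lb_new_record(e->number_blocks)) == NULL)
                return -1;
            if (hops == 0)
                break;
            hops--;
            rp = &(*rp)->next;
        }
        if ((*rp)->blocks[slot] == NULL &&
            ((*rp)->blocks[slot] = malloc(e->block_length)) == NULL)
            return -1;
        if (n > len)
            n = len;
        memcpy((*rp)->blocks[slot] + off, src, n);
        e->length += n;
        src += n;
        len -= n;
    }
    return 0;
}

/* Return the byte at a logical position, or -1 if out of range. */
int lb_byte_at(const lb_element *e, size_t pos)
{
    const lb_record *r = e->first;
    size_t blk = pos / e->block_length;
    size_t hops;

    if (pos >= e->length)
        return -1;
    for (hops = blk / e->number_blocks; hops > 0; hops--)
        r = r->next;
    return r->blocks[blk % e->number_blocks][pos % e->block_length];
}
```

An element is started by zero-initializing the length and first
fields and choosing the two sizes. The sketch never frees its blocks,
which a real implementation would have to do.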

HXcreate

int32 HXcreate(int32 file_id, uint16 tag, uint16 ref, char
*extern_file_name)

     file_id       IN: file record id
     tag, ref      IN: tag/ref of the special data element to
                       create
     extern_file_name
                   IN: name of external file to use as data element

     Purpose:      Create a new external file special data element.

     Returns:      Access id for special data element if
                   successful, otherwise FAIL (-1).

     Description:  This routine is used to create a new element in
                   an external file or promote an existing element
                   to be in an external file. If an existing
                   element is to be promoted, it is deleted from
                   the original file and copied over into the new
                   external file.

                   Distributing a single object over multiple
                   external files is currently not supported. In
                   addition, it is not possible to place multiple
                   objects into the same external file. This
                   routine will return an active access id with
                   write permission to the external element.

Development Routines

HDgettagname

char *HDgettagname(uint16 tag)

     tag           IN: tag to look up

     Purpose:      Get a meaningful description of a tag.

     Returns:      A pointer to a string describing this tag or
                   NULL if the tag is unknown.

     Description:  To reduce the amount of duplicated code, this
                   routine can be used to map a tag to a character
                   string containing the name of the tag. If the
                   tag is unknown, NULL is returned, as programs
                   may have different ways of dealing with unknown
                   tags.

                   For formatting purposes, the string returned by
                   this routine is guaranteed to be 30 characters
                   or less.

HDgetspace

void *HDgetspace(uint32 qty)

     qty           IN: number of bytes to allocate

     Purpose:      Allocate space.

     Returns:      Pointer to space that was allocated.

     Description:  This routine is very platform-dependent. It uses
                   an appropriate allocation routine on the local
                   machine to get space.

HDfreespace

void *HDfreespace(void *ptr)

     ptr           IN: pointer to previously-allocated space to be
                       freed

     Purpose:      Free space.

     Returns:      NULL.

     Description:  It uses an appropriate routine on the local
                   machine to free space.

HDstrncpy

char *HDstrncpy(register char *dest,register char *source,int32
len)

     dest         OUT: pointer to area to copy string to
     source        IN: pointer to area to copy string from
     len           IN: maximum number of bytes to copy

     Purpose:      Copy a string with some maximum length.

     Returns:      Address of dest.

     Description:  This function creates a string in dest that is
                   at most `len' characters long. The `len'
                   characters include the NULL terminator, which
                   must be added for historical reasons. Hence, if
                   you have the string 'Foo\0' you must call this
                   copy function with len = 4.
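
The length convention can be illustrated with a short sketch. The
function name sketch_strncpy is hypothetical and the code is written
only from the description above, so it should not be read as the
library's source. It copies at most len - 1 characters and always
writes the terminator inside the len bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most len bytes into dest, counting the terminator in len. */
char *sketch_strncpy(char *dest, const char *source, size_t len)
{
    size_t i;

    if (len == 0)
        return dest;              /* no room, not even for '\0' */
    for (i = 0; i + 1 < len && source[i] != '\0'; i++)
        dest[i] = source[i];
    dest[i] = '\0';               /* always terminate within len bytes */
    return dest;
}
```

With this convention, copying "Foo" with len = 4 yields "Foo", while
len = 3 yields the truncated string "Fo".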

Error Reporting

HEprint

void HEprint(FILE *stream, int level)

     stream        IN: stream to print error messages on
     level         IN: level of the error stack to print

     Purpose:      Print out information on the error stack.

     Returns:      No return value.

     Description:  This routine will print out information on
                   reported errors. If level is zero all of the
                   errors currently on the error stack are printed.
                   Output of this function is sent to the file
                   pointed to by stream.

                   Information printed is: an ascii description of
                   the error, the reporting routine, its file name
                   and the line at which the error was reported.
                   In addition, if the programmer has supplied
                   extra information by means of HEreport, this
                   information is printed as well.

HEclear

void HEclear(void)

     Purpose:      Clear all information on reported errors off of
                   the error stack.

     Returns:      No return value.

     Description:  Clear all of the information off of the error
                   stack.

HERROR

void HERROR(int number)

     number        IN: error number

     Purpose:      Report an error.

     Returns:      No return value.

     Description:  HERROR can be used to report an error. Any
                   function which calls HERROR must have a variable
                   FUNC which points to a string containing the
                   name of the function.

                   HERROR is implemented as a macro.

HEreport

void HEreport(char *format, ... )

     format        IN: printf style format and arguments

     Purpose:      Provide extra information to the error reporting
                   routines.

     Returns:      No return value.

     Description:  This routine can be used to provide further
                   annotation to an error report. Only one such
                   annotation is remembered for each error report.
                   The arguments to this routine follow the style
                   of printf.

                   An example from hfile.c:

                   char *FUNC = "Hclose";
                   ...
                   if (file_rec->attach > 0) {
                         file_rec->refcount++;
                         HERROR(DFE_OPENAID);
                         HEreport("There are still %d active aids attached",
                                  file_rec->attach);
                         return FAIL;
                   }
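
How the four error routines fit together can be sketched with a
small stand-alone error stack. Everything below is hypothetical (the
sk_ names, the capacity of 10 and the 128-byte annotation buffer),
and treating a nonzero level as "print the newest level reports" is
an assumption, since the text above only defines level zero.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define SK_DEPTH    10    /* hypothetical stack capacity */
#define SK_DESC_LEN 128   /* hypothetical annotation size */

typedef struct {
    int         number;             /* error number */
    const char *function;           /* reporting routine (FUNC) */
    const char *file;               /* source file of the report */
    int         line;               /* source line of the report */
    char        desc[SK_DESC_LEN];  /* annotation, "" if none */
} sk_error;

sk_error sk_stack[SK_DEPTH];
int      sk_top = 0;

/* Push one report; reports beyond the capacity are dropped. */
void sk_push(int number, const char *function, const char *file,
             int line)
{
    sk_error *e;

    if (sk_top >= SK_DEPTH)
        return;
    e = &sk_stack[sk_top++];
    e->number = number;
    e->function = function;
    e->file = file;
    e->line = line;
    memset(e->desc, 0, sizeof e->desc);
}

/* Like HERROR, this macro expects a variable FUNC in the caller. */
#define SK_ERROR(number) sk_push((number), FUNC, __FILE__, __LINE__)

/* Annotate the latest report; a second call replaces the first. */
void sk_report(const char *format, ...)
{
    va_list ap;

    if (sk_top == 0)
        return;
    va_start(ap, format);
    vsnprintf(sk_stack[sk_top - 1].desc, SK_DESC_LEN, format, ap);
    va_end(ap);
}

/* Print the newest `level' reports, or all of them if level is 0. */
void sk_print(FILE *stream, int level)
{
    int i;
    int n = (level == 0 || level > sk_top) ? sk_top : level;

    for (i = sk_top - 1; i >= sk_top - n; i--) {
        fprintf(stream, "error %d in %s (%s:%d)\n",
                sk_stack[i].number, sk_stack[i].function,
                sk_stack[i].file, sk_stack[i].line);
        if (sk_stack[i].desc[0] != '\0')
            fprintf(stream, "    %s\n", sk_stack[i].desc);
    }
}

/* Discard every report on the stack. */
void sk_clear(void)
{
    sk_top = 0;
}
```

Keeping the annotation inside the error record is one simple way to
obtain the rule that only one annotation is remembered per report.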

Other

Hsync

int Hsync(int32 file_id)

     file_id       IN: id of the file to sync

     Purpose:      Synchronize on-disk HDF file with image in
                   memory.

     Returns:      Returns SUCCEED.

     Description:  This routine is currently vacuous as the on-disk
                   representation of an HDF file is always the same
                   as its in-memory representation. However, future
                   releases of the HDF library may employ buffering
                   schemes, so this might not always be the case.
                   Hsync will be provided to force the two
                   representations to be consistent.


Chapter 4      Sets and Groups

               Chapter Overview
               Sets
                   Types of Sets
                   Calling Interfaces for Sets
               Groups
                   Sample Groups
                   General Features of Groups
               Raster Image Sets
                   Raster Image Groups
                   Tags for Raster Image Sets
                   Compression of Raster Images
               Scientific Datasets
                   Required Tags
                   Optional Tags
               Vsets and Vdatas
               Chapter Appendix: Raster-8 Sets
                   Compatibility between Raster-8 and Raster Image Sets


Chapter Overview

This chapter describes raster image sets, scientific datasets and
Vsets, and explains the role of sets and groups in an HDF file. It
also discusses the programming interfaces available for the three
types of sets.

Sets

Sometimes tags are grouped into sets, where each set is designed
to serve a particular user requirement. For example, the raster
image set, which is described in the following sections, contains
several tags that are used for storing information about 8-bit
raster images.

Types of Sets

In the current implementation of HDF there are three kinds of sets:

* A raster image set contains a raster image, along with
  descriptive information about the image, such as its dimensions
  and (optionally) a color lookup table.

* A scientific data set contains a multidimensional array, along
  with descriptive information about the data.

* A Vset is a general grouping structure that can contain any kind
  of HDF object that a user wishes.

Each HDF set is defined in terms of a minimum collection of data
objects that must be present for the set to make sense when it is
used. For instance, every raster image set must contain at least
the following three data objects:

* an image dimension record, which gives the width and height of
  the corresponding image;

* raster image data, which consists of the pixel values that make
  up the image;

* a raster image group, which lists all of the members in the set.

In addition to the required objects, there are optional data
objects that may be included in a set. A raster image set, for
instance, often contains a palette, or color lookup table, which
gives the red, green, and blue values to be associated with each
pixel in the raster image data.

Calling Interfaces for Sets

NCSA provides calling interfaces for all the HDF sets that it
supports. The primary purpose of these calling interfaces is to
provide libraries of routines for reading and writing the data that
is associated with each set. The libraries currently supported at
NCSA are callable from either C or Fortran programs.

In addition to the libraries, a growing number of command-line
utility routines are available for working with sets. For example,
a utility called r8tohdf is an HDF command that converts one or
more raw raster images to HDF 8-bit raster image set format.

NCSA supports calling interfaces for the following machines: Cray
(UNICOS), Silicon Graphics (UNIX), Sun (UNIX), Macintosh (MacOS),
and IBM PC (MS-DOS). The calling interfaces that are currently
available are described in the manual NCSA HDF Calling Interfaces
and Utilities.

Groups

An HDF set is a collection of HDF data objects in a file. Unless
some mechanism is used to identify explicitly those objects that
belong to a set, there is often no way to tie them together. This
problem is solved in HDF by means of groups. A group is a data
object that explicitly identifies all of the data objects in a set.

Since a group is a type of data object, its structure is like that
of any other data object. A group data identifier (tag/ref) points
to a data element that consists of the collection of data
identifiers that make up the corresponding set. A group tag can be
defined for any set. For instance, raster image group (RIG) is the
group tag used to group members of raster image sets; RIG data
consists of a list of all data identifiers that belong to a
particular raster image set.

Groups provide a convenient mechanism for application programs to
locate all of the information that they need about a set.
Application programs that deal with RIGs, for instance, read all
of the elements in a RIG, using only those that they need for
their application and ignoring the others.
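
The "use what you need, ignore the rest" pattern might look like the
sketch below. It rests on several assumptions that are not stated in
this chapter: that each member of a group is stored as a 2-byte tag
followed by a 2-byte ref, that both are big-endian, and that the tag
numbers are 201 (IP8), 300 (ID) and 302 (RI). The names rig_members
and scan_rig are hypothetical. Check Appendix A before relying on
any of these values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag numbers assumed for illustration only. */
#define TAG_IP8 201u
#define TAG_ID  300u
#define TAG_RI  302u

typedef struct {
    uint16_t id_ref;   /* ref of the image dimension record, 0 if none */
    uint16_t ri_ref;   /* ref of the raster image data, 0 if none */
    uint16_t ip8_ref;  /* ref of the palette, 0 if none */
    size_t   ignored;  /* members this reader did not understand */
} rig_members;

/* Read a 16-bit value, assuming big-endian byte order. */
static uint16_t get_u16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walk the tag/ref list held in a group's data element.
   Returns 0 if the required ID and RI members were found. */
int scan_rig(const unsigned char *data, size_t length,
             rig_members *out)
{
    size_t i;

    if (length % 4 != 0)
        return -1;                /* not a whole number of tag/refs */
    out->id_ref = out->ri_ref = out->ip8_ref = 0;
    out->ignored = 0;
    for (i = 0; i < length; i += 4) {
        uint16_t tag = get_u16(data + i);
        uint16_t ref = get_u16(data + i + 2);

        switch (tag) {
        case TAG_ID:  out->id_ref  = ref; break;
        case TAG_RI:  out->ri_ref  = ref; break;
        case TAG_IP8: out->ip8_ref = ref; break;
        default:      out->ignored++;     break;  /* skip unknown */
        }
    }
    return (out->id_ref != 0 && out->ri_ref != 0) ? 0 : -1;
}
```

Unknown members are counted rather than rejected, so a group written
by newer software with extra members can still be read.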

Sample Groups

Suppose that the two images shown in Figure 1.5 are organized into
two sets with group tags. Since they are images, they may be stored
as RIG groups. Figure 4.1 illustrates the type of organization that
incorporates RIG groupings of these images.

Figure 4.1  Physical Organization of Sample RIG Grouping

Offset         Contents
   0      FH
   4      DDH       (10       0L)
  10      DD        (FID      1         130       4)
  22      DD        (FD       1         134       41)
  34      DD        (IP8      1         175       768)
  46      DD        (ID       1         943       4)
  58      DD        (RI       1         947       240000)
  70      DD        (ID       2         240947    4)
  82      DD        (RI       2         240951    240000)
  94      DD        (RIG      1         480951    12)
 106      DD        (RIG      2         480963    12)
 118      DD        (empty)
 130      "sw3"
 134      "solar wind simulation: third try. 8/8/88"
 175      <data for image palette>
 943      <data for 1st image dimension rec>: 400, 600
 947      <data for 1st raster image>
240947    <data for 2nd image dimension rec>: 400, 600
240951    <data for 2nd raster image>
480951    tag/refs for 1st RIG: IP8/1, ID/1, RI/1
480963    tag/refs for 2nd RIG: IP8/1, ID/2, RI/2

The structure depicted in Figure 4.1 reflects the grouping of
raster image sets. This file contains the same raster image
information as the file in Figure 1.5, but the information is
organized into two sets and groups. Note that there is only one
palette (IP8/1) and it is included in both groups.
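
The offsets in Figure 4.1 can be checked mechanically. The sketch
below copies the nine non-empty DDs from the figure and verifies
that each data element starts where the previous one ends. The sizes
passed to first_data_offset (4 bytes for FH, 6 for DDH, 12 per DD)
are read off the offsets in the figure rather than taken from a
format definition, and dd_entry and both function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *tag;     /* tag mnemonic as printed in the figure */
    unsigned    ref;
    long        offset;  /* byte offset of the data element */
    long        length;  /* byte length of the data element */
} dd_entry;

/* The nine non-empty DDs of the sample file in Figure 4.1. */
const dd_entry sample_dds[] = {
    { "FID", 1,    130L,      4L },
    { "FD",  1,    134L,     41L },
    { "IP8", 1,    175L,    768L },
    { "ID",  1,    943L,      4L },
    { "RI",  1,    947L, 240000L },
    { "ID",  2, 240947L,      4L },
    { "RI",  2, 240951L, 240000L },
    { "RIG", 1, 480951L,     12L },
    { "RIG", 2, 480963L,     12L }
};
const size_t sample_dd_count = sizeof sample_dds / sizeof sample_dds[0];

/* Return 1 when every element starts where the previous one ends. */
int dds_are_contiguous(const dd_entry *dd, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (dd[i].offset != dd[i - 1].offset + dd[i - 1].length)
            return 0;
    return 1;
}

/* First byte after the DD block: file header, DD header, n slots. */
long first_data_offset(long fh_size, long ddh_size, long dd_size,
                       long slots)
{
    return fh_size + ddh_size + dd_size * slots;
}
```

With ten DD slots the first data element lands at offset 130, and a
400 x 600 image at one byte per pixel accounts for the 240000-byte
RI elements.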

General Features of Groups

Figure 4.1 also illustrates a number of important general features
of groups:

* The contents of each set are consistent with one another. Since
  the palette (IP8) is designed for use with 8-bit images, the
  image must be an 8-bit image, rather than a 24-bit, 12-bit, or
  other image.

* An application program can easily process all of the images in
  the file by accessing the groups in the file. The non-RIG
  information contained in the file can be used or ignored,
  depending on the needs and capabilities of the application
  program.

* There is usually more than one way to group sets. For example,
  an extra copy of the image palette (IP8) could have been stored
  in the file, so that each grouping would have its own image
  palette. But in this instance that is not necessary because the
  same palette is to be used with both images. On the other hand,
  in this example there are two image dimension records (one per
  group), even though one would suffice.

* Group status does not alter the fundamental role of HDF objects.
  They are still accessible as individual data objects, despite the
  fact that they also belong to raster image sets. In a very real
  sense, the individual data elements are in the file, whether or
  not there are groups that contain them.

  RIGs provide an index showing what sets exist and what their
  members are. There is nothing to prevent the imposition of other
  groupings (indexes) that provide a different view of the same
  collection of data objects. In fact, HDF is designed to encourage
  the addition of alternate views, when appropriate.

Raster Image Sets

The raster image set (RIS) provides a framework for storing images
and any number of optional image descriptors. It provides for a
description of the image data layout, with the optional presence
of color look-up tables, aspect ratio, color correction, associated
matte or other overlay information, or any other data related to
the display of the image.

Raster Image Groups (RIGs)

Tying everything together is the raster image group (RIG), examples
of which were given earlier (Figure 4.1). A RIG contains a list of
data identifiers that point in turn to the data objects that
describe and make up the image.

The number of entries in a RIG is variable and the presence of most
of the description information is optional. Complex applications
can store data identifiers of image-modifying data, such as the
color table and aspect ratio, in the RIG along with the reference
to the image data itself. Simple applications can use simple
application level calls and ignore specialized video production or
film color correction parameters.

NCSA currently supports two calling interfaces, RIS8 and RIS24,
defined for the easy storage and retrieval of raster images using
RIGs. These interfaces are documented in the manual NCSA HDF
Calling Interfaces and Utilities.

Tags for Raster Image Sets

The tags presented in Table 4.1 must be fully supported by any
raster image set implementation.

Table 4.1  Tags for Raster Image Sets

Tag            Contents of Data Element

RIG            raster image group
ID             image dimension record
RI             raster image data

With full support for the above tags, images can be stored and read
from HDF files at any bit depth, with several different component
ordering schemes. As illustrated in Fig. 4.1, the RIG tag points
to a collection of the tag/refs that make up the RIG. The ID data
element identifies the dimensions of the image, the number type of
the elements that make up its pixels, the number of elements per
pixel, the interlace scheme used and the compression scheme used,
if any. The RI data element contains the actual raster image data.

*** INSERT FIGURE HERE ***

In addition to the required tags that define an image dataset, the
tags listed in Table 4.2 define color properties and other image
features. These tags are described fully in Appendix A.

Table 4.2  Additional Tags for Raster Image Sets

Tag            Contents of Data Element
XYP            XY position of image
LD             look-up table dimension record
LUT            color look-up table for non-true-color images
MD             matte channel dimension record
MA             matte channel data
CCN            color correction factors
CFM            color format designation
AR             aspect ratio
MTO            machine-type override

Fig. 4.2 illustrates the storage of a RIS that contains an image
palette (IP8), in addition to the required tags.

*** INSERT FIGURE HERE ***

Compression of Raster Images

Tags for two types of compression have been defined for raster
images. They are run-length encoding (RLE) and IMCOMP aerial
averaging (IMC). Others may be added at any time. Each encoding tag
is documented under its specific tag type (see Appendix A). Support
for RIG and RI does not require that all of the compression tag
types be supported. If you find an unknown compression type,
provide a suitable error message to the user.
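
A reader might handle this requirement roughly as follows. The tag
numbers 11 (RLE) and 12 (IMC) are assumptions made for this sketch
and should be checked against Appendix A, and decoder_for is a
hypothetical name. The point is only that an unrecognized
compression tag produces a message and a failure result rather than
an attempt to display undecoded data.

```c
#include <assert.h>
#include <stdio.h>

/* Compression tag numbers assumed for illustration only. */
#define TAG_RLE 11u
#define TAG_IMC 12u

/* Return a decoder name for a supported compression tag. For any
   other tag, report the problem on `errs' and return NULL. */
const char *decoder_for(unsigned tag, FILE *errs)
{
    switch (tag) {
    case TAG_RLE:
        return "run-length encoding";
    case TAG_IMC:
        return "IMCOMP";
    default:
        fprintf(errs, "unsupported raster compression tag %u\n", tag);
        return NULL;
    }
}
```

An implementation that supports only one of the two schemes would
simply drop the other case and let it fall through to the error
path.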

Scientific Datasets

The scientific dataset (SDS) provides a framework for storing
multidimensional arrays of data, together with descriptive
information about the data. Current specifications support the
following types of numbers in SDS arrays.

* 8-bit, 16-bit and 32-bit signed and unsigned integers
* 32-bit and 64-bit floating point numbers

SDS numbers can be stored either as IEEE Standard integers or
floats or in the format used by the machine from which they were
written ("native mode").

Rank and dimension sizes may vary. A user interface exists for
storing and retrieving SDS. See the NCSA HDF manual for details.

Internal structures

For reasons having to do with backward compatibility, the group
structure that HDF uses for SDS is complicated. HDF 3.1 and
previous versions only supported 32-bit IEEE floating-point numbers
and Cray floating point numbers in scientific data sets. HDF 3.2
and later releases support 8-bit, 16-bit, and 32-bit signed and
unsigned integers, and 32-bit and 64-bit floating-point numbers.
They also allow data sets to be written to HDF files in the local
machine format ("native mode"). Furthermore, it is anticipated that
later versions of HDF will support new number types and other
variations in the physical storage of scientific data, such as
compressed data.

The internal structure used to store SDS in HDF 3.1 and earlier
versions was not adequate to support the anticipated future changes
to SDS. A new structure had to be developed. At the same time, it
was important to try to retain compatibility with earlier versions
of the HDF library. Earlier versions of the library should be able
to read SDS written by HDF 3.2, if the SDS is "understandable" by
that earlier software, i.e. if the number type of the data is
32-bit IEEE floating point or Cray floating point. Likewise, new
libraries (HDF 3.2 and beyond) should be able to recognize SDS
written by earlier versions of the library.

This compatibility is achieved by examining every SDS that is
written to an HDF file. If the SDS is compatible with older
libraries, it is written to the file using the old structure used
to represent SDS, as well as the new structure. If it is not
compatible with older libraries, only the newer structure is used.

The old structure for storing SDS is called SDG ("scientific data
group"). The newer structure is called NDG ("numeric data group").
Hence, SDS user interfaces in HDF 3.2 and beyond handle three types
of numerical data groups:

1. SDG: created by old libraries and containing floating-point
   data.

2. NDG: created by the new library and containing
   non-floating-point data. This data group should not be
   recognized by old libraries.

3. SDG-like NDG: created by the new library and containing IEEE
   32-bit floating-point data only. The old libraries should be
   able to recognize and interpret this kind of numerical data
   group correctly.
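
The decision that a writing library has to make can be summarized in
a few lines. This sketch follows item 3 above and treats only IEEE
32-bit floating-point data as compatible with old libraries; the
enumerations and the function groups_to_write are hypothetical and
do not correspond to identifiers in the HDF library.

```c
#include <assert.h>

/* Hypothetical number types and storage formats for the sketch. */
typedef enum {
    NT_INT8, NT_UINT8, NT_INT16, NT_UINT16,
    NT_INT32, NT_UINT32, NT_FLOAT32, NT_FLOAT64
} sk_number_type;

typedef enum { FMT_IEEE, FMT_NATIVE } sk_format;

#define WRITE_NDG 0x1u   /* new structure */
#define WRITE_SDG 0x2u   /* old structure, for pre-3.2 readers */

/* Decide which group structures to write for one SDS. */
unsigned groups_to_write(sk_number_type type, sk_format format)
{
    unsigned groups = WRITE_NDG;   /* the new structure is always used */

    if (type == NT_FLOAT32 && format == FMT_IEEE)
        groups |= WRITE_SDG;       /* also readable by old libraries */
    return groups;
}
```

A result with both bits set corresponds to the SDG-like NDG case;
a result with only WRITE_NDG set corresponds to the plain NDG case.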


In the following sections, we describe the SDG and NDG grouping
structures.

SDG structure

Scientific datasets represented internally by the SDG tag must
always contain at least the data objects listed in Table 4.3.

Table 4.3  Required Tags for SDG

Tag            Contents of Data Element

SDG            scientific data group

SDD            scientific data dimension record for array-
               stored data. It includes the rank (number of
               dimensions), the size of each dimension, the
               tag/ref's representing the number types of the
               array-stored data and of each dimension.

               In the case of SDG, the number types are all
               32-bit IEEE floating-point values.

SD             scientific data

The data objects presented in Table 4.4 are optional.
NCSA's SDS user interface supports these objects.

Table 4.4  Optional Tags for SDG

Tag            Contents of Data Element

SDS            scales along the different dimensions to be
               used when interpreting or displaying the data
               (must be of type float32).

SDL            labels for all dimensions and for the data.
               Each of the dimension labels can be interpreted
               as an independent variable, and the data label
               as the dependent variable.

SDU            units for all dimensions and for the data.

SDF            format specifications to be used when
               displaying values of the data.

SDM            maximum and minimum values of the data (must be
               of type float32).

SDC            coordinate system to be used when interpreting
               or displaying the data.

As illustrated in Fig. 4.3, the SDG tag points to a collection of
the tag/refs that make up the SDG.

*** INSERT FIGURE HERE  ***


NDG structure

SDS represented internally by the NDG tag must always contain at
least the data objects listed in Table 4.5.

Table 4.5  Required Tags for NDG

Tag            Contents of Data Element

NDG            Numerical data group

SDD            Scientific data dimension record for array-
               stored data. It includes the rank (number of
               dimensions), the size of each dimension, the
               tag/ref's representing the number types of the
               array-stored data and of each dimension.

               In HDF 3.2, the number types of dimension
               scales are forced to be the same as the array-
               stored data, but in later implementations each
               dimension scale will be allowed its own type.

SD             Scientific data.

NT             Number type of the data set. Default of NT is
               the value most recently set by DFSDsetNT(). If
               no DFSDsetNT() was called previously, the
               default will be set as floating-point.

The data objects presented in Table 4.6 are optional. NCSA's SDS
user interface in HDF 3.2 and later versions supports these
objects. Other optional objects can be added at any time.

Table 4.6  Optional Tags for NDG, HDF 3.2.

Tag            Contents of Data Element

SDS            scales along the different dimensions to be
               used when interpreting or displaying the data.

SDL            labels for all dimensions and for the data.
               Each of the dimension labels can be interpreted
               as an independent variable, and the data label
               as the dependent variable.

SDU            units for all dimensions and for the data.

SDF            format specifications to be used when
               displaying values of the data.

SDM            maximum and minimum values of the data.

SDC            coordinate system to be used when interpreting
               or displaying the data.

As illustrated in Fig. 4.4, the NDG is identical to the SDG, except
that the NDG tag is different. This ensures that older (pre-HDF
3.2) software cannot recognize this form of SDS.

*** INSERT FIGURE HERE ***

SDG-like NDG structure

An SDS written by HDF 3.2 or later that is compatible with earlier
SDS is represented internally by both an SDG and an NDG. Table 4.7
lists the objects that this group must always contain.

Table 4.7  Required Tags for NDG structure that is compatible with
SDG structure

Tag            Contents of Data Element

NDG            Numerical data group

SDG            Scientific data group

SDLNK          The NDG and SDG linked to the scientific data
               set in this group.

SDD            Scientific data dimension record for array-
               stored data. It includes the rank (number of
               dimensions), the size of each dimension, the
               tag/ref's representing the number types of
               the array-stored data and of each dimension.

               In an SDG-like NDG the number types are all
               32-bit IEEE floating-point values.

SD             Scientific data

*** INSERT FIGURE HERE ***


Compatibility with future NDG structures

It is likely that future versions of SDS will support optional
features that are not supported by the current version. These
features fall into two general categories:

* Optional-compatible features: optional features that are
  compatible with older versions of HDF even though they may not
  be supported by older versions of HDF.

  For example, suppose a new attribute, such as a time stamp, is
  added to SDS. Such an attribute would not be "understood" by
  older libraries, but it would not render the SDS data unreadable
  by the older libraries.

* Optional-incompatible features: optional new features that might
  not be compatible with older versions of HDF in the sense that
  they could render the data unreadable by older HDF libraries.

  For example, suppose compression is added to SDS. Since some
  older HDF libraries contain no compression routines, they would
  not be able to read the compressed data correctly.

The scheme that has been developed to address this problem involves
numbering conventions for tags. The following conventions are used:

* Required tags. These tags are described in Tables 4.3 and 4.5.
  All SDS must contain all of the tags in at least one of these
  sets.

* Optional-compatible tags. These tags can have any valid tag
  number except those in the other two categories.

* Optional-incompatible tags. A range of tags is defined for SDS
  features that might render the dataset unreadable by older
  versions of the library. This range has been specified as tag
  numbers 780-799.
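
A reader could apply these conventions as in the sketch below. The
range 780-799 comes from the text above; everything else is a
hypothetical illustration, including the function names and the
idea of passing the reader's list of understood tags as an array.
The rule implemented is: an unknown tag outside the range is
skipped, while an unknown tag inside the range makes the dataset
unreadable for this reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INCOMPAT_FIRST 780u   /* first optional-incompatible tag */
#define INCOMPAT_LAST  799u   /* last optional-incompatible tag */

/* Return 1 if the tag lies in the optional-incompatible range. */
int is_optional_incompatible(uint16_t tag)
{
    return tag >= INCOMPAT_FIRST && tag <= INCOMPAT_LAST;
}

/* Return 1 if this reader understands the tag. */
static int is_known(uint16_t tag, const uint16_t *known, size_t n_known)
{
    size_t i;

    for (i = 0; i < n_known; i++)
        if (known[i] == tag)
            return 1;
    return 0;
}

/* Return 1 if a reader that understands `known' may read an SDS
   whose group lists the tags in `members', 0 otherwise. */
int sds_is_readable(const uint16_t *members, size_t n_members,
                    const uint16_t *known, size_t n_known)
{
    size_t i;

    for (i = 0; i < n_members; i++) {
        if (is_known(members[i], known, n_known))
            continue;              /* understood: use it */
        if (is_optional_incompatible(members[i]))
            return 0;              /* unknown and unsafe: give up */
        /* unknown but compatible: ignore this member */
    }
    return 1;
}
```

Because the test depends only on the tag number, an old reader can
make the right decision about features that did not exist when it
was written.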

Vsets and Vdatas

An HDF Vset is a logical grouping of HDF data objects within an
HDF file. Data organization within the file resembles the UNIX file
system in that it is basically hierarchical in structure and also
allows cross-linking of data objects. Unlike Scientific Data Sets
and Raster Image Sets, Vsets have no prespecified content or
structure. Users can use them to create structural relationships
among HDF objects according to their needs. Figure 4.6 illustrates
a Vset.

*** INSERT FIGURE HERE ***

A Vset is represented by a vgroup, an HDF object that contains
information about the members of the Vset. The vgroup tag is
VGDESCTAG. The VGDESCTAG record contains a list of the data
identifiers of its members, an optional user-specified name, an
optional user-specified class, and some fields that enable it to
be extended to contain more information. The VGDESCTAG is described
fully in Appendix A. A full treatment of Vsets can be found in the
manual "NCSA HDF Vset, Version 2.0".

An HDF object that is often used in connection with Vsets is the
vdata. A vdata is a table. The data in a vdata is organized into
fields. Each field is identified by a unique fieldname. The type
of each field may be any of the data types supported by the SDS
interface: 8-, 16-, and 32-bit integers (signed or unsigned), and
32- and 64-bit floats. Several fields of different types may exist
within a vdata. Appendix A contains full descriptions of the vdata
tags (VSDESCTAG and VSDATATAG). A full treatment of vdatas can be
found in the manual "NCSA HDF Vset, Version 2.0".

Chapter Appendix: The Raster-8 Set

The raster image set (RIS), as described above, is the set
currently supported by HDF for managing raster images. Before the
RIS was added to HDF, a simpler, less flexible set called the
raster-8 set was used for storing 8-bit raster images. This set is
no longer supported in the HDF software, although it may turn up
in some older HDF files. In fact, during the first three years that
RIS was used, the HDF software stored raster images in both RIS
and raster-8 sets.

Raster-8 Sets

The raster-8 set is a set of tags that provide the basic
information necessary to store 8-bit raster images in a data file
and display them accurately without prompting the user to supply
dimensions or color information. The raster-8 set consists of the
tags presented in Table 4.8.

Table 4.8  Tags for Raster-8 Sets

Tag            Contents of Data Element

RI8            eight-bit raster image data

CI8            eight-bit raster image data compressed with
               run-length encoding

II8            IMCOMP compressed image data

ID8            Image dimension record

IP8            Image palette data

If you develop software for processing raster-8 sets, it must
support RI8, ID8, and IP8. If you do not implement CI8 or II8, then
be sure to provide appropriate error indicators to higher layers
that might expect to find these tags.

Compatibility between Raster-8 and Raster Image Sets

In order to maintain backward compatibility with raster-8 sets,
the raster image set interface has stored tag/refs for both types
of sets in HDF raster image files. For example, if an image was
stored as part of a raster image set, there was one copy each of
the image dimension data, image data, and palette data stored, but
there were two sets of tag/refs pointing to each data element, one
from each set. The image data, for instance, was associated with
both tag RI8 and tag RI.
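
In terms of data descriptors, this means two DDs that carry
different tags but the same offset and length. The sketch below
illustrates the idea for the image data only. The tag numbers 202
(RI8) and 302 (RI) are assumptions to be checked against Appendix A,
using the same ref for both descriptors is also an assumption, and
sk_dd, describe_image_twice and same_element are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Tag numbers assumed for illustration only. */
#define TAG_RI8 202u
#define TAG_RI  302u

/* A simplified data descriptor: identifier plus location. */
typedef struct {
    uint16_t tag;
    uint16_t ref;
    int32_t  offset;   /* where the data element starts */
    int32_t  length;   /* how many bytes it occupies */
} sk_dd;

/* Build two descriptors that refer to one stored image. */
void describe_image_twice(uint16_t ref, int32_t offset, int32_t length,
                          sk_dd out[2])
{
    out[0].tag = TAG_RI8;          /* raster-8 view of the data */
    out[0].ref = ref;
    out[0].offset = offset;
    out[0].length = length;

    out[1] = out[0];               /* same bytes in the file ... */
    out[1].tag = TAG_RI;           /* ... seen as a raster image set */
}

/* Return 1 if two descriptors point at the same data element. */
int same_element(const sk_dd *a, const sk_dd *b)
{
    return a->offset == b->offset && a->length == b->length;
}
```

Only the descriptors are duplicated, so the cost of supporting both
sets is a few bytes per object rather than a second copy of the
image.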

NOTE: Although this policy is continued in the current release (HDF
3.2), future plans call for phasing out the use of the raster-8
structure. Therefore, future software should not expect to find
both raster-8 and RIS structures supporting 8-bit raster images.
Eventually, only RIS structures will be used.


Chapter 5      Annotations

               Chapter Overview
               Types of Annotations
               File Annotations
               Object Annotations
               Getting Reference Numbers for Object Annotations


Chapter Overview

This chapter introduces and describes HDF objects that can be used
to annotate HDF files and HDF objects.

Types of Annotations

It is often useful to associate information in text form with an
HDF file and its data contents, and to keep that information in the
same file that contains the data. HDF provides this capability in
the form of annotations. An HDF annotation is a sequence of ASCII
characters that is associated with one of three types of objects:
(1) the file itself, (2) the individual HDF data objects in the
file, or (3) the tags that identify the data elements. The current
annotation interface supports only the first two types of
annotation. This interface is described in detail in the manual
NCSA HDF Calling Interfaces and Utilities.

Annotations are optionally supplied by a creator or user of an HDF
file or data object. Annotations come in two forms: labels, which
normally consist of short strings of characters, and descriptions,
which can be long and complex bodies of text.

Table 5.1 shows the types of annotations currently defined for HDF
files and their tag names.

Table 5.1  HDF Annotation Tags

                           "Label"        "Description"

File Annotations             FID               FD
Object Annotations           DIL              DIA
Tag Annotations              TID               TD

File Annotations

Any HDF file can have labels (FID) and descriptions (FD) stored in
it. There are routines in the annotations interface specifically
designed for reading and writing file IDs and file descriptions.
Specifications for the tags FID and FD are given in Appendix A.

Object Annotations

The annotation of HDF data objects is complicated by the fact that
you have to uniquely identify the objects being annotated. Since
a data identifier (tag/ref) for a data object uniquely identifies
that object, the data object that a particular annotation refers
to can be identified by storing the object's tag and reference
number together with the annotation.

Note that an HDF annotation is itself a data object, so it has its
own DD. This DD has a tag and a ref. number, and it points to the
"data" that constitutes the annotation. The "data" that goes with
an annotation consists of three things: (1) the tag of the object
that it is an annotation for, (2) the ref of the object that it is
an annotation for, and (3) the annotation itself.

For example, suppose you have an HDF file that contains three
scientific datasets (SDS). Each SDS has its own DD consisting of
the SDS tag DFTAG_SDG and a unique reference number, as illustrated
in Figure 5.1.

*** INSERT FIGURE HERE ***

Suppose you wish to annotate the second SDS by storing the
following annotation with it in the file: "Data from black hole
experiment 8/18/87." This text would be stored in an HDF file as
an annotation, and it would have stored with it the tag DFTAG_SDG
and reference number 4. Figure 5.2 illustrates how the annotation
would look in the file.

*** INSERT FIGURE HERE ***
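
The layout of an object annotation's data element can be sketched
as follows. This is a minimal illustration, not the library's own
code. It assumes that the 16-bit fields are stored most significant
byte first, and that the SDS tag number is 700 (a value not listed
in this chapter); the function names are hypothetical.

```python
import struct

DFTAG_DIA = 105  # object description tag number, as listed for DFTAG_DIA
SDS_TAG = 700    # assumed tag number for DFTAG_SDG (not given in this chapter)


def pack_object_annotation(obj_tag, obj_ref, text):
    """Build the data element: 16-bit tag, 16-bit ref, then the text.

    Assumes big-endian integers; the text is not null terminated.
    """
    return struct.pack(">HH", obj_tag, obj_ref) + text.encode("ascii")


def unpack_object_annotation(data):
    """Split a data element back into (tag, ref, text)."""
    obj_tag, obj_ref = struct.unpack(">HH", data[:4])
    return obj_tag, obj_ref, data[4:].decode("ascii")


# Annotate the SDS with reference number 4, as in the example above.
note = "Data from black hole experiment 8/18/87."
element = pack_object_annotation(SDS_TAG, 4, note)
```

The DD for this element would carry the annotation's own tag
(DFTAG_DIA) and reference number; those are separate from the tag
and reference number stored inside the element.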

Getting Reference Numbers for Object Annotations

Note that in order to use annotation routines, you need to know
the tags and reference numbers of the objects you wish to annotate.

Special routines are available for obtaining the reference numbers
of certain tags, including tags for SDSs, Raster Image Sets,
palettes, and annotations. These are DFSDlastref, DFR8lastref,
DFPlastref, and DFANlastref. They return the most recent reference
number used in either reading or writing the corresponding data
object. Reference numbers for objects other than these can be
obtained with the routine Hfindnextref, a general purpose HDF
routine that searches through an HDF file for reference numbers
that go with a given tag. These routines are described and
illustrated in the manual "NCSA HDF Calling Interfaces and
Utilities."


Chapter 6      NCSA HDF Tags

               Chapter Overview
               The HDF Tag Space
               Physical Storage Methods
               Specifications of Supported Tags


Chapter Overview

This chapter addresses issues related to HDF tags and the data they
represent. The first section discusses some general information
about tags and their interpretation. The remainder of the chapter
contains a complete list of HDF tags that have been assigned by
NCSA as of version 3.2 of the library and a detailed discussion of
their specifications.

The HDF Tag Space

As discussed in the chapter entitled "The Basic Structure of HDF
Files," there are 16 bits allotted to an HDF tag number, providing
for 65535 possible tags ranging from 1 to 65535, with zero (0)
unused. This tag space is broken down into three ranges as shown
below.

           1--32767 reserved for NCSA-supported tags
       32768--64999 user-definable
       65000--65535 reserved for expansion of the format

No restrictions are placed on the user-definable tags, but it
should be noted that tags from this range cannot be guaranteed to
be unique across all user-developed HDF applications. The rest of
this chapter will be devoted to the NCSA-supported tags in the
range 1 to 32767.
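
The three ranges can be checked with a few comparisons. The sketch
below is illustrative only; the function name and the returned
strings are hypothetical.

```python
def classify_tag(tag):
    """Return the range of the HDF tag space that a tag number falls in."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("an HDF tag must fit in 16 bits")
    if tag == 0:
        return "unused"
    if tag <= 32767:
        return "NCSA-supported"
    if tag <= 64999:
        return "user-definable"
    return "reserved for expansion"
```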

Physical Storage Methods

In previous versions of HDF, each data element was required to
occupy one contiguous block of space in a single file. But,
beginning with HDF Version 3.2, a mechanism was added to support
different methods of physical storage of data elements. The new
mechanism is called the "extended tag."

Any of the NCSA standard tags can take advantage of the new
features of the extended tags. Extended tags are automatically
recognized by the library and interpreted according to a
description record. The description record is a complete data
element unto itself which identifies the type of extended element
and provides the relevant parameters for retrieval of that element.
Currently, there are two types of extended tags, both of which
offer alternate methods of physical storage: linked block elements
and external elements.

Linked Block Elements

Linked block elements provide a convenient way of adding data to
a pre-existing element. They consist of a series of blocks of data
chained together in a linked list (similar to the DD list). In
general, the data blocks are of a uniform size. However, the first
block is considered a special case and is allowed to have a
different size from the rest of the blocks.

The description record for a linked block element begins with the
constant EXT_LINKED, which identifies the linked block storage
method. It also contains information about the organization of the
linked block element as a whole. Figure 6.1 shows a diagram of a
description record for a linked block element.

*** INSERT FIGURE HERE ***

<extended tag>           any NCSA standard tag converted to an
                         extended tag (16-bit integer)
<ref no>                 reference number (16-bit integer)
EXT_LINKED               constant identifying this as a linked
                         block description record (32-bit integer)
<length>                 length of entire element (32-bit integer)
<first len>              length of the first data block (32-bit
                         integer)
<blk len>                length of successive data blocks (32-bit
                         integer)
<num blk>                number of blocks per block table (32-bit
                         integer)
<link ref>               reference number of first block table
                         (16-bit integer)

The <link ref> field of the description record gives the reference
number of the first linked block table for the element. This table
is identified by the tag DFTAG_LINKED and contains <num blk>
entries. There may be any number of linked block tables chained
together to describe a linked block element. Figure 6.2 shows a
diagram of a linked block table.

*** INSERT FIGURE HERE ***

<link ref>               reference number for this block table
                         (16-bit integer)
<next ref>               reference number for next block table
                         (16-bit integer)
<blk ref n>              reference number for data block (16-bit
                         integer)

The <next ref> field contains the reference number of the next
linked block table. A value of zero (0) in this field indicates
that there are no additional linked block tables associated with
this linked block element.

The <blk ref n> fields of each linked block table contain reference
numbers for the individual data blocks that make up the data
portion of the linked block element. These data blocks are also
identified by the tag DFTAG_LINKED as shown in Figure 6.3. Although
it may seem ambiguous to use the same tag to refer to two different
objects, this ambiguity is alleviated by the context in which the
tags appear.

*** INSERT FIGURE HERE ***

<blk ref n>              reference number for this data block
                         (16-bit integer)
<data block>             block of actual data (size given by
                         <first len> or <blk len> from the
                         description record)

Linked block elements can be created using the function HLcreate(),
which is discussed in detail in the chapter "The NCSA HDF General
Purpose Interface."
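
The chaining of block tables and data blocks can be sketched as
follows. This is not how HLcreate() or the library is implemented;
it is an in-memory model that assumes the block tables and data
blocks have already been read into dictionaries keyed by reference
number. The function name, the dictionary layout, and the treatment
of a zero <blk ref n> as an unused slot are all assumptions made for
illustration.

```python
def read_linked_element(length, first_link_ref, block_tables, data_blocks):
    """Reassemble a linked block element from its tables and blocks.

    block_tables: {link_ref: (next_ref, [blk_ref, ...])}
    data_blocks:  {blk_ref: bytes}
    length:       <length> from the description record
    """
    out = bytearray()
    visited = set()
    link_ref = first_link_ref
    while link_ref != 0:              # <next ref> of 0 ends the chain
        if link_ref in visited:
            raise ValueError("block table chain loops back on itself")
        visited.add(link_ref)
        next_ref, blk_refs = block_tables[link_ref]
        for blk_ref in blk_refs:
            if blk_ref == 0:          # assumption: 0 marks an unused slot
                continue
            out += data_blocks[blk_ref]
        link_ref = next_ref
    # The last block may be only partly filled, so trim to <length>.
    return bytes(out[:length])


# Two block tables of two entries each; the first block is larger.
tables = {10: (11, [1, 2]), 11: (0, [3, 0])}
blocks = {1: b"abc", 2: b"de", 3: b"f\x00"}
element = read_linked_element(6, 10, tables, blocks)
```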

External Elements

External elements allow the data portion of an HDF element to
reside in a separate file. The potential of external data elements
is largely unexplored in the HDF context, although other file
formats (most notably CDF) have used external data elements
apparently to great advantage.

Because there has been little discussion of external elements
within the HDF user community, the structure of these elements is
still not completely defined. Figure 6.4 shows a diagram of the
proposed structure for an external element.

*** INSERT FIGURE HERE ***


<extended tag>           any NCSA standard tag converted to
                         an extended tag (16-bit integer)
<ref no>                 reference number (16-bit integer)
EXT_EXTERN               constant identifying this as an external
                         element description record (16-bit
                         integer)
<offset>                 location of the data within the external
                         file (32-bit integer)
<length>                 length in bytes of the data in the
                         external file (32-bit integer)
<filename>               non-null terminated ASCII string
                         containing the name of the external file
                         in which the data resides (any length)

The description record for an external element begins with the
constant EXT_EXTERN, which identifies the external storage method.
It also contains information about how to find the element.

External elements can be created using the function HXcreate(),
which is discussed in detail in the chapter "The NCSA HDF General
Purpose Interface."

Specifications of Supported Tags

The following pages contain the specifications of all the tags that
are officially supported as of HDF version 3.2. Each entry is to
be interpreted as follows:

* The word in capital letters on the left is the tag name.

* The three short lines at the beginning of each description
  uniquely identify the tag:

  The first line is the full name of the tag.

  The second line describes the type and (where possible) the
  amount of data in the corresponding data element. When the data
  element is a variable-sized data structure (such as text, a
  string, or a variable-sized array), the amount of data cannot be
  specified exactly. Where possible, a formula is given for
  estimating the amount of data. If the second line is "? bytes,"
  it means that neither the size nor the structure of the data
  element can be specified.

  The third line gives the tag number in decimal and (hexadecimal).

* Next is a diagram showing, as nearly as possible, the structure
  of the tag and its associated data.

* Finally, a full specification of the tag is presented, including
  a description of the data element and a discussion of its
  intended use.

These listings are grouped approximately according to the roles
that the tags play under the headings Utility Tags, Annotation
Tags, Raster Image Tags, and so forth. These groupings imply a
general context for the use of each tag, but are not meant to
restrict the use of the tags to any particular context.

Please note that the subsection under the heading Obsolete Tags
contains the specifications for tags that have fallen out of use
with the continuing development of HDF. These tags are still
recognized by the HDF library, but it is not recommended that users
write out new objects using these tags, since some of them may
eventually be dropped from the HDF specification.

Utility Tags

DFTAG_NULL

No data
0 bytes
1 (0x0001)

*** INSERT FIGURE HERE ***

<ref no>  reference number (16-bit integer; always 0)

This tag is used for place holding and to fill empty portions of
the data description block. The length and offset fields (not
shown) of a NULL DD must be equal to zero.

DFTAG_VERSION

Library version number
12 bytes plus the length of a string
30 (0x001E)

*** INSERT FIGURE HERE ***

<ref no>  reference number (16-bit integer)
<major>   major version number (32-bit integer)
<minor>   minor version number (32-bit integer)
<release> release number (32-bit integer)
<string>  non-null terminated ASCII string (any length)

The data portion of this tag gives the complete version number and
a descriptive string for the latest version of the HDF library to
write to the file.
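
Reading this element amounts to taking three 32-bit integers
followed by the remaining bytes as text. The sketch below assumes
big-endian unsigned integers; the function name and the sample
string are hypothetical.

```python
import struct


def parse_version(data):
    """Split a DFTAG_VERSION element into (major, minor, release, string)."""
    if len(data) < 12:
        raise ValueError("a version element holds at least 12 bytes")
    major, minor, release = struct.unpack(">III", data[:12])
    # The string is not null terminated; it runs to the end of the element.
    return major, minor, release, data[12:].decode("ascii")


sample = struct.pack(">III", 3, 2, 0) + b"sample version string"
version = parse_version(sample)
```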

DFTAG_NT

Number type
4 bytes
106 (0x006A)

*** INSERT FIGURE HERE ***

<ref no>  reference number (16-bit integer)
<version> version number of NT information (8-bit integer)
<type>    unsigned int, signed int, unsigned char, char, float,
          double (8-bit code)
<width>   number of bits (assumed all significant) (8-bit code)
<class>   a generic value, with different interpretations depending
          on type: floating-point, integer, or character (8-bit
          code)

Some possible values that may be included in the <class> field for
each of the three types are listed in Table 6.1.

Table 6.1      Number Type Values

Type      Possible Values

floats    DFNTF_NONE
          DFNTF_IEEE
          DFNTF_VAX
          DFNTF_CRAY
          DFNTF_PC
          DFNTF_CONVEX

ints      DFNTI_MBO
          DFNTI_IBO
          DFNTI_VBO

chars     ASCII
          EBCDIC, BYTE

The number type flag is used by any other element in the file to
indicate specifically what a numeric value looks like. Other tag
types should contain a reference number pointer to a DFTAG_NT
instead of containing their own number type definitions.

The version field allows expansion of the number type information,
in case some future number types cannot be described using the
fields currently defined. Successive versions of the DFTAG_NT may
be substantially different from the current definition; however,
backward compatibility will be maintained. The current DFTAG_NT
version number is 1.
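
Because each of the four fields is a single byte, a DFTAG_NT
element can be unpacked without regard to byte order. The sketch
below is illustrative; the names are hypothetical, and the numeric
codes in the sample are placeholders rather than values defined by
this specification.

```python
import struct
from collections import namedtuple

# Field order follows the listing above: version, type, width, class.
NumberType = namedtuple("NumberType", "version type width klass")


def parse_number_type(data):
    """Unpack the four 8-bit fields of a DFTAG_NT data element."""
    if len(data) != 4:
        raise ValueError("a number type element is exactly 4 bytes")
    return NumberType(*struct.unpack("BBBB", data))


# Placeholder codes: version 1, some type code 5, 32 bits wide, class code 1.
nt = parse_number_type(bytes([1, 5, 32, 1]))
```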

DFTAG_MT

Machine type
0 bytes
107 (0x006B)

*** INSERT FIGURE HERE ***

<double>  specifies method of encoding double precision floating
          point (4-bit code)
<float>   specifies method of encoding single precision floating
          point (4-bit code)
<int>     specifies method of encoding integers (4-bit code)
<char>    specifies method of encoding characters (4-bit code)

The DFTAG_MT specifies that all unconstrained or partially
constrained values in this HDF file are of the default type for
that hardware. When the DFTAG_MT is set to VAX, for example, all
integers will be assumed to be in VAX byte order unless
specifically defined otherwise with a DFTAG_NT. Note that all of
the headers and many tags, the whole raster image set for example,
are defined with bit-wise precision and will not be overridden by
the DFTAG_MT setting.

For DFTAG_MT, the reference field itself is the encoding of the
DFTAG_MT information. The reference field is 16 bits, taken as four
groups of four bits, specifying the types for double, float, int
and char respectively. This allows 16 generic specifications for
each type.
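
Splitting the reference field into its four groups can be sketched
as follows. The sketch assumes that the groups run from the most
significant four bits (double) to the least significant four bits
(char); the text above gives the order of the groups but not their
bit positions, so treat that as an assumption. The function name is
hypothetical.

```python
def decode_machine_type(ref):
    """Split a 16-bit DFTAG_MT reference field into four 4-bit codes."""
    if not 0 <= ref <= 0xFFFF:
        raise ValueError("the reference field is 16 bits")
    return {
        "double": (ref >> 12) & 0xF,  # assumed: most significant group
        "float": (ref >> 8) & 0xF,
        "int": (ref >> 4) & 0xF,
        "char": ref & 0xF,            # assumed: least significant group
    }


codes = decode_machine_type(0x1234)
```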

To the user, these will be defined constants in the header file
hdf.h, specifying the proper descriptive numbers for Sun, VAX,
Cray, Convex, and other computer systems. If there is no DFTAG_MT
in a file, the application may assume that the data in the file has
been written on the local machine--assuming any portability
problems are taken care of by the user. For this reason, we
recommend that all HDF files contain a DFTAG_MT for maximum
portability.


Possible data encodings are shown in Table 6.2.

Table 6.2  Possible Machine Types

Type           Possible Encodings

double         IEEE64, VAX64, CRAY128
floats         IEEE32, VAX32, CRAY64
ints           VAX32, Intel16, Intel32, Motorola32, CRAY64
chars          ASCII, EBCDIC

New encodings can be added for each data type, as the need arises.

DFTAG_FID

File identifier
string
100 (0x0064)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)
<character string>  non-null terminated ASCII text (any length)

This tag points to a string which the user wants to associate with
this file. The string is not null terminated. The string is
intended to be a user-supplied title for the file.

DFTAG_FD

File description
text
101 (0x0065)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<text block>   non-null terminated ASCII text (any length)

This tag points to a block of text describing the overall file
contents. The text can be any length. The block is not null
terminated. The text is intended to be user-supplied comments about
the file.

DFTAG_TID

Tag identifier
string
102 (0x0066)

*** INSERT FIGURE HERE ***

<tag>               tag number to which this tag refers (16-bit
                    integer)
<character string>  non-null terminated ASCII text (any length)

The data for this tag is a string that identifies the functionality
of the tag indicated in the space normally used for the reference
number. For example, the tag identifier for DFTAG_TID might point
to data that reads "tag identifier."

Many tags are identified in the HDF specification, so it is usually
unnecessary to include their identifiers in the HDF file. But with
user-defined tags or special-purpose tags, the only way for a human
reader to diagnose what kind of data is stored in a file is to read
tag identifiers. Use tag descriptions to define even more detail
about your user-defined tags.

Note that with this tag you may make use of the user-defined tags
to check for consistency. Although two persons may use the same
user-defined tag, they probably will not use the same tag
identifier.

DFTAG_TD

Tag description
text
103 (0x0067)

*** INSERT FIGURE HERE ***

<tag>          tag number to which this tag refers (16-bit
               integer)
<text block>   non-null terminated ASCII text (any length)

The data for this tag is a text block which describes in relative
detail the functionality and format of the tag which is indicated
in the space normally occupied by the reference number. This tag
is mainly intended to be used with user-defined tags and provides
a medium for users to exchange files that include human-readable
descriptions of the data.

It is important to provide everything that a programmer might need
to know to read the data from your user-defined tag. At the
minimum, you should specify everything you would need to know in
order to retrieve your data at a later date if the original program
were lost.

DFTAG_DIL

Data identifier label
string
104 (0x0068)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)
<obj tag>           tag number of the data to which this label
                    applies (16-bit integer)
<obj ref no>        reference of the data to which this label
                    applies (16-bit integer)
<character string>  non-null terminated ASCII text (any length)

The data for this tag is a data identifier, made up of a tag and
reference number, followed by a string that the user wants to place
in the file. The purpose of this tag is to associate the string
with the data identifier as a label for whatever that data
identifier refers to in turn.

By including DFTAG_DILs, you can give a data object a label for
future reference. For example, DFTAG_DIL is often used to give
titles to images.

DFTAG_DIA

Data identifier annotation
text
105 (0x0069)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<obj tag>      tag number of the data to which this annotation
               applies (16-bit integer)
<obj ref no>   reference of the data to which this annotation
               applies (16-bit integer)
<text block>   non-null terminated ASCII text (any length)

The data for this tag is a data identifier, which is made up of a
tag and a reference number, followed by a text block that the user
wants to place in the file. Its purpose is to associate the text
block with the data identifier as an annotation for whatever that
data identifier points to in turn.

With DFTAG_DIA, any data object can have a lengthy, user-written
description of why that data is in the file. This can be used to
include user comments about images, datasets, source code, and so
forth.

Compression Tags

DFTAG_RLE

Run length encoded data
0 bytes
11 (0x000B)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag is used in the compression field of a DFTAG_ID and other
places to indicate that an image or section of data is encoded with
a run-length encoding scheme. The RLE method used is byte-wise.
Each run is preceded by a count byte. The low seven bits of the
count byte indicate the number of bytes (n). The high bit of the
count byte indicates whether the next byte should be replicated n
times (high bit=1), or whether the next n bytes should be included
as is (high bit=0).
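
A decoder for the scheme just described can be sketched in a few
lines. This is an illustration of the byte-wise rule stated above,
not the library's routine; the function name is hypothetical.

```python
def rle_decode(data):
    """Expand a byte string encoded with the run-length scheme above."""
    out = bytearray()
    i = 0
    while i < len(data):
        count = data[i]
        n = count & 0x7F              # low seven bits: number of bytes
        i += 1
        if count & 0x80:              # high bit set: replicate next byte
            if i >= len(data):
                raise ValueError("run is missing its data byte")
            out += bytes([data[i]]) * n
            i += 1
        else:                         # high bit clear: copy next n bytes
            if i + n > len(data):
                raise ValueError("literal run is truncated")
            out += data[i:i + n]
            i += n
    return bytes(out)


# 0x83 = replicate the next byte 3 times; 0x02 = copy the next 2 bytes.
decoded = rle_decode(bytes([0x83, 0x41, 0x02, 0x42, 0x43]))
```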

See also: DFTAG_ID (General Raster Image Tags)
          DFTAG_NDG (Scientific Dataset Tags)

DFTAG_IMC

IMCOMP compressed data
0 bytes
12 (0x000C)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag is used in the ID compression field and other places to
indicate that an image or section of data is encoded with an IMCOMP
encoding scheme. This scheme is a 4:1 areal averaging method which
is easy to decompress. It counts color frequencies in 4x4 squares
to optimize color sampling.

See also: DFTAG_ID (General Raster Image Tags)
          DFTAG_NDG (Scientific Dataset Tags)

DFTAG_JPEG

24-bit JPEG compression information
? bytes
13 (0x000D)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag points to header information for 24-bit JPEG compressed
images. The data in this tag is identical to the data stored in a
JFIF (JPEG File Interchange Format) file up to the Start-of-Frame
parameter (see the JFIF format document for further details). The
Start-of-Frame parameter and all further data for the JPEG image
are stored in the associated DFTAG_CI data element, which is the
companion to the DFTAG_JPEG element.

DFTAG_GREYJPEG

8-bit JPEG compression information
? bytes
14 (0x000E)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag points to header information for 8-bit JPEG compressed
images. The data in this tag is identical to the data stored in a
JFIF (JPEG File Interchange Format) file up to the Start-of-Frame
parameter (see the JFIF format document for further details). The
Start-of-Frame parameter and all further data for the JPEG image
are stored in the associated DFTAG_CI data element, which is the
companion to the DFTAG_GREYJPEG element.

General Raster Image Tags

DFTAG_RIG

Raster image group
n*4 bytes (where n is the number of data objects in the group.)
306 (0x0132)

*** INSERT FIGURE HERE ***

<ref no>        reference number (16-bit integer)
<tag n>         tag number for nth member of the group (16-bit
                integer)
<ref n>         reference number for nth member of the group
                (16-bit integer)

The raster image group (RIG) data is a list of data identifiers
(tag/ref) that describe a raster image. All of the members of the
group are required in order to display the image correctly.
Application programs that deal with RIGs should read all the
elements of a RIG and process those identifiers which they can
display correctly. Even if an application cannot process all of
the tags, the tags that it can process will be usable.

Tag types that may appear in a RIG are listed in Table 6.3.

Table 6.3  Possible Tag Types in a RIG

Tag            Description

DFTAG_ID       image dimension
DFTAG_RI       raster image
DFTAG_XYP      X-Y position
DFTAG_LD       LUT dimension
DFTAG_LUT      color lookup table
DFTAG_MD       matte channel dimension
DFTAG_MA       matte channel
DFTAG_CCN      color correction
DFTAG_CFM      color format
DFTAG_AR       aspect ratio

Example

ID, RI, LD, LUT

An image dimension record, the raster image, an LUT dimension and
the LUT go together. The application reads the image dimensions,
then reads the image with those dimensions. It also reads the
lookup table according to its dimensions and displays the
corresponding image.
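
Since a RIG is simply a list of tag/ref pairs, reading one can be
sketched as below. The sketch assumes 16-bit big-endian integers;
the function name and the reference numbers in the sample are
hypothetical, while the tag numbers are those given in this chapter.

```python
import struct

DFTAG_ID, DFTAG_LUT, DFTAG_RI, DFTAG_LD = 300, 301, 302, 307


def parse_group(data):
    """Return the (tag, ref) pairs stored in a group's data element."""
    if len(data) % 4 != 0:
        raise ValueError("group data must be a whole number of 4-byte pairs")
    return [struct.unpack_from(">HH", data, offset)
            for offset in range(0, len(data), 4)]


# The ID, RI, LD, LUT example, with made-up reference numbers.
members = [(DFTAG_ID, 2), (DFTAG_RI, 2), (DFTAG_LD, 3), (DFTAG_LUT, 3)]
rig_data = b"".join(struct.pack(">HH", tag, ref) for tag, ref in members)
parsed = parse_group(rig_data)
```

An application would then look up each pair in the DD list and skip
any tag it does not know how to process.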

DFTAG_ID,           DFTAG_LD,           DFTAG_MD

Image dimension     LUT dimension       Matte dimension
20 bytes            20 bytes            20 bytes
300 (0x012C)        307 (0x0133)        308 (0x0134)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<x dim>        length of x (horizontal) dimension (32-bit integer)
<y dim>        length of y (vertical) dimension (32-bit integer)
<NT ref>       reference number of number type information for
               associated object
<elements>     number of elements that comprise one entry (16-bit
               integer)
<interlace>    defines type of interlacing used (16-bit integer)
<comp tag>     tag which tells the type of compression used and any
               associated parameters (16-bit integer)
<comp ref>     reference number of compression tag (16-bit integer)

The three dimension records have exactly the same format. They
define the dimensions of the 2D array to which they refer. The
diagram above pictures a DFTAG_ID for illustration. A DFTAG_ID
specifies the dimensions of a DFTAG_RI, DFTAG_LD specifies the
dimensions of a DFTAG_LUT, and DFTAG_MD specifies the dimensions
of a DFTAG_MA.

For example, a 512x256 row-wise 24-bit raster image with each pixel
stored as RGB bytes would have the following values:

<x dim>        512
<y dim>        256
<NT ref>       reference to a DFTAG_NT describing UINT8
<elements>     3 (3 elements per pixel: e.g., R,G and B)
<interlace>    0 (RGB values not separated)
<comp tag>     0 (no compression is used)

DFTAG_RI

Raster image
xdim*ydim*elements*NTsize bytes (xdim, ydim, elements, and NTsize
are given by the corresponding DFTAG_ID)

302 (0x012E)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)

This tag points to raster image data. It is stored in row-major
order and must be interpreted as specified in a DFTAG_ID:

<interlace>=0  means the components of each pixel are together.
<interlace>=1  means color elements are grouped by scan lines.
<interlace>=2  means color elements are grouped by planes.
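
One way to read the three interlace values is as three different
index calculations over the same row-major data. The sketch below
is an interpretation of the descriptions above, not a statement of
the library's internals; the function names are hypothetical.

```python
def image_size_bytes(xdim, ydim, elements, nt_size):
    """Size of a DFTAG_RI data element: xdim*ydim*elements*NTsize."""
    return xdim * ydim * elements * nt_size


def element_index(x, y, c, xdim, ydim, elements, interlace):
    """Index of component c of pixel (x, y), counted in elements.

    Multiply the result by NTsize to obtain a byte offset.
    """
    if interlace == 0:   # components of each pixel are together
        return (y * xdim + x) * elements + c
    if interlace == 1:   # components grouped by scan line
        return (y * elements + c) * xdim + x
    if interlace == 2:   # components grouped by plane
        return (c * ydim + y) * xdim + x
    raise ValueError("interlace must be 0, 1, or 2")


# The 512x256 24-bit RGB example: 3 elements of 1 byte each per pixel.
size = image_size_bytes(512, 256, 3, 1)
# Blue component (c=2) of the pixel at x=1, y=0 under each interlace.
offsets = [element_index(1, 0, 2, 512, 256, 3, mode) for mode in (0, 1, 2)]
```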

DFTAG_LUT

Lookup table
xdim*ydim*elements*NTsize bytes (xdim, ydim, elements, and NTsize
are given by the corresponding DFTAG_LD)

301 (0x012D)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<Pn m>         mth value of parameter n (size is given by the
               DFTAG_NT in the corresponding DFTAG_LD)

The DFTAG_LUT, sometimes called a palette, is used by many kinds
of hardware to assign colors to data values. When a raster image
consists of data values which are going to be interpreted through
hardware with a LUT capability, the DFTAG_LUT should be loaded
along with the image.

The most common lookup table is the RGB lookup table, which will
have X dimension = 256 and Y dimension = 1 with three elements per
entry, one each for red, green, and blue. The interlace will be
either 0, where the LUT values are given as RGB, RGB, RGB ..., or
1, where the LUT values are given as 256 reds, 256 greens, 256
blues.
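
Converting between the two LUT layouts is a matter of regrouping
the values. The sketch below assumes 8-bit values and a table of 256
entries with three elements each; the function names are
hypothetical.

```python
def lut_pixel_to_plane(lut, entries=256, elements=3):
    """Regroup RGB, RGB, RGB ... into all reds, all greens, all blues."""
    if len(lut) != entries * elements:
        raise ValueError("unexpected LUT size")
    return b"".join(lut[c::elements] for c in range(elements))


def lut_plane_to_pixel(lut, entries=256, elements=3):
    """Regroup all reds, all greens, all blues into RGB, RGB, RGB ..."""
    if len(lut) != entries * elements:
        raise ValueError("unexpected LUT size")
    planes = [lut[c * entries:(c + 1) * entries] for c in range(elements)]
    return bytes(value for entry in zip(*planes) for value in entry)


# A sample table: red ramps up, green ramps down, blue stays at zero.
interlace0 = bytes(v for i in range(256) for v in (i, 255 - i, 0))
interlace1 = lut_pixel_to_plane(interlace0)
```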

DFTAG_MA

Matte channel
xdim*ydim*elements*NTsize bytes (xdim, ydim, elements, and NTsize
are given by the corresponding DFTAG_MD)

309 (0x0135)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

The DFTAG_MA contains transparency data which can be used to
facilitate the overlaying of images. The data consist of a
two-dimensional array of unsigned 8-bit integers ranging from 0 to
255. Each point in a DFTAG_MA indicates the transparency of the
corresponding point in a raster image of the same dimensions. A
value of 0 indicates that the data at that point is to be
considered totally transparent, while a value of 255 indicates that
the data at that point is totally opaque. It is assumed that a
linear scale applies to the transparency values, but users may opt
to interpret the data in any way they wish.
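
Under the linear interpretation, overlaying one 8-bit image on
another can be sketched as a weighted average of the two. This is
only one possible reading of the matte values, as noted above; the
function name and the rounding rule are assumptions.

```python
def overlay(foreground, background, matte):
    """Blend two 8-bit images point by point using a matte channel.

    A matte value of 0 leaves the background showing (foreground is
    totally transparent); 255 shows only the foreground (opaque).
    """
    if not len(foreground) == len(background) == len(matte):
        raise ValueError("all three arrays must have the same size")
    return bytes(
        (f * m + b * (255 - m) + 127) // 255   # rounded linear mix
        for f, b, m in zip(foreground, background, matte)
    )


blended = overlay(bytes([200, 200, 200]), bytes([50, 50, 50]),
                  bytes([0, 128, 255]))
```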

DFTAG_CCN

Color correction
52 bytes (usually)
310 (0x0136)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<gamma>        gamma parameter (32-bit IEEE float)
<red x/y/z>    red x/y/z correction factors (32-bit IEEE floats)
<green x/y/z>  green x/y/z correction factors (32-bit IEEE floats)
<blue x/y/z>   blue x/y/z correction factors (32-bit IEEE floats)
<white x/y/z>  white x/y/z correction factors (32-bit IEEE floats)

Color correction specifies the Gamma correction for the image and
color primaries for the generation of the image.
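
The usual 52-byte element holds thirteen 32-bit floats in the order
listed above. The sketch below assumes big-endian IEEE floats; the
function name and the sample values are hypothetical.

```python
import struct


def parse_color_correction(data):
    """Unpack gamma and the four x/y/z triples from a DFTAG_CCN element."""
    if len(data) != 52:
        raise ValueError("expected the usual 52-byte element")
    v = struct.unpack(">13f", data)
    return {
        "gamma": v[0],
        "red": v[1:4],
        "green": v[4:7],
        "blue": v[7:10],
        "white": v[10:13],
    }


# Placeholder values 0.0 through 12.0, not real correction factors.
sample = struct.pack(">13f", *[float(i) for i in range(13)])
ccn = parse_color_correction(sample)
```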

DFTAG_CFM

Color format
string
311 (0x0137)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)
<character string>  non-null terminated ASCII string (any length)

The color format is a clue to how each element of each pixel in a
raster image can be interpreted. It is defined to be a string which
is in all caps, and is one of the values shown in Table 6.4.

Table 6.4  Color Format String Values

String         Description

VALUE          pseudo-color, or just a value associated with the pixel
RGB            red, green, blue model
XYZ            color-space model
HSV            hue, saturation, value model
HSI            hue, saturation, intensity
SPECTRAL       spectral sampling method

DFTAG_AR

Aspect ratio
4 bytes
312 (0x0138)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<ratio>        ratio of width to height (32-bit IEEE float)

The data for this tag is the visual aspect ratio for this image.
The image should be visually correct if displayed on a screen with
this aspect ratio. The data consists of one floating-point number
which represents width divided by height. An aspect ratio of 1.0
indicates a display with perfectly square pixels; 1.33 is a
standard aspect ratio used by many monitors.

Composite Image Tags

DFTAG_DRAW

Draw
n*4 bytes (where n is the number of data objects that comprise the
composite image.)
400 (0x0190)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<tag n>        tag number of the nth member of the draw list
               (16-bit integer)
<ref n>        reference number of the nth member of the draw list
               (16-bit integer)

The data for this tag is a list of data identifiers (tag/ref pairs)
which define a composite image. Each member of the DFTAG_DRAW data
should be displayed, in order, on the screen. This can be used to
indicate several RIGs which should be displayed simultaneously, or
even include vector overlays, like DFTAG_T14, which should be
placed on top of a RIG.

Some of the elements in a DRAW list may be instructions about how
images are to be composited (XOR, source put, anti-aliasing, etc.).
These are defined as individual tags.

DFTAG_XYP

XY position
8 bytes
500 (0x01F4)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<x>            x-coordinate (32-bit integer)
<y>            y-coordinate (32-bit integer)

A DFTAG_XYP is used in composites and other groups to indicate an
XY position on the screen. For this, (0,0) is the lower left, X is
the number of pixels to the right along the horizontal axis and Y
is the number of pixels on the vertical axis. The X and Y pixel
dimensions are given as two 32-bit integers.

For example, if DFTAG_XYP is present inside a DFTAG_RIG, the
DFTAG_XYP refers to the position of the lower left corner of the
raster image on the screen.
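
Many display systems number rows from the top of the screen rather
than from the bottom. The following is a minimal sketch, not part of
HDF, of how a display program might convert the y value of a DFTAG_XYP
to such a system. The function name and the assumption that rows are
counted from 0 at the top of the screen are ours.

```c
/*
 * Hypothetical helper: given a DFTAG_XYP y value (measured upward from
 * the lower left corner of the screen), return the row, counted from
 * the top of the screen starting at 0, on which the top row of the
 * image should be drawn.
 */
long xyp_top_row(long screen_height, long image_height, long xyp_y)
{
    return screen_height - xyp_y - image_height;
}
```

On a 480-row screen, for example, a 100-row image placed at y = 0
would have its top row on row 380.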

See also:      DFTAG_DRAW (this section)

Vector Image Tags

DFTAG_T14

Tektronix 4014
? bytes
602 (0x025A)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag points to a Tektronix 4014 data stream. The bytes in the
data field, when read and sent to a Tektronix 4014 terminal, will
display a vector image. Only the lower seven bits of each byte are
significant. There are no record markings or non-Tektronix codes
in the data.

DFTAG_T105

Tektronix 4105
? bytes
603 (0x025B)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag points to a Tektronix 4105 data stream. The bytes in the
data field, when read and sent to a Tektronix 4105 terminal, will
be displayed as a vector image. Only the lower seven bits of each
byte are significant. Some terminal emulators will not correctly
interpret every feature of the Tektronix 4105 terminal, so you may
wish to use only a subset of the possible Tektronix 4105 vector
commands.

Scientific Dataset Tags

DFTAG_NDG

Numeric data group
n*4 bytes (where n is the number of data objects in the group.)
720 (0x02D0)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<tag n>        tag number of nth member of the group (16-bit
               integer)
<ref n>        reference number of nth member of the group (16-bit
               integer)

The numeric data group (NDG) data is a list of data identifiers
(tag/ref pairs) that describe a scientific dataset. It supersedes
the old DFTAG_SDG, which has been obsoleted as of version 3.2 of
the HDF library. A more complete explanation of the relationship
between DFTAG_NDG and DFTAG_SDG can be found in the chapter
entitled "Sets and Groups."

All of the members of the group provide information for correctly
interpreting and displaying the data. Application programs that
deal with NDGs should read all of the elements of an NDG and
process those identifiers which they can use. Even if an
application cannot process all of the tags, the tags that it can
understand will be usable.

Tag types that may appear in a DFTAG_NDG are listed in Table 6.5.

Table 6.5  Possible Tag Types in an NDG

Tag            Description

DFTAG_SDD      scientific data dimension record (rank and dimensions)
DFTAG_SD       scientific data
DFTAG_SDS      scales
DFTAG_SDL      labels
DFTAG_SDU      units
DFTAG_SDF      formats
DFTAG_SDM      maximum and minimum values
DFTAG_SDC      coordinate system
DFTAG_CAL      calibration information
DFTAG_FV       fill value
DFTAG_LUT      color lookup table
DFTAG_LD       lookup table dimension record
DFTAG_SDLNK    link to old-style DFTAG_SDG (See Sets and Groups)

Example

DFTAG_SDD, DFTAG_SD, DFTAG_SDM
A dimension record, the scientific data, and the maximum and
minimum values of the data go together. The application reads the
rank and dimensions from the dimension record, then reads the data
array with those dimensions. If it needs maximum and minimum, it
also reads them.
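
As an illustration only, the sketch below decodes the n*4 bytes of a
group record into tag/ref pairs and keeps the members that a
hypothetical application understands, skipping the rest. The tag
numbers are those given in this chapter. The function and type names
are ours, and we assume that 16-bit integers are stored most
significant byte first; this section does not state the byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag numbers as listed in this chapter. */
#define DFTAG_SDD 701
#define DFTAG_SD  702
#define DFTAG_SDM 707

/* One data identifier (tag/ref pair) taken from a group record. */
typedef struct {
    uint16_t tag;
    uint16_t ref;
} data_id;

/* ASSUMPTION: 16-bit integers are stored most significant byte first. */
static uint16_t read_u16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: scan a group record of nbytes bytes and copy the
 * members whose tags this application understands into out. Members
 * with other tags are skipped. Returns the number of pairs stored,
 * which is at most max_out.
 */
size_t ndg_known_members(const unsigned char *data, size_t nbytes,
                         data_id *out, size_t max_out)
{
    size_t kept = 0;
    size_t i;

    for (i = 0; i + 4 <= nbytes && kept < max_out; i += 4) {
        uint16_t tag = read_u16(data + i);
        if (tag == DFTAG_SDD || tag == DFTAG_SD || tag == DFTAG_SDM) {
            out[kept].tag = tag;
            out[kept].ref = read_u16(data + i + 2);
            kept++;
        }
    }
    return kept;
}
```

An application written this way continues to work when a group
contains tags that it was not written to handle.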

See also: Sets and Groups

DFTAG_SDD

Scientific data dimension record
6 + 8*rank bytes
701 (0x02BD)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)
<rank>              number of dimensions (16-bit integer)
<dim n>             number of values along the nth dimension
                    (32-bit integer)
<data NT ref>       reference number of DFTAG_NT for data
                    (16-bit integer)
<scale NT ref n>    reference number for DFTAG_NT for the
                    scale for the nth dimension (16-bit
                    integer)

This record defines the rank and dimensions of the array in the
scientific dataset. For example, a DFTAG_SDD for a 500x600x3 array
of floating-point numbers would have the following values and
components.

Rank: 3
Dimensions: 500, 600, and 3.
One data NT
Three scale NTs

DFTAG_SD

Scientific data
NTsize*x*y*z* ... bytes (where NTsize is the size of the data NT
given by the corresponding DFTAG_SDD and x, y, z, etc. are the
dimension sizes)
702 (0x02BE)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

This tag points to an array of scientific data. The type of the
data may be specified by a DFTAG_NT included with the SDG. If
there is no DFTAG_NT, the type of the data is floating-point in
standard IEEE 32-bit format. The rank and dimensions must be stored
as specified in the corresponding DFTAG_SDD. The diagram above
shows a three-dimensional data array.
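
The length formulas given for DFTAG_SDD and DFTAG_SD can be written
out as follows. This is a sketch for checking the arithmetic; the
function names are ours and are not part of the HDF library.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of a DFTAG_SDD record: 6 + 8*rank. */
size_t sdd_length(uint16_t rank)
{
    return 6 + 8 * (size_t)rank;
}

/*
 * Length in bytes of a DFTAG_SD data element: the size of the data NT
 * multiplied by every dimension size.
 */
size_t sd_length(size_t nt_size, const uint32_t *dims, uint16_t rank)
{
    size_t nbytes = nt_size;
    uint16_t i;

    for (i = 0; i < rank; i++)
        nbytes *= dims[i];
    return nbytes;
}
```

For a 500x600x3 array of 32-bit floating-point numbers, the dimension
record occupies 6 + 8*3 = 30 bytes and the data occupies
4*500*600*3 = 3,600,000 bytes.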

DFTAG_SDS

Scientific data scales
rank + NTsize0*x + NTsize1*y + NTsize2*z + ... bytes (where rank is
the number of dimensions, x, y, z, etc. are the dimension sizes,
and NTsize# are the sizes of each scale NT from the corresponding
DFTAG_SDD.)
703 (0x02BF)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<is n>         tells whether a scale exists for the nth dimension
               (8-bit integer; 0 or 1)
<scale n>      list of scale values for the nth dimension (type is
               given by corresponding DFTAG_SDD)

This tag points to the scales for the dataset. The first n bytes
indicate whether there is a scale for the corresponding dimension
(1=yes, 0=no). This is followed by the scale values for each
dimension. The scale consists of a simple series of values, where
the number of values and their types are given by the corresponding
DFTAG_SDD.
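
The following sketch shows one way a reader might locate each scale
inside a DFTAG_SDS record. The function name is ours. We also assume
that scale values are stored only for dimensions whose flag byte is 1;
the length formula above corresponds to the case in which every
dimension has a scale.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the byte offset of each dimension's
 * scale within a DFTAG_SDS record. offsets[i] is set to 0 when
 * dimension i has no scale; a real scale never starts at offset 0
 * because the flag bytes come first. Returns the record length implied
 * by the flags.
 * ASSUMPTION: values are present only for dimensions whose flag is 1.
 */
size_t sds_scale_offsets(const unsigned char *flags, uint16_t rank,
                         const uint32_t *dims, const size_t *nt_sizes,
                         size_t *offsets)
{
    size_t pos = rank;          /* the rank flag bytes come first */
    uint16_t i;

    for (i = 0; i < rank; i++) {
        if (flags[i]) {
            offsets[i] = pos;
            pos += nt_sizes[i] * dims[i];
        } else {
            offsets[i] = 0;
        }
    }
    return pos;
}
```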

DFTAG_SDL

Scientific data labels
? bytes
704 (0x02C0)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<label n>      null terminated ASCII string (any length)

This tag points to a list of labels for the data and each dimension
of the dataset. Each label is a string terminated by a null byte
(0).
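
Labels, units, and formats are all stored as consecutive
null-terminated strings, so one routine can split any of these
records. The sketch below is illustrative only; the function name is
ours and is not an HDF library routine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a record that holds consecutive
 * null-terminated strings. Stores up to max_out pointers into the
 * record itself and returns how many strings were found. A trailing
 * string with no terminating null byte is ignored.
 */
size_t split_strings(const char *data, size_t nbytes,
                     const char **out, size_t max_out)
{
    size_t count = 0;
    size_t start = 0;

    while (start < nbytes && count < max_out) {
        const char *end = memchr(data + start, '\0', nbytes - start);
        if (end == NULL)
            break;
        out[count++] = data + start;
        start = (size_t)(end - data) + 1;
    }
    return count;
}
```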


DFTAG_SDU

Scientific data units
? bytes
705 (0x02C1)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<unit n>       null terminated ASCII string (any length)

This tag points to a list of strings specifying the units for the
data and each dimension of the dataset. Each unit's string is
terminated by a null byte (0).

DFTAG_SDF

Scientific data format
? bytes
706 (0x02C2)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<format n>     null terminated ASCII string (any length)

This tag points to a list of strings specifying an output format
for the data and each dimension of the dataset. Each format string
is terminated by a null byte (0).

DFTAG_SDM

Scientific data max/min
8 bytes
707 (0x02C3)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<max>          maximum value (type is given by the data NT in the
               corresponding DFTAG_SDD)
<min>          minimum value (type is given by the data NT in the
               corresponding DFTAG_SDD)

This record contains the maximum and minimum data values in the
dataset. The types of <max> and <min> are given by the data NT of
the corresponding DFTAG_SDD.

DFTAG_SDC

Scientific data coordinates
? bytes
708 (0x02C4)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<string>       null terminated ASCII string (any length)

This tag points to a string specifying the coordinate system for
the dataset. The string is terminated by a null byte.

DFTAG_SDLNK

Scientific dataset link
8 bytes
710 (0x02C6)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
DFTAG_NDG      NDG tag (16-bit integer)
<NDG ref>      reference number of NDG (16-bit integer)
DFTAG_SDG      SDG tag (16-bit integer)
<SDG ref>      reference number of SDG (16-bit integer)

The purpose of this tag is to link together an old-style DFTAG_SDG
and a DFTAG_NDG in cases where the NDG contains 32-bit floating
point data and is, therefore, equivalent to an old SDG. A complete
description of the use of this tag can be found in the chapter
entitled "Sets and Groups."

See also: Sets and Groups

DFTAG_CAL

Calibration information
36 bytes
731 (0x02DB)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<cal>          calibration factor (64-bit IEEE float)
<cal err>      error in calibration factor (64-bit IEEE float)
<off>          calibration offset (64-bit IEEE float)
<off err>      error in calibration offset (64-bit IEEE float)
<data type>    constant representing the effective data type of the
               calibrated data (32-bit integer)

This tag points to a calibration record for the associated
DFTAG_SD. The data can be calibrated by first multiplying by the
<cal> factor, then adding the <off> value. Also included in the
record are errors for the calibration factor and offset and a
constant indicating the effective data type of the calibrated data.
Possible values of <data type> are shown in Table 6.6.

Table 6.6  Possible calibrated data types

Data Type      Description

INT8           signed 8-bit integer
UINT8          unsigned 8-bit integer
INT16          signed 16-bit integer
UINT16         unsigned 16-bit integer
INT32          signed 32-bit integer
UINT32         unsigned 32-bit integer
FLOAT32        32-bit float
FLOAT64        64-bit float
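
The calibration rule described for DFTAG_CAL (multiply by <cal>, then
add <off>) can be sketched as follows. The structure and function
names are ours; the structure simply mirrors the five fields of the
record once they have been read into native types.

```c
#include <stdint.h>

/* In-memory copy of the fields of a DFTAG_CAL record (names are ours). */
typedef struct {
    double  cal;        /* calibration factor */
    double  cal_err;    /* error in calibration factor */
    double  off;        /* calibration offset */
    double  off_err;    /* error in calibration offset */
    int32_t data_type;  /* effective data type of the calibrated data */
} cal_record;

/* Calibrate one raw value: multiply by <cal>, then add <off>. */
double apply_calibration(const cal_record *c, double raw)
{
    return raw * c->cal + c->off;
}
```

With <cal> = 0.5 and <off> = 2.0, a raw value of 10 calibrates to 7.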

DFTAG_FV

Fill value
? bytes (size given by size of data NT in corresponding DFTAG_SDD)
732 (0x02DC)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<fill value>   value representing unset data in the corresponding
               DFTAG_SD (size given by size of data NT in
               corresponding DFTAG_SDD)

This tag points to a value which has been used to indicate unset
values in the associated DFTAG_SD. The number type of the value
(and, therefore, its size) is given in the corresponding DFTAG_SDD.

Vset Tags

DFTAG_VG

Vgroup
14 + 4*nelt + namelen + classlen bytes (where nelt, namelen, and
classlen are given below)
1965 (0x07AD)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<nelt>         number of elements in the vgroup (16-bit integer)
<tag n>        tag of the nth member of the vgroup (16-bit integer)
<ref n>        reference number of the nth member of the vgroup
               (16-bit integer)
<namelen>      length of the name field (16-bit integer)
<name>         non-null terminated ASCII string (length given by
               <namelen>)
<classlen>     length of the class field (16-bit integer)
<class>        non-null terminated ASCII string (length given by
               <classlen>)
<extag>        extension tag (16-bit integer)
<exref>        extension reference number (16-bit integer)
<version>      version number of DFTAG_VG information (16-bit
               integer)
<more>         unused (2 zero bytes)

The DFTAG_VG provides a general-purpose grouping structure which
can be used to impose a hierarchical structure on the tags in the
group. Any HDF tag may be incorporated into a vgroup (including
other DFTAG_VGs).

For more information about Vsets, see the chapter entitled "HDF
Vsets."
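
As a check on the length formula for DFTAG_VG, the sketch below adds
up the fixed fields (seven 16-bit values, or 14 bytes) and the
variable-length parts. The function name is ours.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Length in bytes of a DFTAG_VG record:
 * 14 + 4*nelt + namelen + classlen.
 */
size_t vg_length(uint16_t nelt, uint16_t namelen, uint16_t classlen)
{
    return 14 + 4 * (size_t)nelt + namelen + classlen;
}
```

A vgroup with 3 members, a 5-character name, and a 4-character class
therefore occupies 14 + 12 + 5 + 4 = 35 bytes.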

DFTAG_VH

Vdata description
22 + 10*nfields + sum(fldnmlen n) + namelen + classlen bytes (where
nfields, fldnmlen n, namelen, and classlen are given below)
1962 (0x07AA)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<interlace>    constant indicating interlace scheme used (16-bit
               integer)
<nvert>        number of entries in vdata (32-bit integer)
<ivsize>       size of one vdata entry (16-bit integer)
<nfields>      number of fields per entry in the vdata (16-bit
               integer)
<type n>       constant indicating the data type of the nth field
               of the vdata (16-bit integer)
<isize n>      size in bytes of the nth field of the vdata (16-bit
               integer)
<offset n>     offset of the nth field within the vdata (16-bit
               integer)
<order n>      ??? of the nth field of the vdata (16-bit integer)
<fldnmlen n>   length of the nth field name string (16-bit integer)
<fldnm n>      non-null terminated ASCII string (length given by
               corresponding <fldnmlen>)
<namelen>      length of the name field (16-bit integer)
<name>         non-null terminated ASCII string (length given by
               <namelen>)
<classlen>     length of the class field (16-bit integer)
<class>        non-null terminated ASCII string (length given by
               <classlen>)
<extag>        extension tag (16-bit integer)
<exref>        extension reference number (16-bit integer)
<version>      version number of DFTAG_VH information (16-bit
               integer)
<more>         unused (2 zero bytes)

DFTAG_VH provides all the information necessary to process a
DFTAG_VS.

For more information on Vsets, see the chapter entitled "HDF
Vsets."

See also: DFTAG_VS (this section)

DFTAG_VS

Vdata
nvert * sum(isize n) bytes (where nvert and isize n are given by the
corresponding DFTAG_VH)
1963 (0x07AB)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<vdata>        data block interpreted according to the
               corresponding DFTAG_VH (nvert * sum(isize n) bytes,
               where nvert and isize n are given by the
               corresponding DFTAG_VH)

DFTAG_VS contains a block of data which is to be interpreted
according to the information in the corresponding DFTAG_VH.

For more information on Vsets, see the chapter entitled "HDF
Vsets."
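
The length formulas for DFTAG_VH and DFTAG_VS both involve a sum over
the fields of the vdata. The sketch below spells out the two sums; the
function names are ours and are not HDF library routines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Length in bytes of a DFTAG_VH record:
 * 22 + 10*nfields + sum(fldnmlen n) + namelen + classlen.
 */
size_t vh_length(uint16_t nfields, const uint16_t *fldnmlen,
                 uint16_t namelen, uint16_t classlen)
{
    size_t nbytes = 22 + 10 * (size_t)nfields + namelen + classlen;
    uint16_t i;

    for (i = 0; i < nfields; i++)
        nbytes += fldnmlen[i];
    return nbytes;
}

/* Length in bytes of a DFTAG_VS data block: nvert * sum(isize n). */
size_t vs_length(uint32_t nvert, const uint16_t *isize, uint16_t nfields)
{
    size_t entry = 0;
    uint16_t i;

    for (i = 0; i < nfields; i++)
        entry += isize[i];
    return (size_t)nvert * entry;
}
```

A vdata of 100 entries whose three fields are 4, 4, and 2 bytes long
occupies 100 * 10 = 1000 bytes.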

See also: DFTAG_VH (this section)

Obsolete Tags

DFTAG_ID8

Image dimension-8
4 bytes
200 (0x00C8)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<x dim>        length of x dimension (16-bit integer)
<y dim>        length of y dimension (16-bit integer)

The data for this tag consists of two 16-bit integers representing
the width and height of an 8-bit raster image in bytes.

This tag has been superseded by DFTAG_ID.

DFTAG_IP8

Image palette-8
768 bytes
201 (0x00C9)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
table entries  256 triples of 8-bit integers.

The data for this tag can be thought of as a table of 256 entries,
each containing one value for red, green, and blue. The first
triple is palette entry 0 and the last is palette entry 255.

This tag has been superseded by DFTAG_LUT.

DFTAG_RI8

Raster image-8
xdim*ydim bytes (where xdim and ydim are the dimensions given by
the corresponding DFTAG_ID8.)
202 (0x00CA)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
image data     2-d array of 8-bit integers

The data for this tag is a row-wise representation of the
elementary 8-bit image data. The data is stored width-first (hence
row-wise) and is 8 bits per pixel. The first byte of data
represents the pixel in the upper-left hand corner of the image.

This tag has been superseded by DFTAG_RI.

DFTAG_CI8

Compressed image-8
? bytes
203 (0x00CB)

*** INSERT FIGURE HERE ***

<ref no>            reference number (16-bit integer)
<compressed image>  series of run-length encoded bytes

The data for this tag is a row-wise representation of the
elementary 8-bit image data. Each row is compressed using the
following run-length encoding, where n is the lower seven bits of
the byte. The high bit indicates whether the following n characters
will be reproduced exactly (high bit=0) or whether the following
character will be reproduced n times (high bit=1). Since DFTAG_CI8
and DFTAG_RI8 are basically interchangeable, it is suggested that
you not have a DFTAG_CI8 and a DFTAG_RI8 that have the same
reference number.
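
The run-length encoding described above can be decoded as in the
following sketch. It is not the HDF library's decoder; the function
name, the error convention, and the bounds checks are ours.

```c
#include <stddef.h>

/*
 * Hypothetical decoder for the run-length encoding described for
 * DFTAG_CI8. Returns the number of bytes written to out, or
 * (size_t)-1 if the input is truncated or out is too small.
 */
size_t ci8_decode(const unsigned char *in, size_t in_len,
                  unsigned char *out, size_t out_cap)
{
    size_t ip = 0;
    size_t op = 0;

    while (ip < in_len) {
        unsigned char ctl = in[ip++];
        size_t n = ctl & 0x7F;          /* lower seven bits: the count */
        size_t k;

        if (ctl & 0x80) {
            /* High bit = 1: repeat the next byte n times. */
            unsigned char value;
            if (ip >= in_len || n > out_cap - op)
                return (size_t)-1;
            value = in[ip++];
            for (k = 0; k < n; k++)
                out[op++] = value;
        } else {
            /* High bit = 0: copy the next n bytes exactly. */
            if (n > in_len - ip || n > out_cap - op)
                return (size_t)-1;
            for (k = 0; k < n; k++)
                out[op++] = in[ip++];
        }
    }
    return op;
}
```

For example, the six bytes 0x03 'a' 'b' 'c' 0x84 'z' decode to the
seven characters "abczzzz".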

This tag has been superseded by DFTAG_RLE.

DFTAG_II8

IMCOMP image-8
? bytes
204 (0x00CC)

*** INSERT FIGURE HERE ***

The data for this tag is a 4:1 compressed 8-bit image, using the
IMCOMP compression scheme.

This tag has been superseded by DFTAG_IMC.

DFTAG_SDG

Scientific data group
n*4 bytes (where n is the number of data objects in the group.)
700 (0x02BC)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)
<tag n>        tag number of nth member of the group (16-bit
               integer)
<ref n>        reference number of nth member of the group (16-bit
               integer)

The scientific data group (SDG) data is a list of data identifiers
(tag/ref pairs) that describe a scientific dataset. All of the
members of the group provide information for correctly interpreting
and displaying the data. Application programs that deal with SDGs
should read all of the elements of an SDG and process those
identifiers which they can use. Even if an application cannot process
all of the tags, the tags that it can understand will be usable.

Tag types that may appear in a DFTAG_SDG are listed in Table 6.7.

Table 6.7  Possible Tag Types in an SDG

Tag            Description

DFTAG_SDD      scientific data dimension record (rank and dimensions)
DFTAG_SD       scientific data
DFTAG_SDS      scales
DFTAG_SDL      labels
DFTAG_SDU      units
DFTAG_SDF      formats
DFTAG_SDM      maximum and minimum values
DFTAG_SDC      coordinate system
DFTAG_SDT      transposition (obsolete)
DFTAG_SDLNK    link to new DFTAG_NDG (see Sets and Groups)

Example

DFTAG_SDD, DFTAG_SD, DFTAG_SDM

A dimension record, the scientific data, and the maximum and
minimum values of the data go together. The application reads the
rank and dimensions from the dimension record, then reads the data
array with those dimensions. If it needs maximum and minimum, it
also reads them.

This tag has been superseded by DFTAG_NDG.

See also: Sets and Groups

DFTAG_SDT

Scientific data transpose
0 bytes
709 (0x02C5)

*** INSERT FIGURE HERE ***

<ref no>       reference number (16-bit integer)

The presence of this tag in a group indicates that the data pointed
to by the corresponding DFTAG_SD is in column-major order, instead
of the default row-major order. No data is associated with this
tag.

This tag will no longer be written by the HDF library, but if it
is encountered in an old file it will be interpreted as originally
intended.
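
The difference between the two orderings can be seen in how the
position of an element is computed. The sketch below is illustrative
only: the function name is ours, and we take row-major to mean that
the last index varies fastest and column-major to mean that the first
index varies fastest.

```c
#include <stddef.h>

/*
 * Hypothetical helper: offset, in elements, of the element with
 * indices idx[0..rank-1] in an array with dimensions dims[0..rank-1].
 * Row-major order (column_major == 0): the last index varies fastest.
 * Column-major order (column_major != 0): the first index varies
 * fastest.
 */
size_t element_offset(const size_t *idx, const size_t *dims,
                      size_t rank, int column_major)
{
    size_t off = 0;
    size_t i;

    if (column_major) {
        for (i = rank; i-- > 0; )
            off = off * dims[i] + idx[i];
    } else {
        for (i = 0; i < rank; i++)
            off = off * dims[i] + idx[i];
    }
    return off;
}
```

In a 2x3 array, element (1,0) is at offset 3 in row-major order and at
offset 1 in column-major order.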


Chapter 7 Making HDF Portable

     Chapter Overview
     The HDF Environment
          Machines Supported
          Language Standards
     Organization of Source Files
          Header Files
          Source Code Files
     Passing Strings Between FORTRAN and C
          Passing Strings from FORTRAN to C
          Passing Strings from C to FORTRAN
     Function Return Values between FORTRAN and C
          Differences in Acceptable Routine Names
          Case Sensitivity
          How HDF Deals with "All-Upper Case" Compilers
          Appended Underscore
          How HDF Specifies the Appended (and Prepended) Underscore
          Short Names vs. Long Names
          ANSI C vs. Old C
          Type Differences
          Size Differences
          Number Representation
          Byte-order and Structure Representations
          Access to Library Functions


Chapter Overview

The NCSA implementation of HDF is accessible to both C and FORTRAN
programs and is implemented on many different machines and several
operating systems. There are important differences between C and
FORTRAN, as well as between different implementations of each
language, especially FORTRAN. There are also important differences
between the different machines and operating systems that HDF
supports. This chapter describes many of these differences,
problems and issues associated with them, and methods employed in
the HDF source code to deal with them.


The HDF Environment

The list of machines and operating systems on which HDF is
implemented is steadily growing. For reasons that should soon be
clear, the number of platforms on which HDF is officially supported
is growing slowly. Every time a new platform is added to the list
of those that HDF supports, additional code must be written that
takes into account the way memory is organized, the way the
operating system works, the way numbers are represented, the way
the file system works, and the way FORTRAN and C work on that
system.


Machines Supported

As of this writing, the following platforms are supported by NCSA's
HDF group:

          Cray X-MP and Cray 2 (UNICOS)
          Sun Systems' Sun 3, Sun 386, and Sparcstation (Unix)
          Convex (Unix)
          Macintosh (MPW Shell)
          IBM PC (MS-DOS)
          Silicon Graphics (Unix)
          Vax (VMS)
          HP 9000 (HPUX)
          DecStation (Ultrix)
          IBM RT (Unix)

In addition to these platforms, HDF has been ported to many other
platforms for which support cannot currently be provided. These
include Alliant, Apollo (Domain), HP 3000, Stellar, Amiga,
Symbolics, NeXT, and IBM 3090 (MVS).


Language Standards

Unfortunately, not all compilers are the same. FORTRAN compilers
often differ in the ways they pass parameters, in the identifier
naming conventions they employ, and in the number types that they
support. Similarly, though generally not as drastically, compilers
differ in the number types that they support and in their adherence
to the ANSI C standard.

In order to keep these differences to a minimum, the primary
dialects used for the source code in the NCSA implementation of HDF
are FORTRAN 77, ANSI C, and "old style C"(1), hereafter referred to
as "old C". There are very few platforms whose C and FORTRAN compilers
do not adhere to at least one of these standards. When time and
resources permit, attempts are also made to support features or
variations in other dialects of C and FORTRAN, particularly on
those platforms that are important to NCSA users. Much of the
remainder of this Chapter speaks to these differences.

Follow these guidelines

To all future HDF developers, we cannot overstress the importance
of following the guidelines outlined in this Chapter. It may take
longer to write code, and it may be considerably more difficult to
adapt your coding style to that given here, but the long-term
benefits in terms of portability and maintenance costs are well
worth the effort.


Organization of Source Files

There are three types of files in the HDF source code directory:
header files, source code files, and a makefile. Header files and
source code files are organized by application area. All of the
functions that apply to a particular application area are stored
in three source files, and all definitions and declarations that
apply are stored in a corresponding header file. The makefile
describes the dependencies among the source and header files, and
also provides the commands required to compile the corresponding
libraries and utilities.


Header Files

There is one header file for each application area. The HDF Raster
Image Set interface, for example, has the header file dfr8.h. It
contains definitions and declarations that are unique to the
interface.

(1) "old style C" refers to the version of C described in the first
edition of The C Programming Language, by Brian Kernighan and
Dennis Ritchie, published by Prentice-Hall.

Other header files include:

               hdf.h
               hdfi.h
               hproto.h
               constants.f
               functions.f

hdf.h and hdfi.h.(1) The file hdf.h contains declarations and
definitions for the common data structures used throughout HDF,
definitions of the HDF tags, definitions of error numbers, and
definitions and declarations specific to the general purpose
interface. Since hdf.h depends on hdfi.h, it includes (via
#include) hdfi.h.

The file hdfi.h contains a large amount of information specific to
the various computing environments supported by HDF. Those
environmental parameters that need to be set to particular values
when compiling the HDF library are contained in hdfi.h. Machine
dependent definitions of such things as number types and macros
for reading and writing numbers are also included in hdfi.h.

When porting HDF to a new system, only hdfi.h and the Makefile need
to be modified.

Normally it is a good idea to include hdf.h (and therefore
indirectly hdfi.h) in user programs, though users usually need not
be aware of their contents.

hproto.h. This file contains ANSI C prototypes for all HDF C
routines, and must be included in ANSI-conforming C programs that
make calls to HDF routines.

constants.f. This file is for use in FORTRAN programs. It contains
important constants, such as tag values, that are defined in hdf.h.
Systems that have FORTRAN preprocessors might be able to include
these files via #include statements or their equivalent.

functions.f. This file is for use in FORTRAN programs. It contains
declarations of all HDF FORTRAN-callable functions. Systems that
have FORTRAN preprocessors might be able to include these files via
#include statements or their equivalent.


Source Code Files

All HDF operations are performed by routines written in C. Hence,
even FORTRAN calls to HDF result in calls to the corresponding C
routines. However, because of the problems described below, the
relationships between the C routines and the corresponding FORTRAN
routines can be very confusing. Before looking at the specific
problems, we first describe the C and FORTRAN source file
organization.

(1)In earlier implementations of HDF, these files were called df.h
and dfi.h. Starting with HDF 3.2 the general purpose layer of HDF
was completely rewritten, and all routine names changed from
"df..." to "h...".

Each HDF interface typically has four files associated with it. The
HDF Raster Image Set interface, for example, has four associated
source files: dfr8.h, dfr8.c, dfr8f.c, dfr8ff.f. The suffixes on
the filenames indicate their functions, as we describe next.

The ".h" file is the header file. The other three files, which
contain the C and FORTRAN functions, are:

(1)  The "normal" C routines. These routines do all of the actual
     HDF work. The others have the job of transferring control and
     data from a FORTRAN environment to a C environment.

   These routines are stored in files whose names end with ".c",
   as in "dfr8.c". Every call to HDF, whether it is a C call or a
   FORTRAN call, ultimately results in a call to one of these
   routines.

(2)  C routines that are compatible with FORTRAN and therefore
     directly callable from FORTRAN.  The primary function of these
     routines is to provide recognizable function names to the
     linker. They may also perform operations on data they receive
     from the FORTRAN routines that call them, such as transferring
     a FORTRAN string to a local C data area. Examples of how they
     perform these operations are given below.

  These routines are stored in files whose names end with "f.c",
  as in "dfr8f.c" for the raster image interface. The "f" means
  that the routines are meant to be called from FORTRAN; the "c"
  means that they are C source code.

(3)  FORTRAN routines that perform some operation on the parameters
     that C is unable to perform, before and/or after calling the
     corresponding C routine. These routines are required, for
     example, when one of the parameters is a string. The
     corresponding C routine has no way of knowing the length of
     the string unless it is explicitly given the length by the
     FORTRAN routine.

  These routines are stored in files whose names end with "ff.f",
  as in "dfr8ff.f" for the raster image interface. The first "f"
  means that the routines are to be called from FORTRAN; the second
  "f" means that they perform some FORTRAN operation that C cannot
  perform; the ".f" means that they are FORTRAN source code.

The roles of these different types of source file types will become
clearer as we look at some of the problems that arise in
interfacing C and many different implementations of FORTRAN.


File naming conventions

The naming conventions for HDF library source code files are
complicated by several factors. Because of the wide variety of
platforms which HDF must accommodate, all files that will compile
to object modules in the HDF library must have names that are
unique in the first 8 characters, ignoring case. The difficulties
involved in maintaining a Fortran-callable interface to a library
that is primarily written in C further complicate the naming of
source code files.


Passing Strings between FORTRAN and C

One of the most important differences between FORTRAN and C
compilers is in the way strings are represented. Different
compilers use different data structures for strings, and supply
string length information in different ways.


Passing Strings from FORTRAN to C

When strings are passed between FORTRAN and C routines, they may
need to be converted from one representation to the other. C
compilers store strings in an array of type char, terminated by a
NULL byte ('\0'). The name of a string variable is a pointer to the
address of the first character in the string. FORTRAN compilers are
not consistent in the ways that they store strings.

Two pieces of information are needed in order to pass a string from
FORTRAN to C: its length and its address.

The first problem is solved by invoking the standard FORTRAN
function len(), which returns the length of a string. Since C
expects a '\0' (NULL) byte at the end of strings, care must be
taken that this NULL byte does not overwrite useful information in
the FORTRAN string.

The second problem is more difficult because of the different ways
that different FORTRANs store strings.

To solve this, a macro, _fcdtocp ("FORTRAN character descriptor to
C pointer"), is used. _fcdtocp is defined differently, depending on
the machine on which it is compiled. There are three different ways
that a FORTRAN string's address can be passed to C, and _fcdtocp
handles each of them:

* UNICOS FORTRAN stores strings in a structure called "_fcd"
  (FORTRAN character descriptor). "_fcdtocp" is a built-in function
  in UNICOS that returns the address of the string.

* VMS FORTRAN stores strings by means of a string descriptor
  structure that provides information about where the string is
  stored and its length. When compiled under VMS, the function
  _fcdtocp extracts the string's address and returns that value.

* Most other FORTRAN compilers supported by HDF store strings just
  as C does, in character arrays with the array name identifying
  the array's address. For these compilers nothing special need be
  done in passing a string from FORTRAN to C.

In HDF, a FORTRAN call that involves passing a string results in
the following sequence of actions:

(1) A FORTRAN "stub" determines the length and address in memory
    of the string. Since this is a FORTRAN routine, it can be found
    in the "...ff.f" file.

(2) The FORTRAN stub then calls a C routine, which it passes all
    parameters from the initial call, plus one extra parameter: the
    string's length.

(3) The C routine converts the FORTRAN string to a C string by
    copying it to a C array of type char, and appending a '\0'
    byte. Since this C routine serves as a link between a FORTRAN
    stub and the corresponding C interface call, it can be found
    in the "...f.c" file.

(4) This C routine then calls the HDF C routine that performs the
    actual function.

This process is illustrated in Figure 7.1.

*** INSERT FIGURE HERE ***
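
Step (3) of this sequence, in which the C routine copies the FORTRAN
string and appends a '\0' byte, might look like the sketch below. The
function name is ours and this is not the routine used in the HDF
source. It assumes that the string's address has already been obtained
(for example, through _fcdtocp) and that its length was passed in by
the FORTRAN stub.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a FORTRAN string of known length into
 * newly allocated C storage and append the terminating '\0' byte.
 * Copying avoids overwriting anything in the FORTRAN string's own
 * storage. The caller frees the result. Returns NULL if the
 * allocation fails.
 */
char *fstr_to_cstr(const char *fstr, size_t len)
{
    char *cstr = malloc(len + 1);

    if (cstr == NULL)
        return NULL;
    memcpy(cstr, fstr, len);
    cstr[len] = '\0';
    return cstr;
}
```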


Passing Strings from C to FORTRAN

When strings are passed from C to FORTRAN, the reverse procedure
is followed. First, a string pointer is obtained within the FORTRAN
routine's data area. (It is assumed that the space pointed to has
already been allocated, and is sufficiently large to hold the
string.) The string is then copied from the C data area to the
FORTRAN data area. Finally, if necessary, the FORTRAN string's data
area is padded with blanks.
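
A minimal sketch of the copy-and-pad step is shown here. The function
name cstr_to_fstr is hypothetical, and the sketch assumes that the
caller supplies the length of the FORTRAN data area. Unlike the
procedure described above, it reports an error rather than assuming
that the area is large enough.

```c
#include <string.h>

/* Hypothetical helper: copy the C string cstr into the FORTRAN data
 * area fstr, which holds exactly flen characters, and pad the rest
 * with blanks. Returns 0 on success, -1 if the string does not fit. */
int cstr_to_fstr(const char *cstr, char *fstr, int flen)
{
    size_t n;

    if (cstr == NULL || fstr == NULL || flen < 0)
        return -1;
    n = strlen(cstr);
    if (n > (size_t) flen)
        return -1;
    memcpy(fstr, cstr, n);                      /* no '\0' is copied */
    memset(fstr + n, ' ', (size_t) flen - n);   /* blank padding     */
    return 0;
}
```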


Function Return Values between FORTRAN and C

When a FORTRAN routine calls a C function, it always expects a
return value from that function. Unfortunately, the form in which
C functions return values is not always compatible with the form
in which FORTRAN expects them.

To solve this problem, some C compilers offer the option of
controlling the form of the return value from a function. For
example, Language Systems FORTRAN for the Macintosh requires that
all C function declarations be prepended by the word "pascal" so
that the return value can be recognized by a FORTRAN routine that
calls it, as in:

pascal int dsgrang(void *pmax, void *pmin)

Since C always expects return values to be passed "by value" rather
than, say, "by reference," it is important to coerce FORTRAN
functions to do the same. This is accomplished by defining a macro
FRETVAL that is prepended to the declaration of every FORTRAN-
callable C function. For example:

FRETVAL(int)
dsgrang(void *pmax, void *pmin)

If Language Systems FORTRAN is to be used, then FRETVAL is defined
(in hdfi.h) as follows:

#if defined(MAC)           /* with LS FORTRAN */
#  define FRETVAL(x)       pascal x
#endif
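
The following sketch shows how such a macro might be used on a
machine that needs no special keyword. The fallback branch, which
leaves the return type unchanged, is an assumption and is not taken
from hdfi.h. The function fsum is hypothetical.

```c
/* MAC is the flag named in the text. The #else branch is an assumed
 * default for compilers that need no special return convention. */
#if defined(MAC)                    /* with LS FORTRAN */
#  define FRETVAL(x)  pascal x
#else
#  define FRETVAL(x)  x
#endif

/* Hypothetical FORTRAN-callable C function. FORTRAN passes its
 * arguments by reference, so they arrive here as pointers. */
FRETVAL(int)
fsum(int *a, int *b)
{
    return *a + *b;
}
```

Because every FORTRAN-callable function is declared through the
macro, supporting a new compiler only requires a new definition of
FRETVAL rather than a change to each declaration.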


Differences in Acceptable Routine Names

Different FORTRAN compilers impose different restrictions on the
length, character set, and form of identifiers. In general, HDF
uses C conventions for naming routines, and this means that
measures must be taken to accommodate those compilers which have
different conventions than C.

The method used in HDF is to name routines differently, depending
on the particular conventions of the FORTRAN compiler being used.
This is done by defining certain flags for the preprocessor via
#define statements in the hdfi.h file. Then conditional
compilation--via #ifdef statements in the source code files--is
used to compile the routines that are called from FORTRAN with
names that the particular FORTRAN compiler can understand.


Case Sensitivity

C compilers are case sensitive. That is, upper and lower case
letters are different. Many FORTRAN compilers allow users to use
upper and lower case letters in naming routines, but the symbol
table names that they produce in object modules are all in upper
case or all in lower case. These compilers are not case sensitive.
If routines compiled by a case-sensitive compiler are to be linked
with routines compiled by a compiler that is not case sensitive,
they might not recognize one another's routines.

For example, the UNICOS FORTRAN compiler allows you to name
routines without regard to case, but produces object modules with
all routine names converted to upper case. UNICOS C, on the other
hand, performs no such conversion. Consider how the HDF routine
Hopen is treated by the two compilers.

Hopen is written in C, so the HDF library has the name 'Hopen', a
mixed-case name, in its symbol table. Suppose you make the
following call in your UNICOS FORTRAN program:

file_id = Hopen('myfile', ... )

The FORTRAN compiler will create an object module with the routine
name "HOPEN" (all upper case) in its symbol table. When you link
it to the HDF library, it will find "Hopen", but not "HOPEN", and
will generate an "unsatisfied external reference" error.

So far there are three FORTRAN compilers supported by HDF that
convert names to upper case in the symbol table:

     VMS FORTRAN
     UNICOS FORTRAN
     Language Systems FORTRAN


How HDF Deals with "All-Upper Case" Compilers

The solution to this problem is to name C functions entirely in
upper case whenever they are called by all-upper case FORTRAN
routines. This is done as follows: For FORTRAN compilers that
produce all-upper case symbol table entries, a flag "DF_CAPFNAMES"
is defined via a #define in hdfi.h. Then conditional compilation
is used in the source code files to compile the routines that are
called from FORTRAN with all-upper case names.

For example, since UNICOS FORTRAN produces all-upper case symbol
table entries, there is in the UNICOS section of hdfi.h the
following line:

#define DF_CAPFNAMES

Correspondingly, there are conditional compilations in the "..f.c"
files that produce all-upper case routine names. For example, the
function name "Fun" can be redefined as "FUN" as follows:

#ifdef DF_CAPFNAMES
#   define Fun  FUN
#endif /* DF_CAPFNAMES */


Appended Underscore

A similar problem occurs with respect to the underscore character.
When compilers generate object module symbol tables from source
code, they commonly prepend an underscore ('_') to all external
symbols. C generally does this. Then, when linking occurs, the
linker looks for external symbols in the symbol table with the
prefix.

Unfortunately, many FORTRAN compilers also append an underscore to
identify external symbols. Since C does not generally do this,
external references in FORTRAN-generated object modules will not
recognize externals with the same names in C-generated modules.

For example, the FORTRAN compiler on the CONVEX places an
underscore at the end of routine names, while the C compiler only
places an underscore at the front. Consider how a C function called
FUN would be treated in this context.

Since FUN is a C function, the object module containing FUN has it
stored under the name "_FUN". Suppose you make the following call
in a FORTRAN program:

x = FUN (y)

The FORTRAN compiler creates an object module with the routine name
"_FUN_" in its symbol table. When you link it to the C module, it
will find "_FUN", but not "_FUN_", and will generate an
"unsatisfied external reference" error.


How HDF Specifies Appended (and Prepended) Underscores

The solution to this problem is to name C functions with an
appended underscore whenever one is expected by FORTRAN calling
routines. For instance, if the name of FUN had been "FUN_" in the
example, its name in the C object module would have been "_FUN_",
which is exactly what FORTRAN put into its symbol table.

This is done as follows: For every machine whose FORTRAN compiler
requires appended underscores, a flag "FNAME_POST_UNDERSCORE" is
defined via a #define in hdfi.h in the section associated with that
machine. Similarly, for those that require a prepended underscore
a flag "FNAME_PRE_UNDERSCORE" is defined. Then, in a section of
code in hdfi.h, conditional compilation is used to define a macro
called "FNAME" that appends and/or prepends underscores as
required.

In the modules in which routines are actually defined (including
in hproto.h), the FNAME macro is then applied to each routine,
causing the appropriate underscores to be added.

Hence, in the example above in which "Fun" was caused to be
uppercase, the actual definition would be as follows:

#ifdef DF_CAPFNAMES
#    define Fun     FNAME(FUN)
#endif /* DF_CAPFNAMES */
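
One way that FNAME could be written is sketched below. This is not
the definition in hdfi.h. It assumes an ANSI C preprocessor, since
it relies on the ## token-pasting operator. The flag names are the
ones given above, and the two #define lines at the top stand in for
the machine-specific section of hdfi.h for a hypothetical machine
whose FORTRAN compiler upper-cases names and appends an underscore.

```c
/* Assumed settings for a hypothetical machine. */
#define DF_CAPFNAMES
#define FNAME_POST_UNDERSCORE

/* Sketch of FNAME: add underscores as the flags require. */
#if defined(FNAME_PRE_UNDERSCORE) && defined(FNAME_POST_UNDERSCORE)
#  define FNAME(x)  _##x##_
#elif defined(FNAME_PRE_UNDERSCORE)
#  define FNAME(x)  _##x
#elif defined(FNAME_POST_UNDERSCORE)
#  define FNAME(x)  x##_
#else
#  define FNAME(x)  x
#endif

/* Choose the case that the FORTRAN compiler will look for. */
#ifdef DF_CAPFNAMES
#  define Fun  FNAME(FUN)
#else
#  define Fun  FNAME(Fun)
#endif

/* With the settings above, this defines a C function named FUN_. */
int Fun(int y)
{
    return y + 1;
}
```

With these settings the C compiler emits the function under the name
FUN_, which is the name that the FORTRAN compiler of this
hypothetical machine generates for a call to FUN.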


Short Names vs. Long Names

In the C implementations supported by HDF, identifiers may be any
length, with at least the first 31 characters having significance.
FORTRAN compilers differ in the maximum lengths of identifiers that
they allow, but all of those supported by HDF allow identifiers to
have at least seven characters.

To deal with the discrepancies between identifier lengths allowed
by C and those allowed by the various FORTRAN compilers, a set of
equivalent short names has been devised that can be used when
programming in FORTRAN. For all HDF routines with names that are
more than seven characters long, there is an identical routine
whose name is seven or fewer characters long.

For example, for the routine "DFSDgetdims" in the file dfsd.c there
is a corresponding routine "dsgdims" in the file dfsdff.f with
exactly the same functionality.


ANSI C vs. Old C

Both ANSI and old C compilers are supported in the current
implementation of HDF (HDF 3.2). ANSI C is preferred, because it
has many features that help insure portability, but unfortunately
many important platforms do not support full ANSI C. The HDF code
determines whether or not ANSI C is available from the flag
__STDC__. If ANSI C is available, then __STDC__ is defined.(1)

The most noticeable difference between ANSI and old C is in the way
functions are declared. For example, in ANSI C the function
DFSDsetdims() is declared with

int DFSDsetdims(intn rank, int32 dimsizes[])

(1) Some C compilers are not entirely ANSI-conforming, yet they
conform well enough that the HDF implementation can treat them as
if they were. In such cases, it is considered permissible to
"#define" __STDC__ when compiling.

In old C the same function is declared with

int DFSDsetdims(rank, dimsizes)
intn rank;
int32   dimsizes[];

The NCSA implementation of HDF accommodates these differences by
defining in hdfi.h a flag called PROTOTYPE, which is used for every
function declaration, as in the following example.

#ifdef PROTOTYPE
int DFSDsetdims(intn rank, int32 dimsizes[])
#else
int DFSDsetdims(rank, dimsizes)
intn rank;
int32   dimsizes[];
#endif /* PROTOTYPE */

Another big difference between old C and ANSI C is that ANSI C
allows the use of function prototypes that include arguments, which
helps enormously in detecting errors in the number and types of
arguments. Old C also allows functions to be declared, but without
the argument list. This difference is handled by means of a macro
called PROTO, whose expansion depends on whether PROTOTYPE is
defined. PROTO is defined as follows:

#ifdef  PROTOTYPE
#define         PROTO(x) x
#else
#define         PROTO(x)  ()
#endif

This macro is applied as in the following example:

extern int32 Hopen
PROTO((char *path, intn access, int16 ndds));

When PROTOTYPE is defined, PROTO causes the argument list to stay
as it is. When PROTOTYPE is not defined, PROTO causes the argument
list to disappear.
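
The two mechanisms can be combined in one file, as in the following
sketch. The function count_elems is hypothetical and uses plain C
types so that the example stands alone. PROTOTYPE is defined by hand
here, whereas the text above says that hdfi.h defines it.

```c
#define PROTOTYPE               /* assume an ANSI C compiler */

#ifdef PROTOTYPE
#define PROTO(x) x
#else
#define PROTO(x) ()
#endif

/* Declaration: expands to a full prototype under ANSI C and to
 * "extern long count_elems ();" under old C. */
extern long count_elems
PROTO((int rank, long dimsizes[]));

/* Definition: the header is chosen by conditional compilation. */
#ifdef PROTOTYPE
long count_elems(int rank, long dimsizes[])
#else
long count_elems(rank, dimsizes)
int rank;
long dimsizes[];
#endif /* PROTOTYPE */
{
    long total = 1;
    int  i;

    /* Multiply the dimension sizes to get the element count. */
    for (i = 0; i < rank; i++)
        total *= dimsizes[i];
    return total;
}
```

The doubled parentheses in the declaration are what allow the whole
argument list to be passed to PROTO as a single macro argument.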


Type Differences

Different machines and compilers differ in the sizes of numbers
that they assign to different data types, in their representations
of different number types, and in the way they organize aggregates
of numbers (especially structures).


Size differences

The same number type can be different sizes on different machines.
Type int, for example, is 16 bits to many IBM PC compilers, 48 bits
to some supercomputer compilers, and 32 bits on most others. These
differences can cause insidious problems in code like the HDF code
that depends in so many places on numbers being the right size.

This problem is handled in HDF by insisting in the code that all
variables and functions must use a typedef'ed type which fully
defines their type, including the number of bits that they occupy.
This includes all parameters, members of structures, and static,
automatic, and external variables.

Hence, the data types used in HDF include the following. (The
prefix "u" stands for "unsigned".)

          int8
          uint8
          int16
          uint16
          int32
          uint32
          float32
          float64
          intn
          uintn

So, for example, on Sun's C compiler int32 is defined with

typedef long int int32;

Hence, for each machine, typedefs are declared that map all of the
data types used into the best available types.
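
A set of such typedefs might look like the following sketch. It is
not copied from hdfi.h. It assumes 8-bit bytes and uses the limits in
<limits.h> to select the native types, where hdfi.h is described
above as having a separate section for each machine. The check_...
names are hypothetical.

```c
#include <limits.h>

typedef signed char     int8;
typedef unsigned char   uint8;

#if USHRT_MAX == 0xFFFF
typedef short           int16;
typedef unsigned short  uint16;
#else
#  error "no 16-bit type: choose the closest available type"
#endif

#if UINT_MAX == 0xFFFFFFFF
typedef int             int32;
typedef unsigned int    uint32;
#elif ULONG_MAX == 0xFFFFFFFF
typedef long            int32;
typedef unsigned long   uint32;
#else
#  error "no 32-bit type: choose the closest available type"
#endif

typedef float           float32;
typedef double          float64;

typedef int             intn;    /* native int, size not significant */
typedef unsigned int    uintn;

/* Compile-time checks: an array of size -1 is an error, so the
 * build fails if a type does not have the expected number of bits. */
typedef char check_int16[(sizeof(int16) * CHAR_BIT == 16) ? 1 : -1];
typedef char check_int32[(sizeof(int32) * CHAR_BIT == 32) ? 1 : -1];
typedef char check_float32[(sizeof(float32) * CHAR_BIT == 32) ? 1 : -1];
typedef char check_float64[(sizeof(float64) * CHAR_BIT == 64) ? 1 : -1];
```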

Unfortunately, it is not always possible to find a local data type
that maps exactly to one of these types. For example, the Cray
UNICOS C compiler does not support a 16-bit data type. In such
instances, we do the best we can and try to be on the lookout for
potential problems with number sizes.

The data types "intn" and "uintn" are to be used whenever it can be
determined that number type size is of no consequence, and that a
16-bit integer is large enough to hold any value the number can
have. In such cases, the native int (or unsigned int) type of the
host machine is used. Experience has shown that substantial
performance gains can be achieved by using intn or uintn in certain
circumstances.


Number Representation

One of the keys to producing a portable file format is insuring
that numbers that are represented differently on different machines
somehow get converted correctly when moved from machine to machine.
The approach taken to this in the NCSA implementation is to provide
conversion routines to convert between local representations and
a standard representation that is stored in HDF files. Details of
this process will be included in a later edition of this manual.


Byte-order and Structure Representations

Even when the basic bit-representation of constants or aggregates
like structures is the same between machines, the ways that the
bits are packed into a word, and the order in which the bits are
laid out, can differ among machines. For example, Digital machines
and Intel-based machines generally order bytes differently from
most others. And the C compiler on a Cray, whose word size is 64
bits, packs structures differently from one on machines whose word
size is 32 bits.

Differences in byte order among machines are handled in two ways.
When the data to be written (or read) consists of non-integer data
and/or a large array of any type of data, a conversion routine
(mentioned in the previous section, "Number Representation") is
invoked. When an individual integer is to be written (or read), an
"ENCODE" or "DECODE" macro is used.

There are ENCODE and DECODE macros for 16-bit and 32-bit integers:

     INT16ENCODE
     UINT16ENCODE
     INT32ENCODE
     UINT32ENCODE
     INT16DECODE
     UINT16DECODE
     INT32DECODE
     UINT32DECODE

The ENCODE macros are written in such a way that they write
integers to an HDF file in a standard way, no matter what the
word size and byte order of the host machine are.

Likewise, the DECODE macros are written in such a way that they
read integers stored in a standard way in an HDF file and store the
integers in the required byte order and word size on the host
machine.

Since the ENCODE and DECODE macros deal with both byte order and
word size, they are also used to handle the reading and writing of
record-like structures. For example, the structure for an HDF data
descriptor consists of two 16-bit fields, followed by two 32-bit
fields, as implied by the following C declaration:

          struct {
               uint16 tag;
               uint16 ref;
               uint32 offset;
               uint32 length;
          }

In an HDF file this structure must occupy exactly 12 bytes. On one
computer it might occupy 12 bytes of storage, and on another, such
as the Cray, it might occupy 32 bytes. Furthermore some machines
might represent the numbers internally in different byte orders
than others. By using the ENCODE and DECODE macros we are able to
insure that these values are represented correctly in all machines
and in HDF files.
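
The idea can be sketched as follows. The macros below are
hypothetical stand-ins for the ENCODE and DECODE macros, not their
actual definitions: the MY_ names, the argument order, the advancing
of the pointer, and the choice of most-significant-byte-first as the
standard order are all assumptions made for this illustration.

```c
typedef unsigned char  uint8;
typedef unsigned short uint16;   /* at least 16 bits */
typedef unsigned long  uint32;   /* at least 32 bits */

/* Write v at p, most significant byte first, and advance p. */
#define MY_UINT16ENCODE(p, v) \
    do { *(p)++ = (uint8) (((v) >> 8) & 0xff); \
         *(p)++ = (uint8) ((v) & 0xff); } while (0)

#define MY_UINT32ENCODE(p, v) \
    do { *(p)++ = (uint8) (((v) >> 24) & 0xff); \
         *(p)++ = (uint8) (((v) >> 16) & 0xff); \
         *(p)++ = (uint8) (((v) >> 8) & 0xff); \
         *(p)++ = (uint8) ((v) & 0xff); } while (0)

/* Read a value from p into v and advance p. */
#define MY_UINT16DECODE(p, v) \
    do { (v) = (uint16) (((unsigned) (p)[0] << 8) | (unsigned) (p)[1]); \
         (p) += 2; } while (0)

#define MY_UINT32DECODE(p, v) \
    do { (v) = ((uint32) (p)[0] << 24) | ((uint32) (p)[1] << 16) | \
               ((uint32) (p)[2] << 8)  |  (uint32) (p)[3]; \
         (p) += 4; } while (0)

struct dd {
    uint16 tag;
    uint16 ref;
    uint32 offset;
    uint32 length;
};

/* Pack a data descriptor into exactly 12 bytes, whatever the size
 * and layout of struct dd in memory. */
void dd_encode(const struct dd *d, uint8 buf[12])
{
    uint8 *p = buf;

    MY_UINT16ENCODE(p, d->tag);
    MY_UINT16ENCODE(p, d->ref);
    MY_UINT32ENCODE(p, d->offset);
    MY_UINT32ENCODE(p, d->length);
}

/* Unpack 12 bytes into the host's representation of struct dd. */
void dd_decode(const uint8 buf[12], struct dd *d)
{
    const uint8 *p = buf;

    MY_UINT16DECODE(p, d->tag);
    MY_UINT16DECODE(p, d->ref);
    MY_UINT32DECODE(p, d->offset);
    MY_UINT32DECODE(p, d->length);
}
```

Because each field is written one byte at a time, neither the padding
that a compiler inserts into the structure nor the byte order of the
host affects the 12 bytes that are produced.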


Access to Library Functions

Despite efforts to standardize them, function libraries often
differ in significant ways. There are at least three types of
functions that need special treatment in the HDF implementation:

(1) All file I/O access. Both the stream and system level functions
    need this (i.e. the functions associated with the fopen() call,
    and the functions associated with the open() call). This is
    generally a 16-bit vs. 32-bit problem, because some machines
    use 16-bit values for the size of and the number of elements
    to write/read, and others use 32-bit values.

(2) All memory allocation and releasing. There are two different
    problems associated with this. The first is that on a 16-bit
    machine, a 16-bit value is used for the number of bytes to
    allocate at one time. The second is that certain operating
    systems (notably MS-Windows and MAC/OS) don't have malloc() and
    free() calls. These operating systems use handles for
    allocating memory and require different function calls for
    memory allocation.

(3) Memory and string manipulation. These functions (such as
    memcpy(), memcmp(), strcpy(), strlen(), etc.) require slightly
    different function names under different memory models in
    MS-DOS and under MS-Windows than on most other systems.

These differences are dealt with by defining macros for the
relevant functions, and defining them appropriately in the
machine-specific sections of hdfi.h.
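
A minimal sketch of this technique follows. The macro names
(PORT_malloc and so on), the flag SKETCH_FAR_MODEL, and the function
port_strdup are hypothetical and are not the names used in hdfi.h.
The far-model branch uses the _f-prefixed routines of the Microsoft C
run-time library as an example of functions with slightly different
names.

```c
#include <stdlib.h>
#include <string.h>

#if defined(SKETCH_FAR_MODEL)       /* hypothetical flag */
#  include <malloc.h>
#  define PORT_malloc(n)        _fmalloc(n)
#  define PORT_free(p)          _ffree(p)
#  define PORT_memcpy(d, s, n)  _fmemcpy((d), (s), (n))
#  define PORT_strlen(s)        _fstrlen(s)
#else                               /* most other systems */
#  define PORT_malloc(n)        malloc(n)
#  define PORT_free(p)          free(p)
#  define PORT_memcpy(d, s, n)  memcpy((d), (s), (n))
#  define PORT_strlen(s)        strlen(s)
#endif

/* Hypothetical library routine written only in terms of the macros,
 * so it compiles unchanged on either kind of system. Returns a copy
 * of s, or NULL if memory cannot be allocated. */
char *port_strdup(const char *s)
{
    size_t n    = PORT_strlen(s) + 1;
    char  *copy = (char *) PORT_malloc(n);

    if (copy != NULL)
        PORT_memcpy(copy, s, n);
    return copy;
}
```

Code outside the machine-specific section never calls the library
functions by their native names, so a new platform is supported by
adding one more block of macro definitions.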