1.1	NCSA HDF Calling Interfaces and Utilities

NCSA HDF Basics	1.1

National Center for Supercomputing Applications

March 1993

                                                                
1.1	NCSA HDF Calling Interfaces and Utilities

NCSA HDF Basics	1.1

National Center for Supercomputing Applications

March 1993

                                                                
Chapter 1	NCSA HDF Basics


Chapter Overview 
What Is Hierarchical Data Format?
Why Was HDF Created?
NCSA HDF Application Software
NCSA Scientific Visualization Software
HDF Calling Interfaces
HDF Utilities
Getting Started with HDF
Examples
Writing an HDF 8-Bit Raster Image Set
Writing an HDF Scientific Dataset 
FORTRAN and C
FORTRAN Stubs
Atomic Data Type Specifications
Array Specifications
Case Sensitivity
Name Length
Header Files
FORTRAN 77, ANSI C and K & RÕs C
HDF Without FORTRAN
Installing HDF
Transferring HDF Files
How to Get HDF
FTP
Archive Server
U.S. Mail


Chapter Overview

This chapter provides a description of NCSA HDF, the reasons 
behind its creation, and a brief description of HDF application 
software.


What Is Hierarchical Data Format?

The Hierarchical Data Format (HDF) is a multi-object file 
structure that is designed to facilitate the sharing of data among 
people, projects, and machines on a network (Fig. 1.1). HDF was 
created at the National Center for Supercomputing Applications 
(NCSA) to serve the needs of diverse groups of scientists working 
on supercomputing projects of many kinds. 

ED. NOTE:  Figures are not available in this plain text
version of the specification.

Figure 1.1 	HDF: A File Format for Scientific Data in a Distributed 
Environment

                                                        
Why Was HDF Created?

Scientists commonly generate and process data files on several 
different machines, use various software packages to process files, 
and share data files with others who use different machines and 
software. Also, the mixture of information that scientists need to 
work with often varies from one file to another, even for the same 
application. Files may be conceptually related but physically 
separated;  e.g., some data may be dispersed among different files, 
some in program code, and some in the minds of various users.

HDF addresses these problems by providing a general purpose file 
structure that does the following:

*  Makes it possible for programs to obtain information about the 
data in a file from the file itself, rather than from another 
source

*  Lets you store different mixtures of data and related 
information in different files, even when the files are processed 
by the same application program

*  Standardizes the formats and descriptions of many types of 
commonly used datasets, such as raster images and scientific 
data

*  Encourages the use of a common data format by all machines and 
programs that produce files containing a specific dataset

*  Can be adapted to accommodate virtually any kind of data by 
defining new tags or new combinations of tags

HDF files are self-describing. For each data object in an HDF file, 
there are  predefined tags that identify such information as the 
type of data, the amount of data, its dimensions, and its location in 
the file.

The self-describing capability of HDF files has important 
implications for processing scientific data. It makes it possible to 
fully understand the structure and contents of a file just from the 
information stored in the file itself. A program that has been 
written to interpret certain tag types can scan a file containing 
those tag types and process the corresponding data. Self-
description also means that many types of data can be bundled in 
an HDF file. For example, it is possible to accommodate symbolic, 
numerical, and graphical data in one HDF file.

Related items of information about a particular type of data are 
grouped into sets, such as the raster image sets (see Chapter 2, 
ÒStoring Raster ImagesÓ) and scientific datasets (see Chapter 4, 
ÒStoring Rectangular Gridded Arrays of Scientific DataÓ). Each set 
defines an application area supported by HDF. Additional sets can 
be defined and added to HDF as the needs arise.

Figure 1.2 shows a conceptual view of an HDF file containing a 
scientific dataset. The actual two-dimensional array of data is 
only one element in the set. Other elements include the number of 
dimensions (rank), the sizes of the dimensions, identifying 
information about the data and axes, and scales (ranges) for the 
axes.

Figure 1.2 	HDF File with Scientific Dataset

                                                             
NCSA HDF Application Software

NCSA HDF application software currently comes in three forms: 
(1) NCSA scientific visualization tools that read and write HDF 
files, (2) calling interfaces that let you read and write HDF files 
from within a FORTRAN or C program, and (3) command-line 
utilities that operate directly on HDF files. 

The integration of these types of software in the computing 
environment at NCSA is illustrated in Fig 1.3. Visualization tools 
such as NCSA Collage and NCSA DataScope read and write HDF 
files. Calling interfaces in the HDF library (libdf.a) let you read 
and write HDF files from your programs. And utilities such as 
r8tohdf let you operate on HDF files at the command level.

Figure 1. 3	HDF Software in an Integrated Computing Environment

                                                          
NCSA Scientific Visualization Software

The use of HDF files guarantees the interoperability of the 
scientific visualization tools at NCSA. Some tools operate on raster 
images, some operate on color palettes, some use images, color 
palettes, data and annotations, and so forth. HDF provides the 
range of data types that these tools need, in a format that lets 
different tools with different data requirements operate on the 
same files without confusion.


HDF Calling Interfaces

In order to minimize the amount of knowledge you need to have 
about HDF, calling interfaces have been developed for specific 
types of applications, such as the storage and display of raster 
images or scientific data archiving. A calling interface is a 
library of routines that can be called from an application program 
for storing and retrieving information, including raw data, from a 
particular type of HDF file.

Different applications typically require different interfaces. 
Consequently, NCSA HDF provides FORTRAN and C calling 
interfaces for storing and retrieving 8- and 24-bit raster images, 
palettes, scientific data, and annotations. These interfaces, which 
are described in detail in chapters 2 through 5, are mutually 
compatible, and user programs can combine calls to routines in 
different interfaces when they need to store different kinds of 
data in the same file. 

HDF files tend to be used on several different machines, and HDF 
interfaces developed at NCSA are implemented on as many 
machines as possible. An important goal in the development of 
NCSA HDF user interfaces is to eliminate the necessity of 
changing program code when moving an application from one 
machine to another.


HDF Utilities

The HDF command line utilities are application programs that can 
be executed by entering them at the command level, just like other 
UNIX commands. They make it possible for you to perform, at the 
command level, common operations on HDF files for which you 
would normally have to write your own program. For example, the 
utility r8tohdf is a program that takes a raw raster image from a 
file and stores it in an HDF files in a raster image set.

The HDF utilities provide capabilities for doing things with HDF 
files that would be very difficult to do under your own program 
control. For example, the utility hdfls lists the contents of an HDF 
file, including such things as the meanings of tags and the size and 
location of each data item.

The HDF utilities are described in detail in the Chapter called
NCSA HDF Command Line Utilities.


Getting Started with HDF

If you do not have access to a machine that already has an HDF 
library, you will need to install it yourself or have your system 
administrator install it. Although procedures for installing the 
HDF library vary from one system to another, the basic steps are 
the same in all cases.

First, you need to get the software.  The section, ÒHow to Get HDFÓ 
tells you how to get the HDF software from NCSA.  In some cases 
you can get the actual precompiled library, in which cases you can 
load it into your machine into an appropriate directory. If instead 
you get the source code for the HDF library, you first have to 
compile the source code into a linkable library.

Detailed information on how to install and use HDF on specific 
systems can be found in the documentation that comes with the 
system-specific versions of the software.


Examples

Writing an HDF 8-Bit Raster Image Set

A typical use of HDF involves preparation of scientific data for 
visualization as an 8-bit raster image. Using 8-bit raster imaging, 
the values on a grid of numbers can be represented by color values 
in a palette of 256 colors. 

The following code segments, the first in FORTRAN, the second in 
C, convert a 200 x 100 floating-point array to an 8-bit raster 
image, then store the image in an HDF file.

FORTRAN:
      CHARACTER*1 image(100,200)
      INTEGER istat, d8aimg

C      Other Fortran code goes here

C      Convert values in array ivals to character (8-bit) data
      do 10 ix=1,100
         do 10 iy=1,200
            image(ix,iy) = char(ivals(ix,iy))
   10 continue

C      Write image to an HDF file
      istat = d8aimg('myfile.hdf', image, 100, 200, 0)
      if (istat .ne. 0) then
         write(*,*) 'Error writing HDF file'
      endif

C:
char image[200][100];
int ix, iy, istat, DFR8addimage();

/* Other C code goes here */

for (ix=0; ix<200; ix++)
  for (iy=0; iy<100; iy++)
    image[ix][iy] = (char) (ivals[ix][iy]);

istat = DFR8addimage("myfile.hdf", image, 200, 100, 0);
if (istat != 0)
  printf("Error writing HDF file\n");


NOTE:  DFR8addimage writes the image stored in the array image to 
myfile.hdf. If myfile.hdf exists, the image is appended to the file. 
If myfile.hdf does not exist, a new file is created, and the image is 
written as the first image in the file. The variable istat is 
assigned the value 0 if DFR8addimage succeeds; , -1 is assigned if it 
fails. 

The routine DFR8getimage is available for retrieving images from 
HDF files. Routines are also available for storing color lookup 
tables (palettes) with raster images.


Linking to the HDF library

If your FORTRAN or C program makes a call to HDF, it must be 
linked to the HDF library.  Normally the HDF library is in a file 
called libdf.a. You can indicate the linkage in your compile 
statement, or if a separate linkage step is used, it may be done at 
that time.  Hence, if libdf.a is in your current directory, you 
compile and link your program to HDF with a statement such as the 
following.

f77 -o myprog myprog.f libdf.a     (FORTRAN) 
cc -o myprog myprog.c libdf.a       (C)

If your library is contained in a public library directory on a 
UNIX system, you can link it with a statement such as

f77 -o myprog myprog.f -ldf      (FORTRAN) 
cc -o myprog myprog.c -ldf       (C)


Writing an HDF Scientific Dataset 

An HDF scientific dataset (SDS) is a collection in an HDF file of 
information about scientific data stored as a multi-dimensional 
array of numbers. Each SDS must include the actual data array, its 
rank (number of dimensions), and its dimensions. Optionally, an 
SDS can also contain scales to be used along the different axes 
when interpreting the data, maximum and minimum values of the 
data, calibration information, and the coordinate system used to 
interpret the data. Labels, units, and format specifications for 
displaying and interpreting the data and dimensions may also be 
included.

Below is code presented first in FORTRAN, then in C, that stores a 
200 x 200 16-bit integer array called "pressure" in an SDS in the 
HDF file, Ex.hdf. It also stores labels, units, and formats as part of 
the same SDS.

FORTRAN:
      INTEGER		dssdims, dssnt, dssdast, dssdist, dsadata
      integer*2	pressure(200,200)
      INTEGER		shape(2), ret, DFNT_INT16
      
      DFNT_INT16 = 22
      shape(1) = 200
      shape(2) = 200
      
      ret = dssnt(DFNT_INT16)
      ret = dssdims(2, shape)
      ret = dssdast('pressure 1','Pascals','E15.9','cartesian')
      ret = dssdist(1,'x','cm','F10.2')
      ret = dssdist(2,'y','cm','F10.2')
      ret = dsadata('Ex.hdf', 2, shape, pressure)  

C:
#include "hdf.h"
int16	pressure[200][200];
int	shape[2];

shape[1] = 200;
shape[2] = 200;

DFSDsetNT(DFNT_INT16);
DFSDsetdims(2, shape);
DFSDsetdatastrs("pressure 1", "Pascals","E15.9","cartesian");
DFSDsetdimstrs(1,"x","cm","F10.2");
DFSDsetdimstrs(2,"y","cm","F10.2");
DFSDadddata("Ex.hdf", 2, shape, pressure);  


NOTE:  The "set" calls (DFSDsetdims(), etc.) indicate the ancillary 
information that is to be stored with the SDS. DFSDsetdims is 
required; the others are optional.  DFSDsetNT indicates that the 
number type of the data is 16-bit integer. DFSDadddata writes the 
scientific dataset data to Ex.hdf. If Ex.hdf exists, the SDS is 
appended to the file. If Ex.hdf does not exist, a new file is created, 
and the SDS is written as the first in the file.


FORTRAN and C Language Issues

The necessity to support both FORTRAN and C interfaces for HDF 
inevitably leads to some difficulties in the design of the 
interfaces. What is natural to a C interface can be quite unnatural 
to a FORTRAN interface and vice versa. In order to make the 
FORTRAN and C versions of each routine as identical as possible, 
some compromises have often had to be made in the simplification 
of one or the other routine.


FORTRAN Stubs

Almost all of the actual code underlying the HDF interfaces is 
written in C. Every call to a FORTRAN routine ultimately makes 
access to a C routine that actually carries out the prescribed 
function. So, the FORTRAN routines might better be referred to as 
FORTRAN stubs rather than FORTRAN functions. When called, 
these stubs typically translate all parameter values immediately 
to a data type that is accessible to C, then call a corresponding C 
function to do the actual work.


Atomic Data Type Specifications

When mixing machines, compilers, and languages, it is difficult to 
keep straight differences in data types.  For instance "integer" 
might be a 32-bit quantity on one machine, a 16-bit quantity on 
another, and 64-bits on a third.  

Differences between FORTRAN and C also leads to some 
difficulties in describing some of the data types in the argument 
lists of HDF routines.  The help keep matters straight, special 
names have been given to all data types used in HDF routines.  The 
following table shows the names used in all descriptions of C and 
FORTRAN routines for the data types used. 

		C	FORTRAN
type of data	name	name

8-bit signed integer	int8	CHARACTER*1
8-bit unsigned integer	uint8	(not supported)
16-bit signed integer	int16	CHARACTER*2
16-bit unsigned integer	uint16	(not supported)
32-bit signed integer	int32	INTEGER*4
32-bit unsigned integer	uint32	(not supported)
32-bit floating point 	number	float32	REAL*4
64-bit floating point 	number	float64	REAL*8
generic integer	intn	INTEGER

The data types marked "NA" in the FORTRAN column are not 
generally available in FORTRAN, so no convention is indicated.

In most cases, it is obvious how you should declare variables in 
you program to conform to these data types.  For example a 
"uint16" in C would normally be "unsigned short."  But in some 
cases the correspondence may be a little more difficult.  When in 
doubt, consult the file hdfi.h, which contains type definitions for all 
of these types on all machines that HDF supports.  To automatically 
define these data types, C programmers can use "#include" to include 
hdfi.h with their program.

Array Specifications

Some of the routines covered in this manual place no restrictions 
on the rank (number of dimensions) that a data array can have. 
This is perfectly legal in C, but unnatural in FORTRAN. 
Fortunately, since both C and FORTRAN pass arrays by reference, 
no problem arises in the actual interface between the FORTRAN 
calls and the corresponding stubs. The only real problem is in the 
notation used in this manual to describe the routines as if they 
were actual FORTRAN routines.

As a result, in the declarations contained in the headers of 
FORTRAN functions, we use the following conventions:

*  CHARACTER*1    x(*) 
means that x refers to an array that contains an indefinite 
number of characters. It is the responsibility of the calling 
program to allocate enough space to hold whatever data is stored 
in the array.

*  REAL    x(*) 
means that x refers to an array of reals of indefinite size and of 
indefinite rank. It is the responsibility of the calling program 
to allocate an actual array with the correct number of 
dimensions and dimension sizes.


Case Sensitivity

Another difference between FORTRAN and C is that FORTRAN 
identifiers, in general, are not case sensitive, whereas C 
identifiers are. Although all of the FORTRAN routines shown in 
this manual are written in lower case, FORTRAN programs that 
call them can use either upper or lower case without loss of 
meaning.


Name Length

Since some FORTRAN compilers can only interpret identifier 
names with seven or fewer characters, the names of the FORTRAN 
routines have been restricted to seven or fewer characters.


Header Files

If your program uses special HDF declarations or definitions, you 
may need to include a header file. The primary header file is 
hdf.h.  It contains declarations and definitions that are used by 
the C routines.  For example, prototypes for all C routines can be 
automatically included by including hdf.h.  Also, if your program 
uses mnemonics for tags, the corresponding numerical values for 
the tags can be found in hdf.h.

There is also a file called constants.f that contains FORTRAN 
parameter statements that declare the HDF constants that you are 
most likely to use in a FORTRAN program that invokes HDF 
routines.

The inclusion of header files is not, in general, permitted by 
FORTRAN compilers. It is, however, sometimes available as an 
option. On UNIX systems, for example, the macro processors m4 and 
cpp let your compiler include and preprocess header files. If this 
or a similar capability is not available, you may have to copy 
whatever declarations, definitions, or values you need from 
constants.f into your program code.


FORTRAN 77, ANSI C, and K&R C

As much as possible, we have tried to stick closely to those 
implementations of the two languages that are in most common use 
today, namely FORTRAN 77, ANSI C and K&R C.   If your FORTRAN 
or C compiler understands FORTRAN 77, ANSI C, or K&R C,  it 
should be able to link easily to the interfaces. 

Although we try to adhere to these standards, we must note that a 
primary objective of the HDF project is to support HDF on a 
variety of different machines, and this, in some cases, means 
accommodating some deviations. We are also aware of the fact that 
many potential users of HDF have compilers that our code does not 
accommodate. Let us know if your particular dialect does not work 
with HDF. We may or we may not be able to help.


HDF Without FORTRAN

If you do not use FORTRAN with HDF, you may want  to compile the 
HDF library without any FORTRAN routines in it. Two instances in 
which you may choose to do so are the following:

1)  If you want to reduce the size of the HDF library

2)  If you donÕt have a FORTRAN compiler to use in compiling 
the library

Details on how to compile HDF without FORTRAN are contained in 
the INSTALL file that is included with the anonymous ftp version 
of HDF.


Installing HDF

Details on how to install HDF are beyond the scope of this manual, 
but you can get quite a bit of information about the process from 
the readme files that come with the source code. See the section 
below "How to Get HDF" for information on how to get the  actual 
HDF software.


Transferring HDF Files

HDF files are binary files, so any transfer protocol that transfers 
binary files without changing them can be used to transfer HDF 
files. 

Many HDF users use FTP to transfer HDF files. If you use FTP, 
switch to binary mode when transferring HDF files. 

If you use NCSA Telnet and you wish to transfer an HDF file to or 
from a Macintosh, you must pay special attention to whether or not 
to enable the "Macbinary" option. There are two case to consider:

*  If the HDF file is not from a Macintosh application (e.g., it is a 
normal HDF file generated by your FORTRAN or C program), 
then be sure to turn Macbinary mode off before performing the 
transfer.

*  If the HDF file corresponds to a Macintosh application (e.g., 
NCSA Layout, NCSA DataScope, etc.), and you want to transfer it 
so that it can be accessed from a Macintosh application on 
another Mac, then be sure to turn Macbinary mode on before 
performing the transfer.

How to Get HDF

You may obtain NCSA software via FTP, an archive server, or U.S. 
mail. Instructions for doing so are provided below.


FTP

If you are connected to Internet (NSFNET, ARPANET, MILNET, etc.) 
you may download NCSA HDF software, documentation, and source 
code, at no charge from an anonymous file transfer protocol (FTP) 
server at NCSA. The procedure you should follow to do so is 
presented below. If you have any questions regarding this 
procedure or whether you are connected to Internet, consult your 
local system administration or network expert.

1.  Log on to a host at your site that is connected to the Internet 
and is running software supporting the FTP command.

2.  Invoke FTP on most systems by entering the Internet address 
of the server:

	ftp  ftp.ncsa.uiuc.edu

	or

	ftp  141.142.20.50

3.  Log in by entering anonymous for the name.

4.  Enter your e-mail address for the password.

5.  Enter get README.FIRST to transfer the instructions 
(ASCII) to your local host.

6.  Enter quit to exit FTP and return to your local host.

7.  Review the README.FIRST file for complete instructions 
concerning the organization of the FTP directories and the 
procedure you should follow to download the README files 
that contain further information on how to get and compile 
the most recently released version of HDF for your machine 
and operating system and to determine which files to 
transfer to your home machine.

Your login session should resemble the sample presented below, 
where the remote user's e-mail address is smith@xyz..univ.edu 
and user entries are indicated in boldface type.

harriet_51%  ftp ftp.ncsa.uiuc.edu
Connected to zaphod.
220 zaphod FTP server (Version 4.173 Tue Jan 31 08:29:00 CST 
1989) ready.
Name (ftp.ncsa.uiuc.edu: smith):  anonymous
331 Guest login ok, send ident as password.
Password:  smith@xyz.univ.edu
230 Guest login ok, access restrictions apply.
ftp>  get README.FIRST
200 PORT command successful.
150 Opening ASCII mode data connection for README.FIRST (10283 bytes).
226 Transfer complete.
local:  README.FIRST  remote:  README.FIRST
11066 bytes received in .34 seconds (32 Kbytes/s)
ftp>  quit
221 Goodbye.
harriet_52%

NCSA HDF documentation, program, and source code are now in the 
public domain. You may copy, modify, and distribute these files as 
you see fit.


Archive Server

To obtain NCSA software via an archive server:

1.  E-mail a request to:

	archive-server@ncsa.uiuc.edu

2.  Include in the subject or message line, the word "help."

3.  Press RETURN.

4.  Send another e-mail request to:

	archive-server@ncsa.uiuc.edu

5.  Include in the subject or message line, the word "index."

6.  Press RETURN.

For example, if you use the UNIX mailing system, your login 
session should resemble the following sample, where user entries 
are indicated in boldface type.

yoyodyne_51% mail archive-server@ncsa.uiuc.edu
Subject:  help

.
EOT
Null message body; hope that's ok
yoyodyne_52% mail archive-server@ncsa.uiuc.edu
Subject:  index
.
EOT
Null message body; hope that's ok


The information you receive from both the help and index 
commands will give you further instructions on obtaining NCSA 
software. This controlled-access server will e-mail the 
distribution to you one segment at a time.

U.S.Mail

Like other NCSA software, NCSA HDF is also available for 
purchaseÑeither individually or as part of the anonymous FTP reel 
or cartridge tapesÑthrough the NCSA Technical Resources Catalog. 
Orders can only be processed if accompanied by a check in U. S. 
dollars made out to the University of Illinois. To obtain a catalog, 
contact:

NCSA Documentation Orders
152 Computing Applications Building
605 East Springfield Avenue
Champaign,  IL  61820
(217) 244-0072