5.1 NCSA HDF Calling Interfaces and Utilities Annotating Data Objects and Files 5.1 National Center for Supercomputing Applications March 1993

Chapter 5 Annotating Data Objects and Files

Chapter Overview
Annotation Tags
File Annotations
HDF Object Annotations
Tags and Reference Numbers
DFTAG_NDG, DFTAG_SDG, and backward compatibility
The Annotation Interface
Writing Annotations for HDF Objects
Reading Annotations for HDF Objects
Listing All Labels for a Given Tag
Writing Annotations for HDF Files
Reading Annotations for HDF Files
Getting Annotation Information from a File

Chapter Overview

This chapter describes the routines that are available for storing and retrieving data and file annotations.

Annotation Tags

It is often useful to associate textual information with an HDF file and its data contents, and to keep that information in the same file that contains the data. HDF provides this capability in the form of annotations. An HDF annotation is a sequence of ASCII characters that is associated with one of three types of objects:

1. the file itself,
2. the individual HDF data objects in the file, or
3. the tags that identify the data elements.

The current annotation interface supports only the first two types of annotation.

HDF annotations can accommodate a wide variety of types of information, including titles, comments, variable names, parameters, formulas, and source code. Any textual information that a user might normally put into a notebook concerning the collection, meaning, or use of a file or data can be put into a file's annotations. Annotations are optional; they are supplied by the creator or a user of an HDF file or data object.
Annotations come in two forms: labels and descriptions, defined as follows:

label          a sequence of ASCII characters, except the character NULL (0) (see footnote 1).
description    any sequence of characters, including NULL.

File Annotations

Any HDF file can have labels (called file IDs) and descriptions stored in it. There are routines in the annotations interface specifically designed for reading and writing file IDs and file descriptions.

HDF Object Annotations

The annotation of HDF data objects is complicated by the fact that you have to uniquely identify the objects being annotated. In order to understand how annotations work for data objects, you need to know a little bit about how HDF data objects are structured and identified within an HDF file.

HDF data objects are the basic building blocks of HDF files. An HDF data object has two parts: a 12-byte Data Descriptor (DD) and a data element. A DD has four fields: a tag, a reference number, a 32-bit data offset, and a 32-bit data length. (The latter two are unimportant here.)

Taken together, the tag and reference number for a data object uniquely identify that object. Hence, the data object that a particular annotation refers to can be identified by storing the object's tag and reference number together with the annotation.

Note that an HDF annotation is itself a data object, so it has its own DD. This DD has a tag and a ref number, and it points to the data element that constitutes the annotation. The data element that goes with an annotation contains three things:

* the tag of the object that it is an annotation for,
* the ref of the object that it is an annotation for, and
* the annotation itself.

For example, suppose you have an HDF file that contains three scientific datasets (SDS). Each SDS has its own DD containing the SDS tag DFTAG_NDG and a unique reference number, as illustrated in Figure 5.1.

ED. NOTE: Figures are not available in this plain text version of the specification.
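The layout described above can be sketched in C. This is an illustration only, not the library's own code: the names dd_t, put_u16, and pack_annotation are hypothetical, and the sketch assumes that the tag and ref of the annotated object are stored as 2-byte integers, high byte first, ahead of the annotation text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the four DD fields named in the text (12 bytes on disk). */
typedef struct {
    uint16_t tag;     /* kind of data object, e.g. 720 for DFTAG_NDG */
    uint16_t ref;     /* reference number, unique for a given tag    */
    uint32_t offset;  /* 32-bit offset of the data element           */
    uint32_t length;  /* 32-bit length of the data element           */
} dd_t;

/* Store a 16-bit value high byte first (assumed byte order). */
static void put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFF);
}

/*
 * Build an annotation data element: the tag and ref of the annotated
 * object, followed by the annotation text itself.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t pack_annotation(unsigned char *out, size_t outsize,
                       uint16_t obj_tag, uint16_t obj_ref,
                       const char *text, size_t textlen)
{
    if (outsize < 4 + textlen)
        return 0;
    put_u16(out, obj_tag);
    put_u16(out + 2, obj_ref);
    memcpy(out + 4, text, textlen);
    return 4 + textlen;
}
```

Under these assumptions, annotating the object with tag 720 and ref 4 produces a data element whose first four bytes identify that object and whose remaining bytes are the text.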
Figure 5.1 Three SDS Tags with Their Ref Numbers

Suppose you wish to annotate the second SDS by storing the following annotation with it in the file:

"Data from black hole experiment 8/18/87."

This text would be stored in an HDF file as an annotation, and it would have stored with it the tag DFTAG_NDG and reference number 4. Figure 5.2 illustrates how the annotation would look in the file.

Figure 5.2 Displayed Example of SDS, Ref #, and Annotation

Tags and Reference Numbers

Note that in order to use annotation routines, you need to know the tags and reference numbers of the objects you wish to annotate. Tag numbers are listed in Appendix A, "NCSA HDF Tags."

Special routines are available for obtaining the reference numbers of certain tags, including tags for SDSs, Raster Image Sets, palettes, and annotations. These are: DFSDlastref, DFR8lastref, DFPlastref, and DFANlastref. They return the most recent reference number used in either reading or writing the corresponding data object.

Reference numbers for objects other than these can be obtained with the routine DFfindnextref, a general purpose HDF routine. Usage of DFfindnextref is illustrated later in an example. See this chapter's section, "Example: Reading a Label and Description."

DFTAG_NDG, DFTAG_SDG, and backward compatibility

In versions of HDF that predate HDF 3.2, an SDS could only support 32-bit floating point numbers and native Cray floating point numbers. The HDF tag that identified the old (pre-HDF 3.2) SDS was DFTAG_SDG. In order to support several new number types and at the same time make it easier to add new features (e.g. data compression) to future versions of SDS, a new structure for the SDG (scientific data group) was implemented. The new structure is called NDG, for "numeric data group", and has the tag DFTAG_NDG (720).
(The NDG structure is described in detail in the manual "HDF Specifications.")

If you have used the annotation interface to annotate Scientific Data Sets with versions of HDF that precede HDF 3.2, chances are you associated your annotation with DFTAG_SDG. DFTAG_SDG is still associated with all SDSs that contain 32-bit floats, but DFTAG_NDG is associated with all SDSs. Therefore, as illustrated in several examples in this chapter, it is now recommended that you use DFTAG_NDG for attaching annotations to SDSs.

On the other hand, there may be situations in which you would want to have your program look for SDS annotations associated with DFTAG_SDG as well as DFTAG_NDG. These include:

* if SDS annotations might have been written using an older version of the HDF library, or
* if SDS annotations might have been written using a new version of the HDF library, but by a program that used DFTAG_SDG rather than DFTAG_NDG.

The Annotation Interface

The HDF library provides two types of routines for storing and retrieving annotations: (1) routines for file IDs and file descriptions, and (2) routines for HDF data objects. Table 5.1 lists the C and FORTRAN names of annotation routines currently contained in the HDF library. The following sections provide descriptions and examples of these calling routines.

Table 5.1 Long and Short Names for Annotation Routines

C Name            FORTRAN Name    Purpose
DFANputlabel      daplab          puts label of tag/ref.
DFANputdesc       dapdesc         puts description of tag/ref.
DFANgetlablen     dagllen         gets length of label of tag/ref.
DFANgetlabel      daglab          gets label of tag/ref.
DFANgetdesclen    dagdlen         gets length of description of tag/ref.
DFANgetdesc       dagdesc         gets description of tag/ref.
DFANlablist       dallist         gets list of labels for a particular tag.
DFANaddfid        daafid          adds file ID.
DFANaddfds        daafds          adds file description.
DFANgetfidlen     dagfidl         gets file ID length.
DFANgetfid        dagfid          gets file ID.
DFANgetfdslen     dagfdsl         gets file description length.
DFANgetfds        dagfds          gets file description.
DFANlastref*      dalref          returns ref of last annotation read or written.

Writing Annotations for HDF Objects

DFANputlabel

FORTRAN:
INTEGER FUNCTION daplab(filename, tag, ref, label)
CHARACTER*(*) filename    - name of HDF file to put label in
CHARACTER*(*) label       - label to write to the file
INTEGER tag, ref          - tag/ref of item whose label you want to store

C:
int DFANputlabel(filename, tag, ref, label)
char *filename;     /* name of HDF file to put label in */
uint16 tag, ref;    /* tag/ref of item whose label you want to store */
char *label;        /* label to write to the file */

Purpose: To write out a label for the data object with the given tag/ref.

Returns: 0 on success; -1 on failure.

DFANputdesc

FORTRAN:
INTEGER FUNCTION dapdesc(filename, tag, ref, desc, desclen)
CHARACTER*(*) filename    - name of HDF file to put descr in
CHARACTER*(*) desc        - description to write to the file
INTEGER tag, ref          - tag/ref of item whose description you want to store
INTEGER desclen           - length of description

C:
int DFANputdesc(filename, tag, ref, desc, desclen)
char *filename;     /* name of HDF file descr stored in */
uint16 tag, ref;    /* tag/ref of item whose descr you want to store */
char *desc;         /* description to write to file */
int32 desclen;      /* length of description */

Purpose: To write out a description for the data object with the given tag/ref.

Returns: 0 on success; -1 on failure.

The parameter desclen gives the length of the description that is to be written out. This is necessary because a description can contain NULL characters, so there is no simple way to determine its length unless the length is given explicitly.

Example: Adding Annotations to a Scientific Dataset

The example in Figure 5.3 illustrates the use of DFANputlabel and DFANputdesc to write to a file a label and description for an SDS. The HDF object that contains an SDS is called a numeric data group (NDG). An NDG is a group of tag/refs that make up an SDS.
The tag for an NDG is DFTAG_NDG; its numeric value is 720.

Figure 5.3 Adding Annotations to an SDS

FORTRAN:
      integer dsadata, daplab, dapdesc, dslref
      integer ret, lref, shape(2), DFTAG_NDG
      real*4 dataset(2,5)
      parameter (DFTAG_NDG = 720)
      ...
      ret = dsadata('myfile',2,shape,dataset)
      lref = dslref()
      ret = daplab('myfile', DFTAG_NDG, lref, 'testlab')
      ret = dapdesc('myfile', DFTAG_NDG, lref,
     $              'This is a test', 14)

C:
#include "hdf.h"
...
int lref, rank, dimsizes[2];
char s[50];
float *data;
...
DFSDadddata("myfile",rank,dimsizes,data);
...
sprintf(s,"Data from black hole experiment\n8/18/87");
/* ref of the SDS just written; it completes the tag/ref pair */
lref = DFSDlastref();
DFANputlabel("myfile", DFTAG_NDG, lref, "black hole");
DFANputdesc("myfile", DFTAG_NDG, lref, s, strlen(s));

Remarks:

* The call to dslref (DFSDlastref) returns the reference number of the SDS last written to the file. This call is needed to complete the tag/ref combination that uniquely identifies the scientific data group being annotated.
* The file hdf.h that is included with the C source contains the number that corresponds to the tag for an SDS. In the FORTRAN program, this number is defined using a parameter statement. Tags and their numbers are listed in Appendix A, "NCSA HDF Tags."

Reading Annotations for HDF Objects

DFANgetlablen

FORTRAN:
INTEGER FUNCTION dagllen(filename, tag, ref)
CHARACTER*(*) filename    - name of HDF file label is stored in
INTEGER tag, ref          - tag/ref of item whose label you want

C:
int32 DFANgetlablen(filename, tag, ref)
char *filename;     /* name of HDF file label is stored in */
uint16 tag, ref;    /* tag/ref of item whose label you want */

Purpose: To get the length of a label of the data object with a given tag and reference number. This routine allows you to ensure that there is enough space allocated for a label before actually loading it.

Returns: The length of label on success; -1 on failure.
DFANgetlabel

FORTRAN:
INTEGER FUNCTION daglab(filename, tag, ref, label, maxlen)
CHARACTER*(*) filename    - name of HDF file label is stored in
CHARACTER*(*) label       - space to return label in
INTEGER tag, ref          - tag/ref of item whose label you want
INTEGER maxlen            - size of space to return label in

C:
int DFANgetlabel(filename, tag, ref, label, maxlen)
char *filename;     /* name of HDF file label is stored in */
uint16 tag, ref;    /* tag/ref of item whose label you want */
char *label;        /* space to return label in */
int32 maxlen;       /* size of space to return label in */

Purpose: To read in the label of the data object with the given tag and reference number.

Returns: 0 on success; -1 on failure.

The parameter maxlen gives the amount of space that is available for storing the label. The value of maxlen must be at least one greater than the anticipated length of the label, because a NULL byte is appended to the annotation.

DFANgetdesclen

FORTRAN:
INTEGER FUNCTION dagdlen(filename, tag, ref)
CHARACTER*(*) filename    - name of HDF file descr is stored in
INTEGER tag, ref          - tag/ref of item whose descr you want

C:
int32 DFANgetdesclen(filename, tag, ref)
char *filename;     /* name of HDF file descr is stored in */
uint16 tag, ref;    /* tag/ref of item whose descr you want */

Purpose: To get the length of a description of the data object with the given tag and reference number. This routine allows you to ensure that there is enough space allocated for a description before actually loading it.

Returns: The length of description on success; -1 on failure.
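The maxlen rule stated for DFANgetlabel above can be illustrated without the library. The helper below, copy_label, is hypothetical and is not how DFANgetlabel is implemented. It only mimics the stated buffer rule, under the assumption that at most maxlen - 1 characters are kept so that a NULL byte can always be appended.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in that mimics the documented buffer rule:
 * copy at most maxlen-1 bytes of a stored label into buf, then
 * append the terminating NULL byte.
 * Returns the number of label bytes kept.
 */
size_t copy_label(char *buf, size_t maxlen,
                  const char *stored, size_t storedlen)
{
    size_t n;

    if (maxlen == 0)
        return 0;               /* no room even for the terminator */
    n = (storedlen < maxlen - 1) ? storedlen : maxlen - 1;
    memcpy(buf, stored, n);
    buf[n] = '\0';
    return n;
}
```

With the 10-character label "black hole", a buffer with maxlen of 10 keeps only 9 characters in this sketch, whereas a maxlen of 11 (one more than the label length) keeps the whole label.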
DFANgetdesc

FORTRAN:
INTEGER FUNCTION dagdesc(filename, tag, ref, desc, maxlen)
CHARACTER*(*) filename    - name of HDF file descr is stored in
CHARACTER*(*) desc        - space to return description in
INTEGER tag, ref          - tag/ref of item whose descr you want
INTEGER maxlen            - size of space to return descr in

C:
int DFANgetdesc(filename, tag, ref, desc, maxlen)
char *filename;     /* name of HDF file descr is stored in */
uint16 tag, ref;    /* tag/ref of item whose descr you want */
char *desc;         /* space to return description in */
int32 maxlen;       /* size of space to return descr in */

Purpose: To read in the description of the data object with the given tag and reference number.

Returns: 0 on success; -1 on failure.

The parameter maxlen gives the amount of space that is available for storing the description. The value of maxlen must be at least one greater than the anticipated length of the description, because a NULL byte is appended to the annotation.

Example: Reading a Label and Description

This example program (Figure 5.4) illustrates the use of DFANgetlabel, DFANgetdesclen, and DFANgetdesc to read from an HDF file a label and description for an SDS.

Figure 5.4 Getting Annotations from an SDS

FORTRAN:
      program getanntest
C Program to test routines for reading a label and description
      integer daglab, dagdesc, dagdlen, dslref, dsgdata
      integer desclen, lastref, ret, dims(2), rank
      real*4 data(2,5)
      integer DFTAG_NDG
      character*20 label
      character*400 desc
      parameter (DFTAG_NDG = 720 )

C************** find ref of first SDS in file ****************
C     rank and dims describe the shape of the data buffer
      rank = 2
      dims(1) = 2
      dims(2) = 5
      ret = dsgdata('myfile', rank, dims, data)
      lastref = dslref()

C************** get label, then description ****************
      ret = daglab('myfile', DFTAG_NDG, lastref, label, 20)
      print *,'Label: ', label
      desclen = dagdlen('myfile', DFTAG_NDG, lastref)
      if (desclen .gt. 400) then
         print *, 'Description too long. More than 400.'
         stop
      endif
      ret = dagdesc('myfile',DFTAG_NDG, lastref, desc, desclen)
      print *,'Description: ', desc
      stop
      end

C:
/*
 * Program to test routines for reading a label and description
 */
#include <stdio.h>
#include <stdlib.h>
#include "hdf.h"

main()
{
    uint16 lastref;
    char label[20], *desc;
    float data[2][5];
    int rank = 2, dims[2] = {2, 5};   /* shape of the data buffer */
    int desclen;

    /*** read first SDS in file, then get its ref ***/
    DFSDgetdata("myfile", rank, dims, data);
    lastref = DFSDlastref();

    /*** get label, then description ***/
    DFANgetlabel("myfile", DFTAG_NDG, lastref, label, 20);
    printf("Label: %s\n", label);

    desclen = DFANgetdesclen("myfile", DFTAG_NDG, lastref);
    if (desclen < 0) {
        printf("Error reading description length.\n");
        exit(1);
    }
    /* one extra byte for the terminating NULL */
    desc = (char *) malloc(desclen+1);
    if (desc == NULL) {
        printf("Unable to allocate space for description.\n");
        exit(1);
    }
    DFANgetdesc("myfile", DFTAG_NDG, lastref, desc, desclen+1);
    desc[desclen] = '\0';   /* make sure the buffer is terminated before printing */
    printf("Description: %s\n", desc);
    free(desc);
}

Remarks:

* This example finds the ref number of the first SDS in the file by reading the SDS with DFSDgetdata (dsgdata) and then calling DFSDlastref (dslref). The lower level routines Hopen, Hclose, and DFfindnextref offer a more general way to find the ref number of the first occurrence of a tag such as the NDG (numeric data group) tag.
* In the above example, the buffer passed to DFANgetlabel is 20 bytes long, so the label is assumed to fit in that space. If the program needs to know the length of a label, it can call DFANgetlablen to find this out.
* Since the description could be very long, the routine DFANgetdesclen is called to find the space requirements for the description. In the C program, this space is allocated before calling DFANgetdesc to get the description. In the FORTRAN program, it is assumed to be 400 bytes or less.

Listing All Labels for a Given Tag

DFANlablist

FORTRAN:
INTEGER FUNCTION dallist(filename, tag, reflist, labellist, listsize, maxlen, startpos)
CHARACTER*(*) filename     - name of HDF file labels stored in
CHARACTER*(*) labellist    - array of strings to place labels in
INTEGER tag                - tag to find labels for
INTEGER reflist(*)         - array to place refs in
INTEGER listsize           - size of ref and label lists
INTEGER maxlen             - maximum length allowed for label
INTEGER startpos           - entries will be returned beginning from the startpos entry up to the listsize entry.
C:
int DFANlablist(filename, tag, reflist, labellist, listsize, maxlen, startpos)
char *filename;      /* name of HDF file labels stored in */
uint16 tag;          /* tag to find labels for */
uint16 reflist[];    /* array to place refs in */
char *labellist;     /* array of strings to place labels in */
int listsize;        /* size of ref and label lists */
int maxlen;          /* maximum length allowed for label */
int startpos;        /* entries are returned starting from the startpos entry up to the listsize entry */

Purpose: To return a list of all reference numbers and labels (if labels exist) for a given tag.

Returns: The number of reference numbers found on success; -1 on error.

The parameter listsize gives the number of available entries in the ref and label lists, maxlen is the maximum length allowed for a label, and startpos tells which label to start reading for the given tag. (If startpos is 1, for instance, all labels will be read; if startpos=4, all but the first 3 labels will be read.)

On return, reflist contains a list of reference numbers of all objects with a given tag, and labellist contains a corresponding list of labels, where they exist. If there is no label stored for a given object, the corresponding entry in labellist is an empty string.

Taken together, the reflist and labellist returned by DFANlablist constitute a directory of all objects and their labels (where they exist) for a given tag. The list labellist can be displayed to show all of the labels for a given tag, or it can be searched to find the ref of a data object with a certain label. Once the ref for a given label is found, the corresponding data object can be accessed by invoking other HDF routines. Hence, this routine provides you with a mechanism for direct access to data objects in HDF files.

Example: Getting a List of Labels for SDSs in a File

The example in Figure 5.5 illustrates the use of DFANlablist to get a list of all labels used for SDSs in an HDF file.
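Separately from the listing in Figure 5.5, the search mentioned above (finding the ref of a data object with a certain label) can be sketched as follows. The function find_ref_by_label is hypothetical and is not part of the HDF library. It assumes that labellist is one flat character array in which entry i occupies its own slot of maxlen bytes, starting at labellist + i*maxlen, and that each entry is NULL-terminated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Search a ref list and its matching label list for a given label.
 * Assumes labellist holds nlabels NULL-terminated strings, each in its
 * own slot of maxlen bytes (slot i starts at labellist + i*maxlen).
 * Returns the matching ref, or -1 if no label matches.
 */
int find_ref_by_label(const uint16_t *reflist, const char *labellist,
                      int nlabels, int maxlen, const char *wanted)
{
    int i;

    for (i = 0; i < nlabels; i++) {
        if (strcmp(labellist + (size_t)i * maxlen, wanted) == 0)
            return reflist[i];
    }
    return -1;
}
```

The ref found this way, paired with the tag that was passed to the listing call, would then identify the data object to read with other HDF routines.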
Figure 5.5 Getting a List of Labels from a File

FORTRAN:
      program getlablist
      integer dallist
      integer i, nlabels, startpos, listlen, reflist(20)
      integer DFTAG_NDG, LISTSIZE, MAXLEN
      character*15 labellist(20)
      parameter (DFTAG_NDG = 720,
     *           LISTSIZE = 20,
     *           MAXLEN = 15 )

      startpos = 1
      nlabels = dallist('myfile',DFTAG_NDG, reflist,
     *                  labellist, LISTSIZE, MAXLEN, startpos)
      do 100 i=1,nlabels
         print *,' Ref number: ',reflist(i),
     *           ' Label: ',labellist(i)
 100  continue
      stop
      end

C:
#include "hdf.h"
#define LISTSIZE 20
#define MAXLEN 15

main()
{
    int i, nlabels, startpos=1, listlen=10;
    uint16 reflist[LISTSIZE];
    char labellist[MAXLEN*LISTSIZE+1];

    nlabels = DFANlablist("myfile",DFTAG_NDG, reflist, labellist,
                          listlen, MAXLEN, startpos);
    for (i=0; i
    ...
    = 0) {
        ret = DFANgetfid(fileid,inlabel, MAXLABLEN, NOTFIRST);
        printf("\nLabel: %s", inlabel);
        length = DFANgetfidlen(fileid, NOTFIRST);
    }

    /* read description length and description from file */
    length = DFANgetfdslen(fileid, FIRST);
    ret = DFANgetfds(fileid,indescr, MAXDESCLEN, FIRST);
    printf("\n\nDescription: \n%s\n", indescr);

    Hclose(fileid);
}

Remarks:

* These annotations are associated with the file, not with any particular object within the file.
* The general purpose routines Hopen and Hclose are used here. The file annotation routines do not open and close HDF files for you; you must do it explicitly. The value DFACC_READ is defined in hdf.h, hence the #include "hdf.h" in the C program. It is assumed that the FORTRAN compiler cannot perform such an include, so DFACC_READ is defined with a PARAMETER statement.
Getting Annotation Information from a File

DFANlastref

FORTRAN:
integer dalref()

C:
int DFANlastref()

Purpose: To return the most recent reference number of a written or read annotation.

Returns: The reference number on success; -1 on error.

1 Since NULL is used to terminate C strings, its use in the middle of a label has been ruled out.

* DFANlastref is callable only by C routines. There is no equivalent FORTRAN routine in the HDF library.