Guide to FSLDOC Metadata Collection Extension for ArcView

Extension Script and User's Manual

Written by George Lienkaemper


For information about this Extension, please contact George Lienkaemper, FSL 127

Introduction

This extension was originally developed for NOAA. Our modifications of the program were strictly to make the tool more closely match our metadata needs. ArcView is being used by an increasing number of researchers, and having a tool for collecting metadata that runs from within the software makes good sense.

The output product of FSLDOC is FGDC Standard-compliant. At your option three files can be created: an ascii file, an HTML file that can be read by a web browser, and, for themes originating as ARC/Info coverages, an INFO file that travels with the coverage. Output files keep the theme name and take a .met or .htm extension. Additionally, themes from ARC/Info coverages will have embedded letters in the file name to indicate polygon (p), line (l), or point (pt) feature types. Themes originating as ARC/Info regions or routes will be designated by the region or route sub-class name with a (p) for regions or an (l) for routes.

One way that this tool improves on the ARC/Info DOCUMENT tool is that you have an opportunity to add ascii files at numerous steps along the way - as opposed to being forced to struggle with VI or EMACS. You can prepare your narrative in any word processor and save the output to ascii.
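Word processors like to slip in curly quotes, long dashes, and other characters that are not plain ascii. The following is a minimal Python sketch, not part of FSLDOC, for checking a narrative before you read it in. The function names and the replacement table are our own illustration, and the table is far from complete.

```python
# Illustrative helper, not part of FSLDOC: find and clean non-ascii characters.

# Common word-processor substitutions (an assumed, incomplete list).
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
}


def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ascii character."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                problems.append((line_no, col, ch))
    return problems


def to_ascii(text):
    """Swap known characters, then put '?' in place of anything else non-ascii."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    return text.encode("ascii", errors="replace").decode("ascii")
```

Run find_non_ascii on the text of your file first; if it returns an empty list, the file is ready to use. Any '?' left behind by to_ascii marks a spot you should fix by hand.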

The program collects metadata in sections, corresponding to the sections in the metadata Standard. You have the option of interrupting your documentation process at the end of each section. When you resume, collection will begin where you left off. If you leave the program without completing a section, that section will have to be done over.

The program creates and accesses numerous .dbf files in your "working directory". The working directory is set in the Project-Properties menu. If you have many themes in different projects to document, you'll be well served to establish one working directory. We hope you will use /data/cascade/metadata as your working directory.

The .dbf files are accessed each time you run a particular section and earlier entries are available as short cuts (very handy for contact name and addresses or cross-references to other data sets). A number of standard entries have been set up in the /data/cascade/metadata working directory. In many cases you'll be asked if the information should be read from an existing file. The reference in these cases is to an ascii file you've prepared in advance. In almost every case we recommend you use this option (in some cases, it's the only option), primarily because we've found it's easier to prepare complete and well thought-out responses at a word processing prompt than trying to fit it into a message box. AND if you ever have to reproduce your entries, you have them at hand.

Finally, as onerous as it seems, we recommend that you have at your fingertips the "Content Standard for Digital Geospatial Metadata" produced by the Federal Geographic Data Committee. Having it available will not only help clarify what sort of responses are expected for the elements, but will also provide a better understanding of how the Standard is structured.

See the Help window for additional information.


The Extension

FSLDOC can be added to any project from the File-Extensions menu. (For the time being, we recommend that you load the extension after the View is established in your project and then unload it before you Exit or Save. That way, any changes made to FSLDOC between ArcView sessions won't wreck your project.) The tools are run from the View document. You should have themes in the View and the map units set in View-Properties. There are three new buttons associated with FSLDOC. They appear at the far right end of the button bar. The BUG runs the metadata collection program, the "H" brings up the Help Screen, and the "Glasses" provide a look at the ascii output file. Click on the BUG to start.

TIP!! - Move around each Input Box with the Tab key. Return is the same as OK and there is no provision for backing up. Slow Down until you have some muscle memory for the Tab.

Read the Helpful Hints; then select the theme to document from the list.

 

Section 1 Identification Information

Collect Metadata - most of the time this will be the first time you collect metadata for this theme. However, if you bailed out of the program or just stopped collecting after a specific section, how you answer this question may be important. If you say "yes", the program looks for the appropriate .met file, searches for the sections you've completed, and begins at the next section. If you say "no", the program assumes you want to start over. (Don't worry, there are chances to change your mind.)
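If you're curious where the program would pick up, here is a rough Python sketch of the idea. It is not the extension's actual code. It assumes the .met file carries the seven section headings under their FGDC element names, each on its own line, and it treats a heading's presence as meaning the section was completed; both are assumptions on our part.

```python
# Illustrative only: guess the next section from the headings found in a .met file.

# FGDC section element names, in Standard order (assumed to appear as headings).
SECTIONS = [
    "Identification_Information",
    "Data_Quality_Information",
    "Spatial_Data_Organization_Information",
    "Spatial_Reference_Information",
    "Entity_and_Attribute_Information",
    "Distribution_Information",
    "Metadata_Reference_Information",
]


def next_section(met_text):
    """Return the 1-based number of the first section not found, or None if all are present."""
    # Compare whole lines, ignoring indentation and a trailing colon.
    headings = {line.strip().rstrip(":") for line in met_text.splitlines()}
    for number, name in enumerate(SECTIONS, start=1):
        if name not in headings:
            return number
    return None
```

For example, a file holding only the Section 1 and Section 2 headings would return 3.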

Publication Date and Title - for our purposes 'publication' is when the data set is ready for general use. There will be some apparent conflicts later, but it's mostly semantics.

Originator - usually that's you. If you're documenting data for another researcher, enter the name of the person who developed the data set. Use your username as the Originator_Id.

Publication Information - if these data are actually published in a paper, dissertation, or thesis, enter the appropriate information. Other possible responses are the WWW or unpublished.

Other Citation Details - this is the place to enter the full citation to your paper, dissertation, or thesis. Put it in an ascii file and read it in here.

On-Line Linkage - if you have this data set on the web (or plan to) here is the place to say so.

Abstract - you can prepare an ascii file containing your abstract or you can enter one into the input box.

Purpose - see abstract.

Procedures - this is the big one! You'll need to prepare an ascii file for this element. Essentially this is a narrative of how this data set was developed. From field collection to the current documentation process, you should outline the steps you took and the analyses you did. Ostensibly this outline would allow someone else to repeat your steps and get the same results. That may not truly be possible, but it is the goal. Having a copy of the ARC/Info log file for your theme can be very helpful here. The program will actually let you skip this step, but no one will pass you when they review your metadata if you've skipped this step.

Revisions - if you've made several revisions of this data set, here is the place to tell all. This does not mean listing each Arcedit session while you were editing your data. This step is most important if the data have been available for use by others and then revised in some way. Use an ascii file if you have several revisions. Otherwise fill in the blanks. If it's version one, say so.

Reviews - if someone else has reviewed your data set, list who did it and when in an ascii file. A review includes inspection of the LOG file for completeness and conformance with the steps outlined in the Procedures element, verification of table and item identities and definitions, validity of the reference data sets and citations and review of any quality assurance measures performed on the data set.

Related Tables - if you make use of related INFO files or linked .dbf files or make significant use of lookup tables in relation to this data set, list the full path and description of how the table is used (you can document the file later).

References - read in an ascii file with any references that are significant in the development of the data set. They should be in a standard citation format.

Time Period - Dates should be YYYYMMDD, year only is ok, so is unknown.
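If you keep your dates in a notes file, a quick check before you start typing can save a do-over. The Python sketch below is our own illustration, not part of FSLDOC. It accepts only the three forms named above: a full YYYYMMDD date, a four-digit year, or the word unknown.

```python
# Illustrative only: check a Time Period entry against the forms listed above.
import re
from datetime import datetime


def is_valid_time_period(value):
    """True for YYYYMMDD (a real calendar date), a four-digit year, or 'unknown'."""
    value = value.strip()
    if value.lower() == "unknown":
        return True
    if re.fullmatch(r"[0-9]{4}", value):
        return True
    if re.fullmatch(r"[0-9]{8}", value):
        try:
            # strptime rejects impossible dates such as February 30.
            datetime.strptime(value, "%Y%m%d")
            return True
        except ValueError:
            return False
    return False
```

So 19960815, 1996, and unknown all pass, while 08/15/1996 and 19960230 do not.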

Currentness Reference - sort of an odd way to put it, but the bottom line is "the data are current based on _____". Examples might be 'field conditions', 'publication date', 'available data'.

Maintenance and Update Frequency - see Standard - irregular is popular.

Status - choose

Spatial Domain - if you choose to report this in geographic coordinates, remember that by default ArcView selects the GRS1980 spheroid (NAD83). If you have data in NAD27, you'll need to use a custom projection - including the 500000 False Easting.

Keywords - you're on your own - think of how you'd search for these data.

Constraints - note any legal constraints for access or use of the data.

Contact - who to contact about the data set. The owner or someone who can answer questions about the data.

Credit - here is a chance to list individuals or organizations who played a significant role in developing the data set. We suggest that you edit a copy of the file /data/cascade/metadata/credit.txt, which includes an admonition to users of the data to properly cite the data set. Read this information in as an ascii file.

Cross Reference - here you can reference other data sets that are likely to be of interest.

Section 1 is complete!

In review here are the files you should have prepared in advance:

  

Section 2 Data Quality Information

Attribute Accuracy - if you've made some evaluation of the accuracy of your attribute value assignments, explain here.

Logical Consistency - an explanation of the fidelity of the relationships in the data set and the tests used - that's what the standard says! Try this on for size: Logical consistency verified with ARC/Info BUILD command.

Completeness - information about omissions, selection criteria, generalization, definitions used, and other rules used to derive the data set.

Horizontal and Vertical Positional Accuracy - if you have values for these elements from GPS or other sources, report them here.

Lineage - the #2 biggie.

Identify source data used to construct the data set - you'll need the log file for your data set and perhaps some of those referenced in your lineage. As we get started you may not find your data sources listed in the cross-reference table, but as we develop more metadata, some of those layers will begin to appear. Also remember that these sources do not have to be digital data - many sources are paper maps. Nor do the sources even need to be maps - x,y coordinates are the source of many maps. EVERY DATA LAYER HAS A SOURCE!!!

You'll want to know the source scale, dates, currentness reference, and contribution to the data set. You'll also want to detail the processing steps applied to the source information that led to the data layer you're now documenting. This is where the log file comes in. You can actually cut and paste the steps listed in the log file, but you might want to add a little more explanation of what was done. If you are adding a significant process step that might require more explanation, be sure to list a Process Step Contact. Primarily this contact is the person who can explain what was done and why. If the same person does all the processing, it's all right to add them as a contact after the last Process Step.

Cloud Cover - important only for images.

Section 2 is Complete.

Information you'll need to have in hand for section 2:

  

Section 3 Spatial Data Organization

Relax!! It's automatic! 

 

Section 4 Spatial Reference Information

If you're using an ARC/Info coverage with a projection defined, the program will automatically access it. UTM Zone 10 is already on the list of defined projections. For shapefiles you'll have to define the projection using ArcView. Remember, the software will default to the GRS1980 spheroid.

Resolution - most significant for grids or images, but go ahead and enter 1 for vector data. Units are captured automatically.

Section 4 is complete.

All you need is to know your projection parameters.

 

Section 5 Entity and Attribute Information

We consider this to be a very important part of the documentation process. Here is where you record what an attribute represents and what the codes mean.

You'll be prompted by message boxes for all the information in this section. There are a couple of quirks that may be ironed out eventually, but that you'll have to put up with for a while.

You will not be prompted for definitions of standard attributes that are created with ARC/Info coverages and ArcView shapefiles, but you will be prompted for measurement units. If there are none, click on CANCEL.

Enter the Attribute Definition - you have 50 characters, so don't be fooled by the length of the line in the input box.

If the attribute is coded, you'll get an input box that will accommodate 10 codes and descriptions. This is an arbitrary number. If you have more codes than 10, the solution right now is to edit the output file to add them. If the attribute is not coded, you'll be prompted for measurement units. If there are none, click on CANCEL.
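Since there is no backing up once you hit Return, it can help to check your notes against these two limits before you begin. This Python sketch is our own illustration, not part of FSLDOC; the function name and the VEGTYPE attribute in the usage note are made up for the example.

```python
# Illustrative only: check attribute notes against the limits described above.

MAX_DEFINITION_CHARS = 50   # length of the Attribute Definition entry
MAX_CODES = 10              # codes that fit in the coded-attribute input box


def check_attribute(name, definition, codes=None):
    """Return a list of warnings for one attribute; an empty list means it fits."""
    warnings = []
    if len(definition) > MAX_DEFINITION_CHARS:
        over = len(definition) - MAX_DEFINITION_CHARS
        warnings.append(
            f"{name}: definition is {over} character(s) over the "
            f"{MAX_DEFINITION_CHARS}-character limit"
        )
    if codes and len(codes) > MAX_CODES:
        extra = len(codes) - MAX_CODES
        warnings.append(
            f"{name}: {extra} code(s) beyond the first {MAX_CODES} "
            f"must be added to the output file by hand"
        )
    return warnings
```

Calling check_attribute("VEGTYPE", "Dominant vegetation type") returns an empty list, because the definition is under 50 characters and no codes were given.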

Next you can choose to document any associated INFO or .dbf file. The same procedures apply.

Section 5 is complete.

Helpful information for this section is all those notes you've been keeping to help you remember your data codes. Also if you've used the fsldoc.aml, bring the output file up in another window so you can cut and paste that information you've already struggled to type in.

 

Section 6 Distribution Information

The Contact that is entered here is the person who should be contacted to get this data set. Generally this will be the person who sees that the data are prepared for transfer, who will move it to the transfer site, and who will follow through in making sure the transfer is complete. Generally it is the 'data librarian' for your project. Whomever you put in this contact screen, make sure they know they've been tapped!

Resource information - the resource description is the path to where the data set actually resides.

Distribution liability - you don't want any! We have a disclaimer hard coded into the 'Standard'. We may develop a stronger one.

Format - see the Standard for the full domain of choices. ARCE is ARC/Info export format.

Network address - this is the address of our ftp server.

Network resource - the ftp directory where the files will reside for transfer.

Access Instructions - tell the potential receiver of the data how to get it. Anonymous ftp should be good enough, but if you want the owner to be informed of the transfer, here is the space to say so.

Fees - in general we don't charge for data.

If you're willing to make the data available off-line you can put the specifications in on the Media Information message box. The usual recording format is tar.

Section 6 is complete.

We suggest that you stop here and not go on to Section 7 until you've had some sort of review of your metadata effort. Once you're satisfied that the metadata are correct and complete, you can finish by adding Section 7.

 

Section 7 Metadata Reference Information

Enter the Date you completed the metadata collection and the date of the review.

Next, enter the pertinent information about YOU, the documenter of the data set.

Finally you get to choose if you want to create an HTML file or an INFO file attached to an ARC/Info coverage.



