Baltimore Ecosystem Study Data and Metadata
 


BES Data Management Projects
  1. BES metadata via the LTER Metacat server. Also see "XML".
  2. Data management policy
  3. Ecological Metadata Language (EML) and text versions of BES metadata with links to the data files. Also see "XML".
  4. Extensible Markup Language (XML)
  5. Geodatabase
If you would like more information, please contact Jonathan M. Walsh.

Forested meteorology setup and laptop, 2007. Photo: Dan Dillon
Overview
 
The Baltimore Ecosystem Study involves the collection and analysis of an abundance of data. These data take many forms and cover a wide variety of spatial and temporal extents. This leads to exciting challenges:
 
  • The safekeeping, quality assurance, storage, and curation of the data;
  • Developing a means to share this large compendium of differing types of data amongst our scientists and with the larger scientific community;
  • Mixing three-dimensional data and point data: neighborhood boundaries, for example, can be mapped with stream nutrient values;
  • Combining social data and physical data: doing so allows us to examine cultural and social constructs alongside biological, geographical, and chemical properties in three dimensions;
  • Documenting the data so that they can be shared, by machines and by humans.
Metadata
 
Metadata can be described as "data about data". It has many definitions; this text describes the definition used for BES. Simply put, in order to produce and use data, we must have a means of describing them. For a simple example, if a photograph is the data, the metadata would tell us where it was taken, what camera was used, and what settings were chosen.
 
For the case of more comprehensive metadata, imagine a set of temperature readings for a stream. The metadata, if it were to be useful, would involve many things: What instrument was used? How was it calibrated? What is the expected quality or accuracy of the readings? How is the data file arranged? What are the units? What is the location of the stream? Additionally, what is the temporal extent of the data - that is, where does it fit in time?
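As a sketch, a metadata record answering the questions above could be captured as a simple structure. The field names and values here are illustrative only, not an actual BES schema:

```python
# Illustrative metadata record for a set of stream temperature readings.
# Field names and values are hypothetical, not an actual BES schema.
stream_temp_metadata = {
    "instrument": "thermistor probe",            # what instrument was used
    "calibration": "two-point, 0 C and 25 C",    # how it was calibrated
    "accuracy_c": 0.1,                           # expected accuracy of readings
    "file_layout": ["timestamp", "temp_c"],      # how the data file is arranged
    "units": {"temp_c": "degrees Celsius"},      # units for each column
    "location": "Gwynns Falls, Baltimore, MD",   # location of the stream
    "temporal_extent": ("1999-01-01", "2006-12-31"),  # where it fits in time
}

def describe(meta):
    """Flatten the machine-readable record into a human-readable summary."""
    return ", ".join(f"{key}: {value}" for key, value in meta.items())

print(describe(stream_temp_metadata))
```

The same record serves two audiences: a program can look up `units` or `temporal_extent` directly, while `describe` renders the whole record for a human reader.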
 
And now let's imagine adding a spatial component to the data. Whether or not the data would be characterized or classified as "spatial" in nature, they still should be georeferenced. That adds a new component to the metadata, that of spatial orientation: Where does it fit in space?
 
At BES we also strive to participate in larger efforts to define a core set of required metadata components so that all of these attributes can be collected for each dataset. This work will lead to an ontology, a set of uniform key terms across ecological science that facilitates collaboration and synthesis.
 
As the systems we use to share information become more sophisticated, and as the number and breadth of datasets increase, the metadata will also increase in sophistication, number, and breadth. Similarly, as those systems become more automated, the production and use of metadata will become more automated as well.
 

Wireless remote sensor in lab, 2007. Photo: Kathy Szlavecz
Metadata need not be written only for humans. Think of the methods section of a scientific paper: that is for humans. But metadata can be created for the exclusive use of machines, too. Just as the methods section of a paper puts the data in context and gives them value, metadata written for machines allows the data to be shared and seen by systems that can combine them with other data, creating new value.
 
In turn, combining metadata with computer logic that can ask the proper questions can surface more complex combinations of information. This is the notion of artificial intelligence, in which a collection of information is examined by a computer-generated "inference engine" that can "learn" and adapt to answer more complex questions.
 
This allows for data discovery and for automated collaboration and synthesis of data among researchers. Extensible Markup Language (XML) has proven useful for this: it lends itself well to discovery, and information written in XML format can also be "filtered" so that it is easily read by humans. XML also enables "semantic discovery" of data, whereby searches can be made using natural language.
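Both uses of XML, machine discovery and human-readable filtering, can be sketched with a small example. The markup below is hypothetical, not actual BES or EML metadata:

```python
import xml.etree.ElementTree as ET

# A hypothetical metadata fragment (not actual BES/EML markup) showing how
# XML can be searched by machines and "filtered" into text for humans.
doc = """
<dataset>
  <title>Stream temperature, Gwynns Falls</title>
  <coverage>
    <temporal start="1999-01-01" end="2006-12-31"/>
    <spatial>Gwynns Falls watershed</spatial>
  </coverage>
</dataset>
"""

root = ET.fromstring(doc)

# Machine discovery: pull out the spatial coverage element by path.
spatial = root.findtext("coverage/spatial")

# Human filtering: flatten the markup into a readable sentence.
temporal = root.find("coverage/temporal")
summary = (f"{root.findtext('title')} covers {spatial} "
           f"from {temporal.get('start')} to {temporal.get('end')}.")
print(summary)
# prints: Stream temperature, Gwynns Falls covers Gwynns Falls watershed
#         from 1999-01-01 to 2006-12-31.
```

The same document answers a machine's path query (`coverage/spatial`) and, with a little formatting, a human's question about what the dataset covers.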
 
If the metadata system understands that a Baltimore neighborhood for which there are nitrogen flux data lies within the Gwynns Falls watershed, then it can handle a query for Gwynns Falls nitrogen flux and identify that data even though the specific metadata for that dataset doesn't even contain the phrase "Gwynns Falls".
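That kind of inference can be sketched as a toy containment lookup. The place names, dataset titles, and functions below are illustrative, not a BES system:

```python
# Toy sketch of spatial inference: if a neighborhood lies within a watershed,
# a query for the watershed should also match datasets tagged only with the
# neighborhood. All names below are illustrative.
contained_in = {
    "Franklintown": "Gwynns Falls",   # neighborhood -> enclosing watershed
    "Dickeyville": "Gwynns Falls",
}

datasets = [
    {"title": "Nitrogen flux, Franklintown", "place": "Franklintown"},
    {"title": "Soil moisture, Herring Run", "place": "Herring Run"},
]

def search(place, datasets):
    """Match datasets tagged with the place itself or with anything inside it."""
    return [d["title"] for d in datasets
            if d["place"] == place or contained_in.get(d["place"]) == place]

print(search("Gwynns Falls", datasets))
# prints: ['Nitrogen flux, Franklintown']
```

The Franklintown dataset matches a "Gwynns Falls" query even though its own metadata never mentions that phrase; the containment table supplies the missing link.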
 
So for the purposes of this section, metadata has a crucial role, and a single definition. Data are not of much value unless accompanied by metadata.
 

GIS data format: ARC/INFO export (*.e00)
Projection/datum/units: UTM Zone 18/NAD 83/meters
Compression: WinZip (pkunzip, Windows XP, and UNIX unzip will also extract the files)

The metadata provided for these data are a subset of the FGDC Content Standard for Digital Geospatial Metadata.

Please contact Jonathan Walsh, BES Information Manager walshj@ecostudies.org with questions or comments.

This research was supported by funding from the NSF Long-term Ecological Research (LTER) Program. This material is based upon work supported by the National Science Foundation under Grant No. 1027188. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

These data are made available to scientific interests that agree to cite the data and their source appropriately. There is a delay between the collection of most of these data and their availability via this website, in order to give the investigator adequate time to present findings in the scientific literature.

Users are requested to notify the publisher of the purpose of data use and acknowledge the source if used in a publication. No commercial use is permitted. The publisher disclaims any responsibility for errors that may exist within the data, and the user assumes all responsibility for errors in analysis or judgment associated with the data.