The USGS Science Data Catalog:
- Meets White House Open Data reporting requirements for USGS
- Provides a search and discovery tool that allows for metadata retrieval, visualization, download, and linking back to original data providers
- Offers a single source for USGS to serve its metadata to data.doi.gov, Data.gov, Office of Management and Budget (OMB), etc.
- Helps ensure that USGS metadata meet minimum requirements
- Supports data managers in applying the Publish/Share element of the USGS Science Data Lifecycle Model
- Serves as a member node to the NSF Sponsored DataONE Project
Frequently Asked Questions (FAQ) about the Science Data Catalog
See the section, "How to Contribute to the Science Data Catalog" on the Help Page or contact firstname.lastname@example.org for assistance.
Metadata must be in Extensible Markup Language (XML) format and follow the Federal Geographic Data Committee's (FGDC) endorsed Content Standard for Digital Geospatial Metadata (CSDGM). In the future, the Science Data Catalog will accept metadata that adhere to the formats prescribed by the International Organization for Standardization (ISO) suite of standards (e.g., 19115-1, 19115-2, 19119, 19111). Visit the USGS Data Management Web site for more information about metadata creation.
The SDC harvests metadata records from WAFs, SiteMaps, and ScienceBase weekly. Re-indexing all records takes about a day, so updates generally appear the day after a harvest.
Updates or changes to metadata must be made to the source XML metadata record that is harvested by the SDC. Update your metadata records where they are managed by the program or science center (WAF, SiteMap, or ScienceBase). Every harvest is a fresh harvest, so new metadata records will appear and deleted records will disappear.
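The effect of a fresh harvest can be pictured as a comparison between two complete listings of a source. The following is a minimal sketch under that assumption, not the SDC's actual harvesting code; the function name `diff_harvests` and the record file names are hypothetical.

```python
def diff_harvests(previous_ids, current_ids):
    """Compare two complete harvests and report what changed between them."""
    previous, current = set(previous_ids), set(current_ids)
    return {
        "added": sorted(current - previous),    # new records that will appear
        "removed": sorted(previous - current),  # deleted records that will disappear
        "kept": sorted(previous & current),     # records present in both harvests
    }


# Hypothetical record names found in the source on two consecutive harvests.
last_week = ["rec-001.xml", "rec-002.xml", "rec-003.xml"]
this_week = ["rec-002.xml", "rec-003.xml", "rec-004.xml"]

changes = diff_harvests(last_week, this_week)
print(changes)
```

Because nothing is carried over from one harvest to the next in this model, removing a record from the source is enough to remove it from the results.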
myReports allows users to view the status of their metadata harvests. myReports provides useful information about the harvesting success of submitted records, showing invalid web links and other issues to help data providers improve data quality.
The total number of records harvested into the Science Data Catalog is displayed for each data contributor at the bottom left of each data contributor view. To view the individual records that have been harvested, click the numerical value next to "Total Records Harvested" or select the link to view Science Data Catalog Results.
The harvested results are displayed on the search page where users are able to view more information and other related resources about the records.
The total number of records with failed links is displayed for each data contributor at the bottom right of each data contributor view. To view the individual records, each with a message describing what failed, click the numerical value next to "Total Records with Failed Links" or the Harvest History link.
The Harvest History tab displays a harvest summary for a specific time stamp. To view your failed links, click the Failed Records tab.
For example, for the Upper Midwest Environmental Sciences Center data contributor, the Failed Records tab displays a list of failed records along with the reasons they failed the validation process. Bad Links: If the field displays an associated link, the labeled hyperlink is incorrect or not currently active. If the field displays only "no onlink tag", no online linkage (onlink) element was found in the metadata record.
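Data providers may want to screen a record for these two conditions before the next harvest. The sketch below assumes a CSDGM record that stores its online linkages in `onlink` elements. The function `find_bad_links`, the sample record, and the stand-in link checker are hypothetical and do not reproduce the SDC's own validation.

```python
import xml.etree.ElementTree as ET

# Made-up CSDGM fragment for illustration only.
SAMPLE_RECORD = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <title>Example dataset</title>
        <onlink>https://example.org/data/ok.zip</onlink>
        <onlink>https://example.org/data/missing.zip</onlink>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""


def find_bad_links(xml_text, link_is_live):
    """Return the links that fail the check, or a marker when no onlink exists."""
    root = ET.fromstring(xml_text)
    links = [el.text.strip() for el in root.iter("onlink") if el.text and el.text.strip()]
    if not links:
        return ["no onlink tag"]
    return [url for url in links if not link_is_live(url)]


def fake_checker(url):
    """Stand-in for a real HTTP check so the example runs offline."""
    return not url.endswith("missing.zip")


bad = find_bad_links(SAMPLE_RECORD, fake_checker)
none_found = find_bad_links("<metadata><idinfo/></metadata>", fake_checker)
print(bad)
print(none_found)
```

In practice `link_is_live` would issue a real request against each URL; it is passed in as a parameter here so the parsing logic can be tried without network access.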
To download the failed records as a comma-separated values (CSV) file, click the download icon to start the file transfer. The file contains columns for the date processed, the source file location, and the badlink field, with one record per row.
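Once downloaded, the file can be summarized with a few lines of code. The sketch below groups failed links by source file. The header names `date_processed`, `source_file`, and `badlink` and all sample rows are assumptions made for illustration; check the header row of your own download and adjust the names accordingly.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a downloaded file; header names are assumed.
SAMPLE_CSV = """date_processed,source_file,badlink
2020-01-06,https://example.org/waf/rec-001.xml,https://example.org/data/old.zip
2020-01-06,https://example.org/waf/rec-001.xml,https://example.org/data/gone.zip
2020-01-06,https://example.org/waf/rec-002.xml,no onlink tag
"""


def group_bad_links(csv_file):
    """Map each source metadata file to the list of problems reported for it."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["source_file"]].append(row["badlink"])
    return dict(grouped)


report = group_bad_links(io.StringIO(SAMPLE_CSV))
for source, problems in report.items():
    print(source, len(problems), problems)
```

To read a real download, replace `io.StringIO(SAMPLE_CSV)` with `open(path, newline="")`.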
How to Search for USGS Data
The USGS Science Data Catalog provides three options to search for data. Users can search using keywords or by location and browse by science topic, data provider, or USGS Mission Area. Users can also save URL search strings to reproduce their query terms.
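A saved URL reproduces a search because the query terms are encoded in the URL itself. The sketch below shows the general idea with Python's standard library. The base URL and the parameter names `q` and `location` are hypothetical and are not the Catalog's real URL scheme; copy the actual URL from your browser when saving a search.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL, not the Catalog's real search address.
BASE_URL = "https://example.org/catalog/search"


def build_search_url(terms, location=None):
    """Encode search terms (and an optional location) into a shareable URL."""
    params = [("q", term) for term in terms]
    if location:
        params.append(("location", location))
    return BASE_URL + "?" + urlencode(params)


def read_search_url(url):
    """Recover the query terms from a saved URL."""
    return parse_qs(urlsplit(url).query)


url = build_search_url(["bathymetry", "Monterey Bay"], location="California")
restored = read_search_url(url)
print(url)
print(restored)
```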
The Science Data Catalog search tab begins with a display of all datasets that are described in the Catalog.
Ordering, Sorting & Display of Results
Change the number of results per page and sort search results by specific type.
Types of Data Available
All records contain links, shown as colored button(s) beneath the description of the dataset. Depending on the record, a link may lead to a downloadable dataset, a collection of datasets, or data served through APIs and map services.
The Catalog only searches for the exact search terms entered. For example, the term "non-native plants" will not bring up records that have similar search terms such as "non-indigenous plant" or "exotic plant." We suggest running searches that include known synonyms to ensure more relevant results.
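The toy example below illustrates why synonyms matter when only exact terms are matched. It models exact matching as a case-insensitive substring test over made-up record titles, which is an assumption for illustration and not a description of the Catalog's search engine.

```python
# Made-up record titles for illustration.
RECORDS = {
    "rec-1": "Survey of non-native plants in wetlands",
    "rec-2": "Mapping exotic plant species along rivers",
    "rec-3": "Non-indigenous plant monitoring protocol",
    "rec-4": "Native plant restoration plots",
}


def exact_search(term, records):
    """Return ids of records whose text contains the term exactly as typed."""
    needle = term.lower()
    return sorted(rid for rid, text in records.items() if needle in text.lower())


def search_with_synonyms(terms, records):
    """Run one exact search per synonym and merge the hits."""
    hits = set()
    for term in terms:
        hits.update(exact_search(term, records))
    return sorted(hits)


single = exact_search("non-native plants", RECORDS)
combined = search_with_synonyms(
    ["non-native plants", "non-indigenous plant", "exotic plant"], RECORDS
)
print(single)
print(combined)
```

The single term finds one record, while the three synonyms together find three.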
Use the search box to type a specific search term (e.g. bathymetry). The results will include every record in the Catalog that has the term 'bathymetry' in metadata fields such as title, abstract, and keywords.
From your set of results, you can:
- Browse through the listing for data.
- Use a Filter in the left column to narrow the search.
- Add an additional search term to 'bathymetry' in the text search box.
- Use the map to search for 'bathymetry' in a specific location.
- Delete the 'bathymetry' search term and start a brand new search.
Adding an Additional Query to Text Searches
Add an additional search term (e.g. Monterey Bay) and the results will narrow to include results with terms, 'bathymetry' AND 'Monterey Bay'.
The final results might be narrowed down as a result of including both terms in the search. If you are looking for a very specific concept, use "quotation marks" around the phrase to search for the exact wording.
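The narrowing effect of additional terms and of quoted phrases can be sketched as follows. The query parsing shown here (quoted phrases kept whole, other words treated as separate terms, all terms combined with AND) is an assumption made for illustration, and the record titles are made up.

```python
import re

# Made-up record titles for illustration.
RECORDS = {
    "rec-1": "Multibeam bathymetry of Monterey Bay",
    "rec-2": "Bathymetry of Lake Tahoe",
    "rec-3": "Monterey Bay kelp forest surveys",
    "rec-4": "Bathymetry of San Francisco Bay near Monterey",
}


def parse_query(query):
    """Split a query into terms, keeping "quoted phrases" together."""
    return [phrase or word for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query)]


def and_search(query, records):
    """Return ids of records that contain every term in the query."""
    terms = [term.lower() for term in parse_query(query)]
    return sorted(
        rid for rid, text in records.items()
        if all(term in text.lower() for term in terms)
    )


broad = and_search("bathymetry", RECORDS)
loose = and_search("bathymetry Monterey Bay", RECORDS)
strict = and_search('bathymetry "Monterey Bay"', RECORDS)
print(broad, loose, strict)
```

In this toy data, the quoted phrase excludes the record that mentions 'Monterey' and 'Bay' separately.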
There are two ways to conduct a geospatial search on datasets:
- Geospatial keywords that describe the area of study (e.g. Alaska, Bakken Formation, Acadia National Park, Gulf of Mexico).
- Bounding box coordinates that give the limits of coverage of a dataset.
Geospatial keyword searches work best when searching for data from a very specific location. Research studies that are localized are usually described by metadata that contain those location terms in the Title, Abstract, and Keyword fields of the metadata. Search for geospatial keywords with the text search query.
Bounding coordinates ("limit search by location")
Search against the bounding coordinates when looking for datasets for a more general geographic area, and not a very specific named place.
Draw a bounding box on the map to specify an area of interest, or use the dropdown menus to select predefined areas for U.S. states, countries, and major water bodies.
Limit Search By Location Examples:
Example 1 Example 2
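A bounding-coordinate search can be thought of as a rectangle overlap test between the search box and each dataset's bounding box. The sketch below assumes an intersects test on decimal-degree coordinates and ignores boxes that cross the antimeridian. The class name, the coordinates, and the matching rule are illustrative assumptions, not the Catalog's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    west: float
    east: float
    south: float
    north: float

    def intersects(self, other):
        """True when the two rectangles overlap or touch."""
        return (
            self.west <= other.east and other.west <= self.east
            and self.south <= other.north and other.south <= self.north
        )


# Approximate, illustrative coordinates only.
DATASETS = {
    "monterey-bay-bathymetry": BoundingBox(-122.1, -121.7, 36.5, 37.0),
    "alaska-glacier-survey": BoundingBox(-170.0, -130.0, 51.0, 72.0),
}

search_area = BoundingBox(-124.5, -114.1, 32.5, 42.0)  # roughly California
matches = sorted(name for name, box in DATASETS.items() if box.intersects(search_area))
print(matches)
```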
How to Contribute to the Science Data Catalog
The decision to contribute USGS metadata records to the Science Data Catalog begins at the science center or program level. Follow the steps below to help ensure that metadata meet all requirements and configurations for inclusion into the Science Data Catalog.
The metadata record(s):
- Must be in Extensible Markup Language (XML) file format to be machine readable (i.e. not a Word document or PDF).
- Must follow the Federal Geographic Data Committee's (FGDC) endorsed Content Standard for Digital Geospatial Metadata (CSDGM).
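Before submitting records, a quick local pre-check can catch the most basic problems. The sketch below confirms that a file is well-formed XML with a CSDGM-style `metadata` root and a few commonly required elements. The function `precheck_metadata` and the short list of paths are assumptions chosen for illustration; passing this check does not mean a record is fully CSDGM compliant or will pass the Catalog's validation.

```python
import xml.etree.ElementTree as ET

# A small, illustrative subset of CSDGM elements; not the full standard.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "metainfo/metstdn",
]


def precheck_metadata(xml_text):
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    if root.tag != "metadata":
        problems.append(f"root element is <{root.tag}>, expected <metadata>")
    for path in REQUIRED_PATHS:
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            problems.append(f"missing or empty element: {path}")
    return problems


GOOD = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Example dataset</title></citeinfo></citation>
    <descript><abstract>Example abstract.</abstract></descript>
  </idinfo>
  <metainfo><metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn></metainfo>
</metadata>"""

print(precheck_metadata(GOOD))
print(precheck_metadata("this is not XML"))
```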
The USGS Data Management Website provides non-prescriptive data management guidance, best practices, tools, and resources in one convenient location. Learn to create metadata resources to contribute to the USGS Science Data Catalog. To learn more about metadata standards and metadata creation, see the About page.
The Science Data Catalog harvests metadata records from online sources. Metadata intended for inclusion in the Catalog should be organized in a single online source for harvest. Centers and programs can provide their metadata collections to the Catalog from their own public servers, or leverage a remote data management platform to manage and serve their records.
The Science Data Catalog can harvest metadata record(s) from three types of sources:
- Web Accessible Folder (WAF): A URL address that is an online public folder containing the metadata record(s) for harvest. Use this option if all metadata records are stored in one online location.
- Site Map XML: A URL address of a Site Map XML file that lists the URLs of metadata records hosted from multiple online locations. Use this option if the metadata are stored in different online locations or in a metadata catalog.
- ScienceBase: This is a collaborative data management platform for USGS and its partners which can host XML metadata on ScienceBase item pages. First contact email@example.com to get started.
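For the Site Map XML option, the file is a list of record URLs in the sitemaps.org format. The sketch below builds and reads such a file with Python's standard library. The record URLs are hypothetical, and any additional requirements the Catalog places on sitemap files are not covered here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(record_urls):
    """Serialize a list of metadata record URLs as a sitemap document."""
    ET.register_namespace("", SITEMAP_NS)  # write the namespace as the default xmlns
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for record_url in record_urls:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = record_url
    return ET.tostring(urlset, encoding="unicode")


def read_sitemap(sitemap_xml):
    """Return the record URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]


# Hypothetical records hosted in two different online locations.
urls = [
    "https://example.org/center-a/metadata/rec-001.xml",
    "https://example.net/program-b/records/rec-002.xml",
]
sitemap_xml = build_sitemap(urls)
print(sitemap_xml)
print(read_sitemap(sitemap_xml))
```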
Once your FGDC CSDGM or ISO XML metadata are hosted in a WAF or a Site Map, contact firstname.lastname@example.org to complete the connection to the SDC. For ScienceBase, the ScienceBase Data Release Team will work with you to establish the connection upon release of your data in ScienceBase (to ensure that incomplete or unreleased records are not passed to the SDC). Please refer to the ScienceBase Data Release page for instructions.