<?xml version='1.0' encoding='UTF-8'?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Travis W Nauman</origin>
        <origin>Michael C Duniway</origin>
        <pubdate>2020</pubdate>
        <title>Predictive soil property map: Soil pH</title>
        <geoform>Raster data</geoform>
        <pubinfo>
          <pubplace>Flagstaff, AZ</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <onlink>https://doi.org/10.5066/P9SK0DO2</onlink>
        <lworkcit>
          <citeinfo>
            <origin>Travis W Nauman</origin>
            <origin>Michael C Duniway</origin>
            <pubdate>2020</pubdate>
            <title>A hybrid approach for predictive soil property mapping using conventional soil survey data</title>
            <geoform>journal manuscript</geoform>
            <pubinfo>
              <pubplace>Wiley Online Library</pubplace>
              <publish>Soil Science Society of America Journal</publish>
            </pubinfo>
            <onlink>https://doi.org/10.1002/saj2.20080</onlink>
          </citeinfo>
        </lworkcit>
      </citeinfo>
    </citation>
    <descript>
      <abstract>These data were compiled to demonstrate new predictive mapping approaches and provide comprehensive gridded 30-meter resolution soil property maps for the Colorado River Basin above Hoover Dam. Random forest models related environmental raster layers representing soil forming factors to field samples to render predictive maps that interpolate between sample locations. Maps represent soil pH, texture fractions (sand, silt, clay, fine sand, very fine sand), rock, electrical conductivity (ec), gypsum, CaCO3, sodium adsorption ratio (sar), available water capacity (awc), bulk density (dbovendry), erodibility (kwfact), and organic matter (om) at 7 depths (0, 5, 15, 30, 60, 100, and 200 cm), as well as depth to restrictive layer (resdept) and surface rock size and cover. Accuracy and error estimated using 10-fold cross validation indicated a range of model performance, with the coefficient of determination (R2) ranging from 0.20 to 0.76, with a mean of 0.52 and a standard deviation of 0.12. Models of pH, om, and ec had the best accuracy (R2 &gt; 0.6). Most texture fractions, CaCO3, and sar models had R2 values from 0.5-0.6. Models of kwfact, dbovendry, resdept, rock, gypsum, and awc had R2 values from 0.4-0.5, except near-surface models, which tended to perform better. Very fine sand models and 200 cm estimates for other properties generally performed poorly (R2 from 0.2-0.4), and sample size for the 200 cm models was too low for reliable model building. More than 90% of the soils data used were sampled since 2000, but some older samples are included. Uncertainty estimates were also developed by creating relative prediction intervals, which allow end users to evaluate uncertainty easily.</abstract>
      <purpose>The primary purpose of these data was to demonstrate a new workflow for creating soil property maps across the United States. However, some of these maps have potential to assist 1) land managers with decision making, 2) earth system modeling applications, and 3) future sampling to improve soil survey and future predictive mapping products. Soil properties were chosen to address relevant soils data needs such as concerns about erosion, salinity, and dust emissions. Uncertainty was characterized for every pixel with 95% prediction interval bounds and a relative prediction interval (RPI) metric that standardizes prediction intervals to the original training sample distribution for each model. The RPI values are easily interpretable: values below 0.5 indicate a low likelihood of error being higher than the global root mean squared error, and values exceeding 1.0 indicate a greater likelihood of error beyond global error summaries. In short, RPI values &lt; 0.5 are consistently reliable; values up to 0.9 are probably still reliable but likely contain some error; and values close to and above 1.0 should be regarded with suspicion and perhaps trigger field evaluation of estimates before use.</purpose>
      <supplinf>These data represent a wide variety of soil properties mapped for the Colorado River Basin above Lake Mead. The maps are based on models that vary in accuracy, and users should consult the validation error estimates before use. Maps of model uncertainty are also provided to help users determine whether maps are likely to be accurate for a given area of interest. Models for properties at the 200 cm depth all have very small sample sizes, are likely to be inaccurate, and should not be used without specific field validation; they were developed for research purposes. There may be error associated with the original input data that is not fully expressed in the validation metrics. If a high degree of accuracy is needed, it is advisable to conduct field validation sampling to check predictions before using these data, particularly when validation metrics are poor. Comprehensive validation statistics for all models are available online at https://github.com/usgs/Predictive-Soil-Mapping/blob/master/SoilSurvReconstrProperties/PerformanceSummarybyModel_repository.xlsx. More details describing the methods and analysis are provided in Nauman and Duniway (2020) and the project GitHub repository at https://github.com/usgs/Predictive-Soil-Mapping/tree/master/SoilSurvReconstrProperties. Data users should read each metadata record and acquire the manuscript identified as the 'Larger Work Citation', or manuscripts identified as 'Cross Reference', to have a complete understanding of how these data were created and used. The data are specific to the uses identified above, as described in the 'Larger Work Citation', and any other use of these data would be inappropriate.</supplinf>
    </descript>
    <timeperd>
      <timeinfo>
        <sngdate>
          <caldate>2020</caldate>
        </sngdate>
      </timeinfo>
      <current>publication date</current>
    </timeperd>
    <status>
      <progress>Complete</progress>
      <update>None planned</update>
    </status>
    <spdom>
      <bounding>
        <westbc>-116.0000</westbc>
        <eastbc>-105.2000</eastbc>
        <northbc>44.0000</northbc>
        <southbc>33.3000</southbc>
      </bounding>
    </spdom>
    <keywords>
      <theme>
        <themekt>USGS Thesaurus</themekt>
        <themekey>maps and atlases</themekey>
        <themekey>soil sciences</themekey>
      </theme>
      <theme>
        <themekt>ISO 19115 Topic Categories</themekt>
        <themekey>geoscientificInformation</themekey>
      </theme>
      <theme>
        <themekt>USGS Biocomplexity Thesaurus</themekt>
        <themekey>calcium carbonate</themekey>
        <themekey>environmental conditions</themekey>
        <themekey>organic matter</themekey>
        <themekey>soil conductivity</themekey>
        <themekey>soil density</themekey>
        <themekey>soil properties</themekey>
        <themekey>soil texture</themekey>
        <themekey>soils</themekey>
      </theme>
      <theme>
        <themekt>None</themekt>
        <themekey>accuracy and error estimated</themekey>
        <themekey>available water capacity</themekey>
        <themekey>bulk density</themekey>
        <themekey>digital soil mapping</themekey>
        <themekey>electrical conductivity</themekey>
        <themekey>environmental raster layers</themekey>
        <themekey>erodibility</themekey>
        <themekey>gypsum</themekey>
        <themekey>interpolate</themekey>
        <themekey>predictive modeling</themekey>
        <themekey>machine learning</themekey>
        <themekey>predictive mapping</themekey>
        <themekey>predictive maps</themekey>
        <themekey>random forest models</themekey>
        <themekey>random forests</themekey>
        <themekey>restrictive layer</themekey>
        <themekey>rock</themekey>
        <themekey>sodium adsorption ratio</themekey>
        <themekey>soil forming factors</themekey>
        <themekey>soil pH</themekey>
        <themekey>soil property maps</themekey>
        <themekey>surface rock cover</themekey>
        <themekey>surface rock size</themekey>
        <themekey>texture fractions</themekey>
        <themekey>uncertainty</themekey>
      </theme>
      <theme>
        <themekt>USGS Metadata Identifier</themekt>
        <themekey>USGS:5e90b3b482ce172707ed76f7</themekey>
      </theme>
      <place>
        <placekt>Geographic Names Information System (GNIS)</placekt>
        <placekey>Arizona</placekey>
        <placekey>Colorado</placekey>
        <placekey>Nevada</placekey>
        <placekey>New Mexico</placekey>
        <placekey>Utah</placekey>
        <placekey>Wyoming</placekey>
        <placekey>Colorado River</placekey>
        <placekey>Hoover Dam</placekey>
      </place>
      <place>
        <placekt>None</placekt>
        <placekey>Colorado River Basin</placekey>
        <placekey>Colorado River Basin above Hoover Dam</placekey>
      </place>
    </keywords>
    <accconst>none</accconst>
    <useconst>none</useconst>
    <ptcontac>
      <cntinfo>
        <cntperp>
          <cntper>Travis W Nauman</cntper>
          <cntorg>U.S. Geological Survey</cntorg>
        </cntperp>
        <cntpos>Soil Scientist</cntpos>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>2290 S West Resource Blvd</address>
          <city>Moab</city>
          <state>UT</state>
          <postal>84532</postal>
          <country>US</country>
        </cntaddr>
        <cntvoice>928-556-7537</cntvoice>
        <cntemail>tnauman@usgs.gov</cntemail>
      </cntinfo>
    </ptcontac>
    <datacred>Production of these data was supported by the U.S. Department of the Interior Bureau of Land Management and the U.S. Geological Survey Ecosystems Mission Area. We acknowledge the National Cooperative Soil Survey program and its managing agency, the U.S. Department of Agriculture - Natural Resources Conservation Service (USDA-NRCS), for the incredible resources they have made available in the National Characterization Database (https://ncsslabdatamart.sc.egov.usda.gov/) and soil survey program. The databases are truly an invaluable resource. We also thank Dr. Skye Wills and Dr. Henry Ferguson for assistance in acquiring and preparing NASIS morphology field soil observations.</datacred>
    <crossref>
      <citeinfo>
        <origin>Stephen E. Fick</origin>
        <origin>Robert J. Hijmans</origin>
        <pubdate>2017</pubdate>
        <title>WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas</title>
        <pubinfo>
          <pubplace>Wiley Online Library</pubplace>
          <publish>Royal Meteorological Society</publish>
        </pubinfo>
        <onlink>https://doi.org/10.1002/joc.5086</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Adam M. Wilson</origin>
        <origin>Walter Jetz</origin>
        <pubdate>2016</pubdate>
        <title>Remotely Sensed High-Resolution Global Cloud Dynamics for Predicting Ecosystem and Biodiversity Distributions</title>
        <pubinfo>
          <pubplace>PLOS Biology Online</pubplace>
          <publish>PLOS Biology</publish>
        </pubinfo>
        <onlink>https://doi.org/10.1371/journal.pbio.1002415</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Zhengming Wan</origin>
        <pubdate>2013</pubdate>
        <title>Collection-6 MODIS Land Surface Temperature Products Users' Guide</title>
        <pubinfo>
          <pubplace>Sioux Falls, SD</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <onlink>https://lpdaac.usgs.gov/documents/118/MOD11_User_Guide_V6.pdf</onlink>
      </citeinfo>
    </crossref>
  </idinfo>
  <dataqual>
    <attracc>
      <attraccr>No formal attribute accuracy tests were conducted</attraccr>
    </attracc>
    <logic>No formal logical accuracy tests were conducted</logic>
    <complete>Data set is considered complete for the information presented, as described in the abstract. Users are advised to read the rest of the metadata record carefully for additional details.</complete>
    <posacc>
      <horizpa>
        <horizpar>No formal positional accuracy tests were conducted</horizpar>
      </horizpa>
      <vertacc>
        <vertaccr>No formal positional accuracy tests were conducted</vertaccr>
      </vertacc>
    </posacc>
    <lineage>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Google Earth Engine via U.S. Geological Survey</origin>
            <pubdate>2017</pubdate>
            <title>Landsat 8 Top-of-atmosphere reflectance archive (Tier 1)</title>
            <geoform>Geotiff raster data</geoform>
            <pubinfo>
              <pubplace>Reston, VA</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C01_T1_TOA</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20130701</begdate>
              <enddate>20131030</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>Landsat 8 TOA Tier 1</srccitea>
        <srccontr>Derived band ratios were used in predicting soil maps.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>PRISM Climate Group</origin>
            <pubdate>2010</pubdate>
            <title>PRISM climate dataset - 30-yr Normals, Average monthly and annual precipitation and temperature conditions</title>
            <geoform>BIL files</geoform>
            <pubinfo>
              <pubplace>Corvallis, OR</pubplace>
              <publish>Oregon State University</publish>
            </pubinfo>
            <onlink>http://prism.oregonstate.edu/normals/</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>1981</begdate>
              <enddate>2010</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>ground condition</srccurr>
        </srctime>
        <srccitea>30-yr Normals</srccitea>
        <srccontr>These climate data were used to create covariates for the modeling process.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2017</pubdate>
            <title>Elevation Derivatives for National Applications (EDNA) Seamless Three-Dimensional Hydrologic Database</title>
            <geoform>ESRI raster data</geoform>
            <pubinfo>
              <pubplace>Reston, VA</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/F7TD9VTQ</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>1923</begdate>
              <enddate>2014</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>30-meter EDNA DEM</srccitea>
        <srccontr>These DEM data were used to create various topographic datasets used as model inputs and also in making soil maps.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2014</pubdate>
            <title>National Land Cover Database (NLCD) 2011 Land Cover Conterminous United States, Edition 4.0</title>
            <geoform>raster digital data</geoform>
            <pubinfo>
              <pubplace>Sioux Falls, SD</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/P97S2IID</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>2004</begdate>
              <enddate>2011</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>NLCD 2011</srccitea>
        <srccontr>These data were used in prediction of soil maps and also used for masking out water pixels.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey GAP Analysis Program</origin>
            <pubdate>2016</pubdate>
            <title>GAP/LANDFIRE National Terrestrial Ecosystems 2011, Version 3</title>
            <geoform>IGE raster data file</geoform>
            <pubinfo>
              <pubplace>Boise, ID</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/F7ZS2TM0</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>2010</begdate>
              <enddate>2011</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>ground condition</srccurr>
        </srctime>
        <srccitea>GAP</srccitea>
        <srccontr>Used in prediction of soil maps.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. National Cooperative Soil Survey</origin>
            <pubdate>2018</pubdate>
            <title>Soil Characterization Database (or Gridded SSURGO (gSSURGO) Database) (https://data.nal.usda.gov/dataset/national-cooperative-soil-characterization-database)</title>
            <geoform>File Geodatabase</geoform>
            <pubinfo>
              <pubplace>Lincoln, NE</pubplace>
              <publish>U.S. Department of Agriculture - Natural Resources Conservation Service</publish>
            </pubinfo>
            <onlink>https://nrcs.app.box.com/v/soils/folder/94124173798</onlink>
          </citeinfo>
        </srccite>
        <typesrc>spatial relational database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20170101</begdate>
              <enddate>20170717</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>SCD</srccitea>
        <srccontr>Field data used to train spatial models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. National Cooperative Soil Survey</origin>
            <pubdate>2017</pubdate>
            <title>National Soil Information System Morphology Pedons</title>
            <geoform>Microsoft Access Database</geoform>
            <pubinfo>
              <pubplace>Lincoln, NE</pubplace>
              <publish>U.S. Department of Agriculture - Natural Resources Conservation Service</publish>
            </pubinfo>
            <onlink>NA, Accessed via personal request, given external hard drive copy.</onlink>
          </citeinfo>
        </srccite>
        <typesrc>spatial relational database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20170717</begdate>
              <enddate>20170718</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>NASIS</srccitea>
        <srccontr>These data were used to help train new soil map models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Robert J. Hijmans, Susan E. Cameron, Juan L. Parra, Peter G. Jones, and Andy Jarvis</origin>
            <pubdate>2017</pubdate>
            <title>Historical climate data: Bioclimatic variables, 30 seconds</title>
            <geoform>Raster data (tif)</geoform>
            <pubinfo>
              <pubplace>WorldClim Online</pubplace>
              <publish>WorldClim</publish>
            </pubinfo>
            <onlink>https://www.worldclim.org/</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>19700101</begdate>
              <enddate>20001231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>BIO2, BIO8, BIO9, BIO10, BIO11, BIO13, BIO14, BIO15, BIO17, BIO18, BIO19</srccitea>
        <srccontr>Used as independent variables for soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Wilson AM, Jetz W</origin>
            <pubdate>2016</pubdate>
            <title>Global 1-km Cloud Cover</title>
            <geoform>Raster data (tif)</geoform>
            <pubinfo>
              <pubplace>EarthEnv Online</pubplace>
              <publish>EarthEnv</publish>
            </pubinfo>
            <onlink>http://www.earthenv.org/cloud</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20160101</begdate>
              <enddate>20161231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>MODIS spatio-temporal cloud cover satellite images: Mean annual, January mean, February mean, March mean, April mean, May mean, June mean, July mean, August mean, September mean, October mean, November mean, December mean, and Seasonality concentration.</srccitea>
        <srccontr>Used to make independent variables for soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Wan, Z., Hook, S., and Hulley, G.</origin>
            <pubdate>2015</pubdate>
            <title>MOD11A2 MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V006</title>
            <geoform>raster geotiff</geoform>
            <pubinfo>
              <pubplace>Sioux Falls, SD</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5067/MODIS/MOD11A2.006</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20000101</begdate>
              <enddate>20151231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>MODIS Land Surface Temperature and Emissivity (LST&amp;E)</srccitea>
        <srccontr>Used to make independent variables for soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Kamel Didan</origin>
            <pubdate>2015</pubdate>
            <title>MOD13Q1 MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V006</title>
            <geoform>raster geotiff</geoform>
            <pubinfo>
              <pubplace>Sioux Falls, SD</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5067/MODIS/MOD13Q1.006</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20000101</begdate>
              <enddate>20151231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Vegetation Indices (MOD13Q1) Version 6</srccitea>
        <srccontr>Used to develop independent variables for soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Kuchler, A.W., Conservation Biology Institute</origin>
            <pubdate>2012</pubdate>
            <title>U.S. Potential Natural Vegetation, Original Kuchler Types, v2.0 (Spatially Adjusted to Correct Geometric Distortions)</title>
            <geoform>Vector data</geoform>
            <pubinfo>
              <pubplace>Data Basin</pubplace>
              <publish>Data Basin</publish>
            </pubinfo>
            <onlink>https://databasin.org/datasets/1c7a301c8e6843f2b4fe63fdb3a9fe39</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital vector data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20120101</begdate>
              <enddate>20121231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>Potential Natural Vegetation</srccitea>
        <srccontr>Used as independent variable for soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2016</pubdate>
            <title>USGS National Hydrography Dataset (NHD) Downloadable Data Collection - National Geospatial Data Asset (NGDA) National Hydrography Dataset (NHD) (https://www.sciencebase.gov/catalog/item/4f5545cce4b018de15819ca9)</title>
            <geoform>Vector data</geoform>
            <pubinfo>
              <pubplace>Rolla, MO and Denver, CO</pubplace>
              <publish>USGS - National Geospatial Technical Operations Center (NGTOC)</publish>
            </pubinfo>
            <onlink>http://prd-tnm.s3-website-us-west-2.amazonaws.com/?prefix=StagedProducts/Hydrography/NHD/National/HighResolution/GDB/</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital vector data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20010101</begdate>
              <enddate>20161231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>NHD Entire Nation, Vertical distance to channel network</srccitea>
        <srccontr>Used to calculate vertical distance to channel network, an independent variable in soil models.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2014</pubdate>
            <title>National Atlas: 100-Meter Resolution Elevation of the Conterminous United States (Entity ID: NT00828)</title>
            <geoform>Raster geotiff</geoform>
            <pubinfo>
              <pubplace>Sioux Falls, SD</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-maps-national-atlas?qt-science_center_objects=0#qt-science_center_objects</onlink>
          </citeinfo>
        </srccite>
        <typesrc>digital raster data</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20140101</begdate>
              <enddate>20141231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>ground condition</srccurr>
        </srctime>
        <srccitea>100 meter DEM</srccitea>
        <srccontr>Used to develop independent variables for soil models.</srccontr>
      </srcinfo>
      <procstep>
        <procdesc>Data processing to create the final soil pH raster datasets: The soil pH maps and validation plots were created with random forest models relating field soil samples to a set of raster environmental covariates representing soil forming factors. Soil data were from the National Soil Characterization Database (SCD) laboratory dataset (https://ncsslabdatamart.sc.egov.usda.gov/, accessed 07/2017), the USDA-NRCS National Soil Information System (NASIS) morphology profiles (accessed 2017), and the 2018 version of the United States gridded Soil Survey Geographic (SSURGO) database. The NASIS profile locations were used to spatially query SSURGO pH estimates, which were then combined with SCD data as training data to regress against environmental rasters. The maps and validation plots provided result from the random forest models for each depth based on these data sources. The scripts that link NASIS profiles to SSURGO are at: https://github.com/usgs/Predictive-Soil-Mapping/tree/master/SoilSurvReconstrProperties/NASISprep. The environmental rasters used as predictors are documented at: https://github.com/usgs/Predictive-Soil-Mapping/tree/master/SoilSurvReconstrProperties/covariates. The R script developing the models is at: https://github.com/usgs/Predictive-Soil-Mapping/blob/master/SoilSurvReconstrProperties/pH/QuantRFmodel_LmAdj_2D_with_pt_extract_Opti_parallel_PI_CV_ph_h2o_SS_NASIS_depthloop_updated.scdastrain.R.</procdesc>
        <procdate>2019</procdate>
      </procstep>
      <procstep>
        <procdesc>Data Quality Assessment and Quality Control (QAQC): All data used for these models were originally collected by other entities and then incorporated into these models. Thus, there may be errors associated with these data sources that are not fully reflected in these models. To evaluate the models themselves, cross validation was used to create robust estimates of error. Cross validation error estimates are reported in the peer-reviewed manuscript and online at https://github.com/usgs/Predictive-Soil-Mapping/blob/master/SoilSurvReconstrProperties/PerformanceSummarybyModel_repository.xlsx. Additionally, uncertainty estimates are included for all predictions from all models as 95% prediction intervals and derived relative prediction intervals. The cross validation steps are included in the R scripts linked to each folder. All processing steps and validation QA/QC were documented in R scripts in the GitHub repository, which was compiled in early 2019.</procdesc>
        <procdate>2019</procdate>
      </procstep>
      <procstep>
        <procdesc>Finalize Data for Dissemination: Data were sent to the Southwest Biological Science Center Data Steward for dissemination and preservation per USGS Data Management Policies SM 502.6, SM 502.7, SM 502.8 and SM 502.9 (1 October 2016).</procdesc>
        <procdate>2020</procdate>
      </procstep>
    </lineage>
  </dataqual>
  <spdoinfo>
    <direct>Raster</direct>
  </spdoinfo>
  <spref>
    <horizsys>
      <planar>
        <mapproj>
          <mapprojn>Albers Conical Equal Area</mapprojn>
          <albers>
            <stdparll>29.5</stdparll>
            <stdparll>45.5</stdparll>
            <longcm>-96.0</longcm>
            <latprjo>23.0</latprjo>
            <feast>0.0</feast>
            <fnorth>0.0</fnorth>
          </albers>
        </mapproj>
        <planci>
          <plance>row and column</plance>
          <coordrep>
            <absres>1.0</absres>
            <ordres>1.0</ordres>
          </coordrep>
          <plandu>METERS</plandu>
        </planci>
      </planar>
      <geodetic>
        <horizdn>D_GRS_1980</horizdn>
        <ellips>GRS_1980</ellips>
        <semiaxis>6378137.0</semiaxis>
        <denflat>298.257222101</denflat>
      </geodetic>
    </horizsys>
  </spref>
  <eainfo>
    <detailed>
      <enttyp>
        <enttypl>Soil pH raster data (28)</enttypl>
        <enttypd>These data include maps and associated validation plots representing soil pH (1:1 method) as defined by the United States National Cooperative Soil Survey program. Files include predictions of pH, uncertainty, and 10-fold cross validation (CV) 1:1 density plots. Predictions were made at 0, 5, 15, 30, 60, 100, and 200 cm depths, and the depth is referenced within each file name (e.g. 'ph1to1h2o_r_0_cm_2D_QRF.tif' is for the 0 cm depth). Prediction maps have filenames that end with '_QRF.tif'. Prediction uncertainty maps end with '_95PI_h.tif' for the upper 95% prediction interval (PI) bound, '_95PI_l.tif' for the lower bound, and '95PI_relwidth.tif' for the relative prediction interval (RPI). The RPI is calculated as the ratio of the 95% PI width to the 95% interquantile width of the original training sample. Values approaching and above one indicate high uncertainty. Values below 0.5 indicate low uncertainty (i.e. less likelihood of error). The file ending in 'CV_plots.tif' is a trellis of validation graphs for all training data used in the model, with cross validation predictions on the y-axis and original measured values on the x-axis. Values in CV plots closer to the black 1:1 line mean that predictions were more accurate. The file ending in '_CV_SCD_plots.tif' is also a CV plot, but it includes only the portions of the training data that were directly measured in a laboratory (i.e. higher quality data). There are also ancillary geographic information system (GIS) files associated with many of the GeoTIFF files, with extensions '.tfw' and '.tif.aux.xml', that are not necessary for viewing files but will facilitate easier viewing in some GIS software.</enttypd>
        <enttypds>Producer defined</enttypds>
      </enttyp>
      <attr>
        <attrlabl>Raster Values</attrlabl>
        <attrdef>A modeled quantity that was determined by calculation.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <udom>Data users should read this metadata record, especially each process step, which provides details and describes the attributes associated with the raster data layers, including attribute values.</udom>
        </attrdomv>
      </attr>
    </detailed>
  </eainfo>
  <distinfo>
    <distrib>
      <cntinfo>
        <cntperp>
          <cntper>U.S. Geological Survey - ScienceBase</cntper>
          <cntorg>U.S. Geological Survey</cntorg>
        </cntperp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>Denver Federal Center, Building 810, Mail Stop 302</address>
          <city>Denver</city>
          <state>CO</state>
          <postal>80225</postal>
          <country>United States</country>
        </cntaddr>
        <cntvoice>1-888-275-8747</cntvoice>
        <cntemail>sciencebase@usgs.gov</cntemail>
      </cntinfo>
    </distrib>
    <distliab>The author(s) of these data request that data users contact them regarding intended use and to assist with understanding limitations and interpretation. Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty.</distliab>
    <techpreq>This zip file contains data in raster TIFF (.tif) format. The user must have software capable of uncompressing the zip file and displaying the raster data sets.</techpreq>
  </distinfo>
  <metainfo>
    <metd>20200827</metd>
    <metc>
      <cntinfo>
        <cntperp>
          <cntper>Travis W Nauman</cntper>
          <cntorg>U.S. Geological Survey</cntorg>
        </cntperp>
        <cntpos>Soil Scientist</cntpos>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>2290 S West Resource Blvd</address>
          <city>Moab</city>
          <state>UT</state>
          <postal>84532</postal>
          <country>US</country>
        </cntaddr>
        <cntvoice>928-556-7537</cntvoice>
        <cntemail>tnauman@usgs.gov</cntemail>
      </cntinfo>
    </metc>
    <metstdn>Content Standard for Digital Geospatial Metadata</metstdn>
    <metstdv>FGDC-STD-001-1998</metstdv>
  </metainfo>
</metadata>
