<?xml version='1.0' encoding='UTF-8'?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Scott W. Anderson</origin>
        <origin>Chris P. Konrad</origin>
        <pubdate>20260219</pubdate>
        <title>Supporting data and model archives for physically-informed models of streamflow correlation for the contiguous United States, 1981-2020</title>
        <geoform>publication</geoform>
        <onlink>https://doi.org/10.5066/P94WBRU1</onlink>
      </citeinfo>
    </citation>
    <descript>
      <abstract>This data release provides an archive of random forest regressions predicting streamflow correlation as a function of climatic similarity over the contiguous United States, based on streamflow information from water years 1981 to 2020. This data release also contains the underlying datasets used to generate those regressions and a series of scripts that facilitate their exploration and use. The scripts expect a specific file and folder structure, which is described in the 'streamflow_correlation_expected_folder_structure.txt' file.

rf_models.zip - contains random forest models for all 135 model nodes, stored as R (.rds) objects readable using the 'readRDS' function in R. Also includes a CSV file defining the training and validation gage pairs used for each model node, and a CSV file giving the coordinates of the central XY point of each model node.

scripts.zip - R scripts designed to guide access and use of the supplied models. Includes an example analysis. Scripts require the datasets stored in the 'supporting datasets' child item arranged in a specific folder structure.

streamflow_correlation_expected_folder_structure.txt - simple text file laying out the folder and file structure that the scripts expect in order to function as written.

Model input data are all stored in a child item of this data release ('Supporting datasets for physically-informed models of streamflow correlation'). These data are required for the scripts presented here to run properly.</abstract>
      <purpose>These data were used to develop statistical/empirical models predicting streamflow correlation between U.S. Geological Survey National Hydrography Dataset (NHD) stream reaches as a function of watershed similarity metrics.</purpose>
    </descript>
    <timeperd>
      <timeinfo>
        <rngdates>
          <begdate>19801001</begdate>
          <enddate>20200930</enddate>
        </rngdates>
      </timeinfo>
      <current>observed</current>
    </timeperd>
    <status>
      <progress>Complete</progress>
      <update>None planned</update>
    </status>
    <spdom>
      <bounding>
        <westbc>-129.9023</westbc>
        <eastbc>-66.0938</eastbc>
        <northbc>49.9512</northbc>
        <southbc>24.5271</southbc>
      </bounding>
    </spdom>
    <keywords>
      <theme>
        <themekt>ISO 19115 Topic Category</themekt>
        <themekey>environment</themekey>
        <themekey>inlandWaters</themekey>
      </theme>
      <theme>
        <themekt>USGS Thesaurus</themekt>
        <themekey>hydrology</themekey>
        <themekey>regression analysis</themekey>
        <themekey>stream discharge</themekey>
        <themekey>streamflow</themekey>
      </theme>
      <theme>
        <themekt>USGS Metadata Identifier</themekt>
        <themekey>USGS:6476765bd34e4e58932da742</themekey>
      </theme>
      <place>
        <placekt>Common geographic areas</placekt>
        <placekey>United States</placekey>
      </place>
    </keywords>
    <accconst>None.  Please see 'Distribution Info' for details.</accconst>
    <useconst>None.  Users are advised to read the dataset's metadata thoroughly to understand appropriate use and data limitations.</useconst>
    <ptcontac>
      <cntinfo>
        <cntperp>
          <cntper>Scott W. Anderson</cntper>
          <cntorg>U.S. Geological Survey, NW-PACIFIC ISLAND REG</cntorg>
        </cntperp>
        <cntpos>Hydrologist</cntpos>
        <cntaddr>
          <addrtype>mailing address</addrtype>
          <address>934 Broadway</address>
          <city>Tacoma</city>
          <state>WA</state>
          <postal>98402</postal>
          <country>US</country>
        </cntaddr>
        <cntvoice>253-552-1633</cntvoice>
        <cntfax>253-552-1580</cntfax>
        <cntemail>swanderson@usgs.gov</cntemail>
      </cntinfo>
    </ptcontac>
    <datacred>This work was funded by the Groundwater and Streamflow Information Program, U.S. Geological Survey.</datacred>
    <native>All processing was done in the R statistical computing language, version 4.0.3. Random forest models were fit using the randomForest R package, version 4.6-14.</native>
  </idinfo>
  <dataqual>
    <attracc>
      <attraccr>NA</attraccr>
    </attracc>
    <logic>NA</logic>
    <complete>NA</complete>
    <posacc>
      <horizpa>
        <horizpar>NA</horizpar>
      </horizpa>
      <vertacc>
        <vertaccr>NA</vertaccr>
      </vertacc>
    </posacc>
    <lineage>
      <procstep>
        <procdesc>See Anderson and Konrad (in prep)</procdesc>
        <procdate>20230601</procdate>
      </procstep>
    </lineage>
  </dataqual>
  <spref>
    <horizsys>
      <planar>
        <mapproj>
          <mapprojn>Albers Conical Equal Area</mapprojn>
          <transmer>
            <sfctrmer>1</sfctrmer>
            <longcm>-96</longcm>
            <latprjo>23</latprjo>
            <feast>0</feast>
            <fnorth>0</fnorth>
          </transmer>
        </mapproj>
        <planci>
          <plance>coordinate pair</plance>
          <coordrep>
            <absres>0.01</absres>
            <ordres>0.01</ordres>
          </coordrep>
          <plandu>meters</plandu>
        </planci>
      </planar>
      <geodetic>
        <horizdn>North American Datum of 1983 (NAD 83)</horizdn>
        <ellips>Geodetic Reference System 1980</ellips>
        <semiaxis>6378137.000000</semiaxis>
        <denflat>298.257222</denflat>
      </geodetic>
    </horizsys>
  </spref>
  <eainfo>
    <overview>
      <eaover>rf_models.zip - contains 135 R (.rds) files with naming template rf_clim_cor_X.rds, where X indicates a unique numeric identifier for a given node. Each .rds file contains an unpadded version of the model (rf_model_logdv) and a padded version (rf_model_logdv_padded), as well as the coordinates defining the spatial center of the model node, in Albers Equal Area coordinates (model_center_x and model_center_y).</eaover>
      <eadetcit>Anderson, S.W. and Konrad, C.P., 2026, Supporting data and model archives for physically-informed models of streamflow correlation for the contiguous United States, 1981-2020, U.S. Geological Survey data release, https://doi.org/10.5066/P94WBRU1.</eadetcit>
    </overview>
  </eainfo>
  <distinfo>
    <distrib>
      <cntinfo>
        <cntperp>
          <cntper>GS ScienceBase</cntper>
          <cntorg>U.S. Geological Survey</cntorg>
        </cntperp>
        <cntaddr>
          <addrtype>mailing address</addrtype>
          <address>Denver Federal Center, Building 810, Mail Stop 302</address>
          <city>Denver</city>
          <state>CO</state>
          <postal>80225</postal>
          <country>United States</country>
        </cntaddr>
        <cntvoice>1-888-275-8747</cntvoice>
        <cntemail>sciencebase@usgs.gov</cntemail>
      </cntinfo>
    </distrib>
    <distliab>Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty.</distliab>
    <stdorder>
      <digform>
        <digtinfo>
          <formname>Digital Data</formname>
        </digtinfo>
        <digtopt>
          <onlinopt>
            <computer>
              <networka>
                <networkr>https://doi.org/10.5066/P94WBRU1</networkr>
              </networka>
            </computer>
          </onlinopt>
        </digtopt>
      </digform>
      <fees>None</fees>
    </stdorder>
  </distinfo>
  <metainfo>
    <metd>20260219</metd>
    <metc>
      <cntinfo>
        <cntperp>
          <cntper>Scott W. Anderson</cntper>
          <cntorg>U.S. Geological Survey, NW-PACIFIC ISLAND REG</cntorg>
        </cntperp>
        <cntpos>Hydrologist</cntpos>
        <cntaddr>
          <addrtype>mailing address</addrtype>
          <address>934 Broadway</address>
          <city>Tacoma</city>
          <state>WA</state>
          <postal>98402</postal>
          <country>US</country>
        </cntaddr>
        <cntvoice>253-552-1633</cntvoice>
        <cntfax>253-552-1580</cntfax>
        <cntemail>swanderson@usgs.gov</cntemail>
      </cntinfo>
    </metc>
    <metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn>
    <metstdv>FGDC-STD-001-1998</metstdv>
  </metainfo>
</metadata>
