<?xml version='1.0' encoding='UTF-8'?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Michael E. Wieczorek</origin>
        <origin>Shannon E. Jackson</origin>
        <origin>Gregory E. Schwarz</origin>
        <pubdate>20230802</pubdate>
        <title>Select Hydrologic Modification Attributes: Major Sites of the National Pollutant Discharge Elimination System (NPDES)</title>
        <geoform>tabular digital data</geoform>
        <pubinfo>
          <pubplace>Reston, Virginia</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <onlink>https://doi.org/10.5066/F7765D7V</onlink>
      </citeinfo>
    </citation>
    <descript>
      <abstract>This tabular data set represents the number and density of major National Pollutant Discharge Elimination System (NPDES) sites, as defined by the U.S. Environmental Protection Agency (EPA), compiled for two spatial components of the NHDPlus version 2 data suite (NHDPlusV2) for the conterminous United States: 1) number and density of major sites per individual reach catchment and 2) number and density of major sites accumulated upstream through the river network. This data set can be linked to the NHDPlus version 2 data suite by the unique identifier COMID. The source data were acquired by James Falcone (USGS, written commun., 2010) from the EPA. Reach catchment information characterizes data at the local scale. Reach catchments accumulated upstream through the river network characterize cumulative upstream conditions. Network-accumulated values are computed using two methods: 1) divergence-routed and 2) total cumulative drainage area. Both approaches use a modified routing database to navigate the NHDPlus reach network and aggregate (accumulate) the metrics derived at the reach catchment scale (Schwarz and Wieczorek, 2018).</abstract>
      <purpose>This data set was created by the U.S. Geological Survey's (USGS) National Water-Quality Assessment Project (NAWQA), which is part of the USGS National Water Quality Program (NWQP). This effort was undertaken to estimate the number of major NPDES sites for NHDPlusV2 flowline catchments and upstream river networks to support statistical analysis, map display, and model parameterization.</purpose>
      <supplinf>The data processed here use a modified routing database for the NHDPlusV2 flowline network (ENHDPlusV2_us, Schwarz and Wieczorek, 2018). The NHDPlusV2 flowline network serves as the spatial infrastructure for many water-quality modeling efforts included under NAWQA Cycle 3 status and trend assessments, including the SPARROW (SPAtially Referenced Regressions On Watershed attributes) model (Rowe and others, 2009). The NHDPlusV2 flowline network is a publicly available digital geospatial framework (database) that depicts the network of stream segments and their catchments within the conterminous United States (see Moore and Dewald, 2016 for more detail). The NHDPlusV2 data set includes catchments for each reach covering the conterminous U.S., including catchment boundaries that extend into neighboring Canada and Mexico. These catchments are used as the spatial units for zonal statistics. A statistic (summation, average, minimum, or maximum) is calculated for each zone (catchment), based on values from the source ancillary data set (see Process_Description for more details). These output values are used as the source input data for an accumulation program, which uses a routing database to allocate and accumulate landscape metrics. This database (ENHDPlusV2_us) is based on a topologically reconditioned version of the NHDPlusV2 hydrography network and is used for routing purposes only. No cartographic changes were made to the original NHDPlusV2 in either the flowline or catchment line work.</supplinf>
    </descript>
    <timeperd>
      <timeinfo>
        <sngdate>
          <caldate>2006</caldate>
          <time>unknown</time>
        </sngdate>
      </timeinfo>
      <current>acquisition date</current>
    </timeperd>
    <status>
      <progress>Complete</progress>
      <update>As needed</update>
    </status>
    <spdom>
      <bounding>
        <westbc>-127.910792</westbc>
        <eastbc>-65.327751</eastbc>
        <northbc>51.657387</northbc>
        <southbc>23.243486</southbc>
      </bounding>
      <descgeog>Conterminous United States</descgeog>
    </spdom>
    <keywords>
      <theme>
        <themekt>None</themekt>
        <themekey>National Pollutant Discharge Elimination System</themekey>
        <themekey>SPARROW</themekey>
        <themekey>NPDES</themekey>
        <themekey>NHDPlus</themekey>
        <themekey>Catchment</themekey>
        <themekey>NAWQA</themekey>
      </theme>
      <theme>
        <themekt>USGS Metadata Identifier</themekt>
        <themekey>USGS:57c9d89ce4b0f2f0cec192da</themekey>
      </theme>
      <place>
        <placekt>None</placekt>
        <placekey>Conterminous United States</placekey>
      </place>
    </keywords>
    <accconst>none</accconst>
    <useconst>Use caution and read this document before using these data to estimate smaller areas. These data should not be used for site-specific evaluation, surveying, or engineering purposes, nor should they be used beyond the limits of the source data's scale. Use of these data is considered acceptance of their limitations and an acknowledgment that the user has read and understood this metadata before using the data in any form.</useconst>
    <ptcontac>
      <cntinfo>
        <cntperp>
          <cntper>Michael E. Wieczorek</cntper>
          <cntorg>US Geological Survey, Maryland Water Science Center</cntorg>
        </cntperp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>5522 Research Park Drive</address>
          <city>Baltimore</city>
          <state>MD</state>
          <postal>21228</postal>
          <country>USA</country>
        </cntaddr>
        <cntvoice>443-498-5550</cntvoice>
        <cntemail>mewieczo@usgs.gov</cntemail>
      </cntinfo>
    </ptcontac>
    <datacred>We would like to acknowledge the following USGS employees. The original conception and rationale for creating these data came from Robert Gilliom and Daren Carlisle. We also acknowledge Andrew LaMotte for helping to process and format much of these data, as well as the helpful colleague reviews provided by Alison Appling and Daren Carlisle. We would also like to thank Dave Wolock, Curtis Price, Roland Viger, and Drew Ignazio for their invaluable input.</datacred>
    <crossref>
      <citeinfo>
        <origin>Gary L. Rowe</origin>
        <origin>Kenneth Belitz</origin>
        <origin>Hedeff I. Essaid</origin>
        <origin>Robert J. Gilliom</origin>
        <origin>Pixie A. Hamilton</origin>
        <origin>Anne B. Hoos</origin>
        <origin>Dennis D. Lynch</origin>
        <origin>Mark D. Munn</origin>
        <origin>David W. Wolock</origin>
        <pubdate>2009</pubdate>
        <title>Design of Cycle 3 of the National Water-Quality Assessment Program, 2013–2023: Part 1: Framework of Water-Quality Issues and Potential Approaches</title>
        <pubinfo>
          <pubplace>Reston, Virginia</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <onlink>http://pubs.usgs.gov/of/2009/1296/pdf/OF09-1296.pdf</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Richard B. Moore</origin>
        <origin>Thomas G. Dewald</origin>
        <pubdate>2016</pubdate>
        <title>The Road to NHDPlus-Advancements in Digital Stream Networks and Associated Catchments</title>
        <pubinfo>
          <pubplace>American Water Resources Association</pubplace>
          <publish>(JAWRA) 1-10. DOI: 10.1111/1752-1688.12389</publish>
        </pubinfo>
        <onlink>http://dx.doi.org/10.1111/1752-1688.12389</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>James A. Falcone</origin>
        <origin>Daren M. Carlisle</origin>
        <origin>David M. Wolock</origin>
        <origin>Michael R. Meador</origin>
        <pubdate>2010</pubdate>
        <title>GAGESII: a stream gage database for evaluating natural and altered flow conditions in the conterminous United States</title>
        <pubinfo>
          <pubplace>Ecology 91 (2), 621</pubplace>
          <publish>Data Paper in Ecological Archives E091-045-D1</publish>
        </pubinfo>
        <onlink>http://esapubs.org/Archive/ecol/E091/045/metadata.htm</onlink>
      </citeinfo>
    </crossref>
  </idinfo>
  <dataqual>
    <attracc>
      <attraccr>Basin metrics were compared to values for the same variables available in the GAGESII data set (Falcone and others, 2010). The following steps were performed before making the comparisons:

1) The GAGESII point file with the most current COMID associations was joined to the routing database file, which contains accumulated basin areas (Schwarz and Wieczorek, 2018).
2) Only those GAGESII points with a basin area within 1% of the accumulated basin area were processed and used for comparisons. This resulted in 4,713 locations.
3) Using the COMIDs for those GAGESII sites, the associated records in the accumulation files were pulled and used in one-to-one comparison plots.

Results demonstrated a Pearson's r-squared of over 99%. Outliers can be attributed either to differences between the GAGESII basin delineations and the catchment representations in NHDPlus version 2, or to the fact that this data set only reports data from the downstream end of reaches and their upstream contribution. GAGESII basin outlets can occur anywhere along a reach segment, so values can differ depending on the variability of the source data being compiled and the distance between the GAGESII basin outlet and the downstream endpoint of the reach segment. NHDPlus version 2 reach segments can be as long as 95 kilometers, and some source data can be widely variable within an NHDPlus reach catchment. As a result, accumulated variables computed at the location of a GAGESII pour point can differ from those computed at the downstream end of the reach segment.</attraccr>
    </attracc>
    <logic>None</logic>
    <complete>None</complete>
    <posacc>
      <horizpa>
        <horizpar>No formal positional accuracy tests were conducted</horizpar>
      </horizpa>
      <vertacc>
        <vertaccr>No formal positional accuracy tests were conducted</vertaccr>
      </vertacc>
    </posacc>
    <lineage>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Gregory E. Schwarz</origin>
            <origin>Michael E. Wieczorek</origin>
            <pubdate>2018</pubdate>
            <title>Database of Modified Routing for NHDPlus Version 2.1 Flowlines: ENHDPlusV2_US</title>
            <geoform>tabular data</geoform>
            <pubinfo>
              <pubplace>Reston, VA</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://www.sciencebase.gov/catalog/item/5718d2f8e4b0ef3b7cabdc17</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online</typesrc>
        <srctime>
          <timeinfo>
            <sngdate>
              <caldate>2018</caldate>
            </sngdate>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>nhdplusus_v2.sas</srccitea>
        <srccontr>The source data provided routing information for the NHDPlusV2 flow line network.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Environmental Protection Agency</origin>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2012</pubdate>
            <title>National Hydrography Dataset Plus Version 2.1</title>
            <geoform>vector digital data</geoform>
            <pubinfo>
              <pubplace>Reston, VA</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>http://www.horizon-systems.com/NHDPlus/NHDPlusV2_data.php</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online</typesrc>
        <srctime>
          <timeinfo>
            <sngdate>
              <caldate>2012</caldate>
            </sngdate>
          </timeinfo>
          <srccurr>publication date</srccurr>
        </srctime>
        <srccitea>NHDPlus</srccitea>
        <srccontr>The source data provide the basic hydrologic framework used to process these data.</srccontr>
      </srcinfo>
      <procstep>
        <procdesc>1) Acquired the source data point file from James Falcone (USGS, 2010).

2) The number of major NPDES sites was calculated using ESRI's "IDENTITY" function in Python, with the point coverage and NHDPlus version 2 reach catchments as input. The "FREQUENCY" function was then used to count the number of NPDES sites for each NHDPlus version 2 reach catchment.

3) An aggregation algorithm was performed on the assigned variables. See the metadata for "Database of Modified Routing for NHDPlus Version 2.1 Flowlines: ENHDPlusV2_US" by Schwarz and Wieczorek in Source_Information for more details on the routing database. See the landing page for "Select Attributes for NHDPlus Version 2.1 Reach Catchments and Modified Network Routed Upstream Watersheds for the Conterminous United States" for a copy of the aggregation script and its details.

4) Exported results into a comma-separated text file format.</procdesc>
        <procdate>20151221</procdate>
      </procstep>
    </lineage>
  </dataqual>
  <spdoinfo>
    <indspref>Common Identifier Code for each catchment and reach segment (COMID)</indspref>
  </spdoinfo>
  <eainfo>
    <detailed>
      <enttyp>
        <enttypl>NPDES_MAJ_CONUS.txt</enttypl>
        <enttypd>Comma-separated tabular data</enttypd>
        <enttypds>U.S. Geological Survey</enttypds>
      </enttyp>
      <attr>
        <attrlabl>COMID</attrlabl>
        <attrdef>Unique ID to relate to each NHDPlus version 2 catchment and flow line.</attrdef>
        <attrdefs>http://nhd.usgs.gov/NHDDataDictionary_model2.0.pdf</attrdefs>
        <attrdomv>
          <udom>Unique numbers that are automatically generated.</udom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>CAT_NPDES_MAJ</attrlabl>
        <attrdef>Number of major NPDES sites per NHDPlus version 2 catchment.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>10</rdommax>
            <attrunit>Count.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>ACC_NPDES_MAJ</attrlabl>
        <attrdef>Accumulated number of major NPDES sites based on divergence routing.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>100</rdommax>
            <attrunit>Count.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>TOT_NPDES_MAJ</attrlabl>
        <attrdef>Accumulated number of major NPDES sites based on total upstream accumulation.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>2263</rdommax>
            <attrunit>Count.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
    </detailed>
    <detailed>
      <enttyp>
        <enttypl>NPDES_DENSITY_CONUS.txt</enttypl>
        <enttypd>Comma-separated tabular data</enttypd>
        <enttypds>U.S. Geological Survey</enttypds>
      </enttyp>
      <attr>
        <attrlabl>COMID</attrlabl>
        <attrdef>Unique ID to relate to each NHDPlus version 2 catchment and flow line.</attrdef>
        <attrdefs>http://nhd.usgs.gov/NHDDataDictionary_model2.0.pdf</attrdefs>
        <attrdomv>
          <udom>Unique numbers that are automatically generated.</udom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>CAT_NPDES_MAJ_DENS</attrlabl>
        <attrdef>Density of major NPDES sites per NHDPlus version 2 catchment.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>27777.78</rdommax>
            <attrunit>sites per 100 square kilometers.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>ACC_NPDES_MAJ</attrlabl>
        <attrdef>Accumulated density of major NPDES sites based on divergence routing. -9999 denotes flow line reach is disconnected from the network and cannot be accumulated.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>5555.56</rdommax>
            <attrunit>sites per 100 square kilometers.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>TOT_NPDES_MAJ</attrlabl>
        <attrdef>Accumulated density of major NPDES sites based on total upstream accumulation. -9999 denotes flow line reach is disconnected from the network and cannot be accumulated.</attrdef>
        <attrdefs>U.S. Geological Survey</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>5555.56</rdommax>
            <attrunit>sites per 100 square kilometers.</attrunit>
          </rdom>
        </attrdomv>
      </attr>
    </detailed>
  </eainfo>
  <distinfo>
    <distrib>
      <cntinfo>
        <cntperp>
          <cntper>Michael E. Wieczorek</cntper>
          <cntorg>U.S. Geological Survey - ScienceBase</cntorg>
        </cntperp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>Denver Federal Center, Building 810, Mail Stop 302</address>
          <city>Denver</city>
          <state>CO</state>
          <postal>80225</postal>
          <country>USA</country>
        </cntaddr>
        <cntvoice>443-498-5550</cntvoice>
        <cntemail>mewieczo@usgs.gov</cntemail>
      </cntinfo>
    </distrib>
    <distliab>Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Although this database has been subjected to rigorous review and is substantially complete, the USGS reserves the right to revise the data pursuant to further analysis and review. Furthermore, it is released on condition that neither the USGS nor the U.S. Government may be held liable for any damages resulting from its authorized or unauthorized use.
Although these data have been processed successfully on a computer system at the U.S. Geological Survey, no warranty expressed or implied is made regarding the display or utility of the data on any other system, or for general or scientific purposes, nor shall the act of distribution constitute any such warranty.  The U.S. Geological Survey shall not be held liable for improper or incorrect use of the data described and/or contained herein.</distliab>
  </distinfo>
  <metainfo>
    <metd>20260326</metd>
    <metc>
      <cntinfo>
        <cntperp>
          <cntper>Michael Wieczorek</cntper>
          <cntorg>US Geological Survey</cntorg>
        </cntperp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>5522 Research Park Drive</address>
          <city>Baltimore</city>
          <state>Maryland</state>
          <postal>21228</postal>
          <country>USA</country>
        </cntaddr>
        <cntvoice>443-498-5550</cntvoice>
        <cntemail>mewieczo@usgs.gov</cntemail>
      </cntinfo>
    </metc>
    <metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn>
    <metstdv>FGDC-STD-001-1998</metstdv>
  </metainfo>
</metadata>
