Amy M. Russell
Thomas M. Over
William H. Farmer
20200728
Cross-validation results for five statistical methods of daily streamflow estimation at 1,385 reference streamgages in the conterminous United States, Water Years 1981-2017
Tabular digital data
Urbana, IL
U.S. Geological Survey
https://doi.org/10.5066/P9XT4WSP
This data release contains daily time series estimates of natural streamflow for 1,385 streamgages in 19 study regions in the conterminous U.S. from October 1, 1980, through September 30, 2017. These estimates are provided for gages from mostly undisturbed watersheds as defined by Falcone (2011), using five statistical techniques: nearest-neighbor drainage area ratio (NNDAR), map-correlation drainage area ratio (MCDAR), nearest-neighbor nonlinear spatial interpolation using flow duration curves (NNQPPQ), map-correlation nonlinear spatial interpolation using flow duration curves (MCQPPQ), and ordinary kriging of the logarithms of discharge per unit area (OKDAR). Location information and basin characteristics for study gages were obtained from the "Reference" gages of the GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011). Observed daily streamflow data were retrieved from the National Water Information System (NWIS) on September 7, 2018. NNDAR, MCDAR, NNQPPQ, and MCQPPQ estimates were computed following methods described by Farmer and others (2014), with updates to the flow-duration curve modeling as described by Over and others (2018). OKDAR estimates were computed using pooled variograms for each study region following methods described by Farmer (2016). Daily streamflow estimation was conducted in a leave-one-out cross-validation approach in which each streamgage was treated as if ungaged and all the remaining streamgages in a study region were used to calibrate each method and perform estimations at the "ungaged" site. The observed streamflow records were compared to the five simulated streamflow records to help assess the performance of each method. The resulting performance metrics are provided at each gage for all five statistical methods.
References cited:
Falcone, J.A., 2011, GAGES-II: Geospatial Attributes of Gages for Evaluating Streamflow [digital spatial dataset]: U.S. Geological Survey Water Resources NSDI Node web page, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011.
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p., http://dx.doi.org/10.3133/sir20145231.
Farmer, W.H., 2016, Ordinary kriging as a tool to estimate historical daily streamflow records: Hydrology and Earth System Sciences, v. 20, p. 2721-2735, https://doi.org/10.5194/hess-20-2721-2016.
Over, T.M., Farmer, W.H., and Russell, A.M., 2018, Refinement of a regression-based method for prediction of flow-duration curves of daily streamflow in the conterminous United States: U.S. Geological Survey Scientific Investigations Report 2018–5072, https://doi.org/10.3133/sir20185072.
The purpose of this data release is to inform hydrologic characterization at ungaged locations.
19801001
20170930
ground condition
None planned
-125.59570312358
-65.478515625979
49.353470715064
24.166402900476
ISO 19115 Topic Category
inlandWaters
USGS Thesaurus
streamflow
statistical analysis
None
reference hydrology
USGS Metadata Identifier
USGS:5df2c29ee4b02caea0f95b92
None
Conterminous United States
none
none
Amy M Russell
CENTRAL MIDWEST WATER SCIENCE CENTER
Hydrologist
mailing and physical
405 N. Goodwin Avenue
Urbana
IL
61801
United States
217-328-9773
arussell@usgs.gov
No formal attribute accuracy tests were conducted
No formal logical accuracy tests were conducted
Data set is considered complete for the information presented, as described in the abstract. Users are advised to read the rest of the metadata record carefully for additional details.
No formal positional accuracy tests were conducted
No formal positional accuracy tests were conducted
Location information and basin characteristics for study gages were obtained from the "Reference" gages of the GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011). Index gages were USGS streamgages within each GAGES-II study region that were identified as being of “reference” quality in the GAGES-II dataset with at least 10 complete water years (WYs) during the study period from WY1981 through WY2017. Observed daily streamflow data for 1,385 index gages were retrieved from the National Water Information System (NWIS).
20180907
NNDAR, MCDAR, NNQPPQ, MCQPPQ, and OKDAR estimates of natural streamflow were computed at 1,385 streamgages. NNDAR and MCDAR estimates were computed following methods described by Farmer and others (2014). NNQPPQ and MCQPPQ estimates were computed following methods described by Farmer and others (2014), with updates to the flow-duration curve modeling as described by Over and others (2018). OKDAR estimates were computed using pooled variograms for each study region following methods described by Farmer (2016). Daily streamflow estimation was conducted in a leave-one-out cross-validation approach in which each streamgage was treated as if ungaged and all the remaining streamgages in a study region were used to calibrate each method and perform estimations at the "ungaged" site.
201811
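The leave-one-out cross-validation loop described in this process step can be illustrated with a minimal sketch. It is not the code used to produce this data release. It assumes a simple form of the nearest-neighbor drainage area ratio method in which the index gage is the closest remaining gage by straight-line distance and its flows are scaled by the ratio of drainage areas. The dictionary keys, the function names, and the three-gage region are hypothetical, and all values are made up for illustration.

```python
import math

def nearest_neighbor(target, candidates):
    """Return the candidate gage closest to the target (planar distance, assumed)."""
    return min(
        candidates,
        key=lambda g: math.hypot(g["x"] - target["x"], g["y"] - target["y"]),
    )

def dar_estimate(target, index):
    """Scale the index gage's daily flows by the ratio of drainage areas."""
    ratio = target["area"] / index["area"]
    return [ratio * q for q in index["flow"]]

def leave_one_out(gages):
    """Treat each gage in turn as ungaged and estimate it from the others."""
    estimates = {}
    for target in gages:
        others = [g for g in gages if g["id"] != target["id"]]
        index = nearest_neighbor(target, others)
        estimates[target["id"]] = dar_estimate(target, index)
    return estimates

# Hypothetical three-gage region; coordinates, areas, and flows are invented.
gages = [
    {"id": "A", "x": 0.0, "y": 0.0, "area": 100.0, "flow": [10.0, 20.0]},
    {"id": "B", "x": 1.0, "y": 0.0, "area": 50.0, "flow": [4.0, 12.0]},
    {"id": "C", "x": 5.0, "y": 5.0, "area": 200.0, "flow": [30.0, 30.0]},
]
loo = leave_one_out(gages)
```

Each entry of `loo` can then be compared with the withheld observed record for that gage, which is the comparison the performance metrics in this release summarize.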
region##.zip, where ## represents the GAGES-II HUC02 study region - contains an individual tab-delimited text file for each reference gage.
Estimates of daily streamflow at reference gages. Each text file is named output_#### where the #'s represent the USGS Station ID number (8-15 digits).
Producer defined
date
Date of observed and estimated streamflow
Producer defined
1980-10-01
2017-09-30
date in YYYY-MM-DD format
obs
Computed daily mean streamflow reported from NWIS for given USGS station.
U.S. Geological Survey
-2090
178000
cubic feet per second (cfs)
obsP
Exceedance probability of observed streamflow – the probability of the observed streamflow being equaled or exceeded on any given day. When the observed streamflow is zero, obsP is set to ‘NA’ in this study.
Producer defined
0
1
NNDAR
Estimate of daily streamflow using the Nearest-Neighbor Drainage Area Ratio (NNDAR) method. NNDAR estimates were computed following methods described by Farmer and others (2014).
Producer defined
-281.54
306485.8
cubic feet per second (cfs)
MCDAR
Estimate of daily streamflow using the Map-Correlation Drainage Area Ratio (MCDAR) method. MCDAR estimates were computed following methods described by Farmer and others (2014).
Producer defined
0
335188.9
cubic feet per second (cfs)
NNQPPQ
Estimate of daily streamflow using the nearest-neighbor nonlinear spatial interpolation using flow duration curves (NNQPPQ) method. NNQPPQ estimates were computed following methods described by Farmer and others (2014), with updates to the flow-duration curve modeling as described by Over and others (2018).
Producer defined
0
15882161
cubic feet per second (cfs)
MCQPPQ
Estimate of daily streamflow using the map-correlation nonlinear spatial interpolation using flow duration curves (MCQPPQ) method. MCQPPQ estimates were computed following methods described by Farmer and others (2014), with updates to the flow-duration curve modeling as described by Over and others (2018).
Producer defined
0
15882161
cubic feet per second (cfs)
OKDAR
Estimate of daily streamflow using the Ordinary Kriging of the logarithms of discharge per unit area (OKDAR) method. OKDAR estimates were computed using pooled variograms for each study region following methods described by Farmer (2016).
Producer defined
0
210531.9
cubic feet per second (cfs)
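The per-gage output files described above can be read with a short script such as the following sketch. It assumes that each tab-delimited file begins with a header row containing the attribute names listed above (date, obs, obsP, NNDAR, MCDAR, NNQPPQ, MCQPPQ, OKDAR) and that missing values are written as NA; neither detail is confirmed beyond what this record states. The function names and the sample rows are hypothetical, and the sample values are invented.

```python
import csv
import io

FLOW_COLUMNS = ["obs", "obsP", "NNDAR", "MCDAR", "NNQPPQ", "MCQPPQ", "OKDAR"]

def parse_value(text):
    """Convert a field to float, mapping 'NA' or an empty field to None."""
    text = text.strip()
    return None if text in ("NA", "") else float(text)

def read_estimates(handle):
    """Read one tab-delimited output file into a list of row dictionaries."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        row = {"date": record["date"]}
        for name in FLOW_COLUMNS:
            row[name] = parse_value(record[name])
        rows.append(row)
    return rows

# Invented two-day sample standing in for one output file (header row assumed).
sample = (
    "date\tobs\tobsP\tNNDAR\tMCDAR\tNNQPPQ\tMCQPPQ\tOKDAR\n"
    "1980-10-01\t12.0\t0.45\t11.2\t12.9\t10.8\t12.4\t11.9\n"
    "1980-10-02\t0\tNA\t0.4\t0\t0\t0\t0.2\n"
)
rows = read_estimates(io.StringIO(sample))
```

To read a real file, pass an open text-mode file handle for an extracted output file to `read_estimates` in place of the `io.StringIO` sample.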
CONUS_PMs_byStation.csv
Computed Performance Metrics for each statistical method of streamflow estimation. Many of the performance metrics use a log-transformation of the daily streamflow; therefore, observed and simulated streamflows of zero were set to 0.001 cfs in computation of performance metrics.
Producer defined
Region
Study region for each streamgage
GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011)
01
18
StationID
Unique USGS streamgage identification number
U.S. Geological Survey - National Water Information System
U.S. Geological Survey Site Number
National Water Information System (NWIS) database
Method
Statistical method of daily time series estimation
Producer defined
NNDAR
Drainage area ratio method with nearest neighbor index selection
Producer defined
MCDAR
Drainage area ratio method with map correlation index selection
Producer defined
NNQPPQ
QPPQ simulation method with nearest neighbor index gage selection
Producer defined
MCQPPQ
QPPQ simulation method with map correlation index gage selection
Producer defined
OKDAR
Ordinary kriging of logarithms of discharge per unit area
Producer defined
PM
Performance metric used to evaluate streamflow estimates
Producer defined
nse
Nash-Sutcliffe efficiency of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nsel
Nash-Sutcliffe efficiency of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmse
Root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsne
Root mean square normalized error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nrmse
Normalized root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cvrmse
Coefficient of variation of the root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsel
Root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsnel
Root mean square normalized error of log-transformed daily streamflows. This statistic is undefined when the divisor is zero. Because the logarithm of 1 is zero, any site with even a single day of observed flow of 1 cfs will contain ‘Inf’ for this performance metric.
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nrmsel
Normalized root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cvrmsel
Coefficient of variation of the root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
perr
Average percent errors of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
meandiffl
Mean of the differences of the log-transformed daily streamflows
Producer defined
cor.p
Pearson correlations of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cor.s
Spearman correlations of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
MeanRat
Ratio of mean of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
VarRat
Ratio of coefficient of variation of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
logMeanRat
Ratio of mean of log-transformed daily streamflows
Modified from SIR 2018-5072 (Over and others, 2018)
logVarRat
Ratio of coefficient of variation of log-transformed daily streamflows
Modified from SIR 2018-5072 (Over and others, 2018)
q0.###, where ### is a flow quantile
Ratio of 0.### quantile of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
Value
Numeric value of performance metric
Producer defined
-3.634639e+04
9.437883e+12
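The handling of zero flows in the log-transformed metrics of CONUS_PMs_byStation.csv can be illustrated with a minimal sketch of the Nash-Sutcliffe efficiency (nse) and its log-transformed counterpart (nsel). The sketch uses the standard Nash-Sutcliffe definition and the 0.001 cfs substitution stated in the entity description. The natural logarithm is an assumption, although the Nash-Sutcliffe efficiency does not depend on the base of the logarithm. The function names and example flows are hypothetical, and negative flows are not handled.

```python
import math

ZERO_FLOW_SUBSTITUTE = 0.001  # cfs, per the entity description

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squared deviations of obs)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread

def log_flows(flows):
    """Replace zero flows with 0.001 cfs, then take natural logs (base assumed)."""
    return [math.log(ZERO_FLOW_SUBSTITUTE if q == 0 else q) for q in flows]

def nsel(obs, sim):
    """Nash-Sutcliffe efficiency of the log-transformed flows."""
    return nse(log_flows(obs), log_flows(sim))

# Invented four-day example that includes a zero-flow day.
obs = [0.0, 2.0, 4.0, 6.0]
sim = [0.0, 3.0, 4.0, 5.0]
nse_value = nse(obs, sim)
nsel_value = nsel(obs, sim)
```

Without the substitution, the zero-flow day would make the logarithm undefined and the log-transformed metric could not be computed for this record.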
CONUS_PM__summaries_byRegion.csv
Summaries of Performance Metrics by study region with infinite values removed.
U.S. Geological Survey
Region
Study region
GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011)
01
18
Method
Statistical method of daily time series estimation
Producer defined
NNDAR
Drainage area ratio method with nearest neighbor index selection
Producer defined
MCDAR
Drainage area ratio method with map correlation index selection
Producer defined
NNQPPQ
QPPQ simulation method with nearest neighbor index gage selection
Producer defined
MCQPPQ
QPPQ simulation method with map correlation index gage selection
Producer defined
OKDAR
Ordinary kriging of logarithms of discharge per unit area
Producer defined
PM
Performance metric used to evaluate streamflow estimates
Producer defined
nse
Nash-Sutcliffe efficiency of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nsel
Nash-Sutcliffe efficiency of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmse
Root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsne
Root mean square normalized error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nrmse
Normalized root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cvrmse
Coefficient of variation of the root mean square error of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsel
Root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
rmsnel
Root mean square normalized error of log-transformed daily streamflows. This statistic is undefined when the divisor is zero. Sites containing ‘Inf’ for this performance metric were not included in the summary statistic calculations.
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
nrmsel
Normalized root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cvrmsel
Coefficient of variation of the root mean square error of log-transformed daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
perr
Average percent errors of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
meandiffl
Mean of the differences of the log-transformed daily streamflows
Producer defined
cor.p
Pearson correlations of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
cor.s
Spearman correlations of daily streamflows
Definition provided in SIR 2014-5231 (Farmer and others, 2014)
MeanRat
Ratio of mean of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
VarRat
Ratio of coefficient of variation of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
logMeanRat
Ratio of mean of log-transformed daily streamflows
Modified from SIR 2018-5072 (Over and others, 2018)
logVarRat
Ratio of coefficient of variation of log-transformed daily streamflows
Modified from SIR 2018-5072 (Over and others, 2018)
q0.###, where ### is a flow quantile
Ratio of 0.### quantile of daily streamflows
Definition provided in SIR 2018-5072 (Over and others, 2018)
mean
Average value of PM for all gages in each region by method
Producer defined
-5.378312e+02
1.970374e+11
median
Median value of PM for all gages in each region by method
Producer defined
-1.878265e+00
5.554143e+06
min
Minimum value of PM for all gages in each region by method
Producer defined
-36346.39
40634.88
max
Maximum value of PM for all gages in each region by method
Producer defined
1.451997e-01
9.437883e+12
n
Number of reference gages in each region
Producer defined
13
199
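The regional summaries described above (mean, median, min, and max of each performance metric by region and method, with infinite values removed) can be approximated from the by-station table with a sketch like the following. It is an illustration under stated assumptions, not the procedure used to build the file. The record layout mirrors the Region, Method, PM, and Value attributes, the function name and sample records are hypothetical, and the count returned here (`n_finite`) is the number of finite values in each group, which may differ from the n attribute (number of reference gages in each region).

```python
import math
import statistics
from collections import defaultdict

def summarize(records):
    """Group finite values by (Region, Method, PM) and compute summary statistics."""
    groups = defaultdict(list)
    for rec in records:
        if math.isfinite(rec["Value"]):  # drops inf, -inf, and NaN
            groups[(rec["Region"], rec["Method"], rec["PM"])].append(rec["Value"])
    return {
        key: {
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
            "n_finite": len(values),  # hypothetical name; not the file's n attribute
        }
        for key, values in groups.items()
    }

# Invented records for one region, method, and metric; one value is infinite.
records = [
    {"Region": "01", "Method": "NNDAR", "PM": "rmsnel", "Value": 0.5},
    {"Region": "01", "Method": "NNDAR", "PM": "rmsnel", "Value": math.inf},
    {"Region": "01", "Method": "NNDAR", "PM": "rmsnel", "Value": 1.5},
    {"Region": "01", "Method": "NNDAR", "PM": "rmsnel", "Value": 1.0},
]
summary = summarize(records)
```

Filtering before aggregation matters because a single infinite value would otherwise make the group mean and maximum infinite.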
reference_gages_summary.csv
Summary of station information
Producer defined
REGION
Study region as assigned by GAGES-II HUC02 field
GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011)
1
18
STAID
Unique USGS streamgage identification number
U.S. Geological Survey - National Water Information System
U.S. Geological Survey Site Number
National Water Information System (NWIS) database
MEAN_DAILY_FLOW
Average of daily flow data for each site
Producer defined
0
8492.1
cubic feet per second (cfs)
DRAIN_SQKM
Drainage area
GAGES-II dataset (Falcone, 2011, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011)
1.5
25791.0
Square kilometers
START_DATE
Beginning date of streamflow record used in study
Producer defined
1980-10-01
2007-10-01
Date in YYYY-MM-DD format
END_DATE
Ending date of streamflow record used in study
Producer defined
1990-09-30
2017-09-30
Date in YYYY-MM-DD format
COMP_WY_RANGE
List of complete water years included in study
Producer defined
List of water years with complete streamflow records; discontinuous time periods separated by ";"
COMP_WY_COUNT
Number of complete water years included in study
Producer defined
10
37
years
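The water-year bookkeeping behind COMP_WY_RANGE and COMP_WY_COUNT can be sketched as follows. A water year runs from October 1 through September 30 and is named for the calendar year in which it ends, so the study period of October 1, 1980, through September 30, 2017, spans WY1981 through WY2017. The sketch assumes that a water year is complete when every calendar day in it has a record; the exact completeness criterion used in the study is not stated in this record, and the function names are hypothetical.

```python
from datetime import date, timedelta

def water_year(day):
    """Return the water year (Oct 1 to Sep 30, named for its ending year)."""
    return day.year + 1 if day.month >= 10 else day.year

def complete_water_years(days_with_data):
    """Return sorted water years in which every calendar day has a record (assumed criterion)."""
    counts = {}
    for day in set(days_with_data):  # ignore duplicate dates
        wy = water_year(day)
        counts[wy] = counts.get(wy, 0) + 1
    complete = []
    for wy, n in counts.items():
        expected = (date(wy, 10, 1) - date(wy - 1, 10, 1)).days  # 365 or 366
        if n == expected:
            complete.append(wy)
    return sorted(complete)

# Invented record: continuous daily data from 1980-10-01 through 1982-03-31.
start, end = date(1980, 10, 1), date(1982, 3, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
complete = complete_water_years(days)
```

In this example only WY1981 is complete, because the record ends partway through WY1982.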
U.S. Geological Survey - ScienceBase
U.S. Geological Survey
mailing and physical
Denver Federal Center, Building 810, Mail Stop 302
Denver
CO
80225
USA
1-888-275-8747
sciencebase@usgs.gov
Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. Although this information product, for the most part, is in the public domain, it also contains copyrighted materials as noted in the text. Permission to reproduce copyrighted items must be secured from the copyright owner whenever applicable. The data have been approved for release and publication by the U.S. Geological Survey (USGS). Although the data have been subjected to rigorous review and are substantially complete, the USGS reserves the right to revise the data pursuant to further analysis and review. Furthermore, the data are released on the condition that neither the USGS nor the U.S. Government may be held liable for any damages resulting from authorized or unauthorized use. Although the data have been processed successfully on a computer system at the U.S. Geological Survey, no warranty expressed or implied is made regarding the display or utility of the data on any other system, or for general or scientific purposes, nor shall the act of distribution constitute any such warranty. The U.S. Geological Survey shall not be held liable for improper or incorrect use of the data described and/or contained herein. Users of the data are advised to read all metadata and associated documentation thoroughly to understand appropriate use and data limitations.
The zip files contain data in tab-delimited text files. The user must have software capable of uncompressing the zipped file.
20200813
Amy M Russell
CENTRAL MIDWEST WATER SCIENCE CENTER
Supervisory Hydrologist
mailing and physical
405 N. Goodwin Avenue
Urbana
IL
61801
United States
217-328-9773
arussell@usgs.gov
Content Standard for Digital Geospatial Metadata
FGDC-STD-001-1998