Samuel H Austin
20210903
Terms, Statistics, and Performance Measures for Maximum Likelihood Logistic Regression Models Estimating Hydrological Drought Probabilities in the Northeastern United States (2019)
Numerical, tabular data in comma separated value (.csv) format.
Reston, VA
U.S. Geological Survey
https://doi.org/10.5066/P9E3SK56
https://doi.org/10.5066/P9E3SK56
Samuel H. Austin
2021
Forecasting drought probabilities for streams in the northeastern United States
publication
n/a
U.S. Geological Survey
https://doi.org/10.3133/sir20215084
The tables list the parameters used in logistic regression equations that describe drought streamflow probabilities in the northeastern United States. Each model estimates the probability (chance) that a particular streamflow daily value will exceed, or not exceed, an identified drought streamflow threshold. Each model is described by streamflow daily data, streamflow monthly mean data, maximum likelihood logistic regression (MLLR) equation explanatory parameters, equation goodness-of-fit parameters, and Receiver Operating Characteristic (ROC) area under the curve (AUC) values that identify the utility of each relation. These models are key inputs to drought forecasting web applications for the northeastern United States (https://usgs.maps.arcgis.com/apps/MapSeries/index.html?appid=b8c5da617a0e4d628e3e39f7dbd512da).
The data were obtained and developed to identify and describe terms used in maximum likelihood logistic regression (MLLR) models estimating streamflow drought probabilities at selected USGS gaged basins in the northeastern United States.
18840101
20181031
ground condition
As needed
-84.067382812719
-66.137695313435
47.768574328315
35.977652111331
USGS Thesaurus
decision support systems
drought
forecasting
risk assessment
streamflow
surface water hydrology
water supply
watershed management
USGS Metadata Identifier
USGS:5d681d5de4b0c4f70cf15c9b
Geographic Names Information System (GNIS)
Connecticut
District of Columbia
Delaware
Massachusetts
Maryland
Maine
New Hampshire
New Jersey
New York
Pennsylvania
Rhode Island
Virginia
Vermont
West Virginia
none
none
Samuel H Austin
Northeast Region: VA AND WV WSC
Hydrologist
mailing and physical
1730 East Parham Road
Richmond
VA
23228
United States
804-261-2620
saustin@usgs.gov
Statistical analyses indicate that attribute values are correct.
Actual data fall within expected ranges. Dataset was checked for duplication, errors, and omissions. None were found.
The dataset is considered complete for the information presented, as described in the abstract. All sites identified and accepted for study had data catalogued in the National Water Information System (NWIS) spanning more than 10 years, the minimum period of record required for the study. Users are advised to read the full metadata record for additional details.
Formal positional accuracy tests were not applicable to these data.
Formal positional accuracy tests were not applicable to these data.
Streamflow daily data were accessed and downloaded from USGS NWIS.
20181203
Data were inspected and statistically evaluated to ensure that values accepted into NWIS fell within statistically reasonable ranges and did not include coding or transcription errors.
2019
Models were developed using Maximum Likelihood Logistic Regression (MLLR).
2019
Models were tested and evaluated using methods described in the Scientific Investigations Report accompanying this data release "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX).
2019
Models were published.
2021
Data are model parameters and streamflow daily value measurements collected at selected USGS stream gages located within the following states and districts: Connecticut (CT), Washington, D.C. (DC), Delaware (DE), Massachusetts (MA), Maryland (MD), Maine (ME), New Hampshire (NH), New Jersey (NJ), New York (NY), Pennsylvania (PA), Rhode Island (RI), Virginia (VA), Vermont (VT), and West Virginia (WV). Daily value measurement data may be located by USGS stream gage number.
Point
Austin-Northeastern US Drought Study Data Release File 1
The dataset describes the terms used in each MLLR model. These are the final output models that represent future drought probabilities. They may be used to populate equations describing model output. Data are presented as 12 comma separated values per row. Each row presents values for 1 MLLR model.
Samuel H. Austin
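As an illustration of how the Intercept Estimate (B0) and Slope Estimate (B1) columns might populate a model equation, the following minimal sketch evaluates the standard logistic response function. It is an assumption that the models take this two-term form; this record does not state whether the explanatory variable is transformed (for example, log-transformed) or whether a given equation returns P[Y] or P[N], so the companion Scientific Investigations Report should be consulted before use. The function name and the coefficient values are hypothetical and are not taken from the data release.

```python
import math


def drought_probability(b0: float, b1: float, x: float) -> float:
    """Standard logistic response: p = 1 / (1 + exp(-(b0 + b1 * x))).

    b0 and b1 stand in for the Intercept Estimate and Slope Estimate
    columns; x stands in for the explanatory variable value. Any
    transformation of x is NOT applied here (assumption).
    """
    z = b0 + b1 * x
    # Evaluate in a numerically stable way for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients chosen so that b0 + b1 * x = 0.
p = drought_probability(b0=-2.0, b1=0.5, x=4.0)
print(round(p, 3))  # 0.5
```

With real data, b0 and b1 would be read from one row of File 1 and x supplied by the user for the month named in the Explanatory Variable (x) label.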
STATION NUMBER
The USGS gaging station number.
USGS
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
State
The northeastern State abbreviation associated with the data entry.
USGS
FIPS codes
https://catalog.data.gov/dataset/fips-state-codes
Response Variable (y)
The label identifying the probability prediction response variable (y-value).
Producer defined.
A label identifying the state and month associated with the Response Variable (y).
Explanatory Variable (x)
The label identifying the month of the mean monthly streamflow explanatory variable (x-value) used in the probability prediction.
Producer defined.
A label identifying the month associated with the Explanatory Variable (x).
Model Long Label
A label consisting of concatenated text listing unique descriptors identifying the model elements associated with the model record.
Producer defined.
A label consisting of concatenated text listing unique descriptors identifying the model elements associated with the model record.
Intercept Estimate
The numerical value associated with model term B0.
Producer defined.
The numerical value associated with model term B0.
Slope Estimate
The numerical value associated with model term B1.
Producer defined.
The numerical value associated with model term B1.
Intercept Estimate Standard Error
A numerical value describing the standard error associated with model term B0.
Producer defined.
A numerical value describing the standard error associated with model term B0.
Slope Estimate Standard Error
A numerical value describing the standard error associated with model term B1.
Producer defined.
A numerical value describing the standard error associated with model term B1.
Intercept Estimate ChiSquare
A numerical value describing the ChiSquare estimate associated with model term B0.
Producer defined.
A numerical value describing the ChiSquare estimate associated with model term B0.
Slope Estimate ChiSquare
A numerical value describing the ChiSquare estimate associated with model term B1.
Producer defined.
A numerical value describing the ChiSquare estimate associated with model term B1.
Intercept Estimate Prob. ChiSq
A numerical value describing the probability value (p-value) associated with the Intercept Estimate ChiSquare.
Producer defined.
A numerical value describing the probability value (p-value) associated with the Intercept Estimate ChiSquare.
Slope Estimate Prob. ChiSq
A numerical value describing the probability value (p-value) associated with the Slope Estimate ChiSquare.
Producer defined.
A numerical value describing the probability value (p-value) associated with the Slope Estimate ChiSquare.
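The relation among an estimate, its standard error, a ChiSquare value, and the associated p-value can be sketched as follows. This record does not state which chi-square statistic was reported (Wald or likelihood ratio), so the sketch assumes a Wald statistic with 1 degree of freedom; if the tabulated values come from a likelihood-ratio test, they will not match this calculation. Function names and input values are hypothetical.

```python
import math


def wald_chi_square(estimate: float, std_error: float) -> float:
    """Wald statistic for one coefficient: (estimate / standard error)^2."""
    return (estimate / std_error) ** 2


def chi_square_p_value_1df(chi_sq: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so no external library is needed.
    """
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Hypothetical slope estimate and standard error, not from the data release.
chi_sq = wald_chi_square(estimate=0.5, std_error=0.25)
p_value = chi_square_p_value_1df(chi_sq)
print(chi_sq, round(p_value, 4))  # 4.0 0.0455
```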
Intercept Estimate ~Bias
A word or phrase describing any bias associated with the Intercept Estimate.
Producer defined.
A word or phrase describing any bias associated with the Intercept Estimate.
Slope Estimate ~Bias
A word or phrase describing any bias associated with the Slope Estimate.
Producer defined.
A word or phrase describing any bias associated with the Slope Estimate.
ROC AUC P[N]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model results describing "the probability of no" P[N] meaning "not exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
ROC AUC P[Y]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model results describing "the probability of yes" P[Y] meaning "exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
ROC AUC on Validation Data P[N]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model validation data describing "the probability of no" P[N] meaning "not exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
ROC AUC on Validation Data P[Y]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model validation data describing "the probability of yes" P[Y] meaning "exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
ROC AUC on Test Data P[N]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model test data describing "the probability of no" P[N] meaning "not exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
ROC AUC on Test Data P[Y]
A numerical value identifying the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) for the model test data describing "the probability of yes" P[Y] meaning "exceeding the drought minimum flow threshold." Please reference the companion Scientific Investigations Report "Forecasting Drought Probabilities for Streams in the Northeast United States" (SIR 2021-XXXX) for detailed explanation.
Producer defined.
0
1
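For readers unfamiliar with the ROC AUC columns above, the following minimal sketch shows one common way such a value between 0 and 1 can be computed: as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted probability, with ties counted as one half. This pairwise formulation is equivalent to the area under the ROC curve. It is not known from this record how the published values were computed; the function name, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases.

    scores: predicted probabilities; labels: 1 for the event, 0 otherwise.
    Quadratic in the number of cases, which is fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 5 of the 6 positive/negative pairs are ordered correctly.
scores = [0.9, 0.4, 0.35, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

A value near 0.5 indicates a model that ranks cases no better than chance, and a value near 1 indicates near-perfect ranking.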
Austin-Northeastern US Drought Study Data Release File 2
The dataset lists the source data used in the MLLR models. Data are presented as 5 comma separated values per row. The dataset contains 24,628,658 records and may not be easily opened with standard spreadsheet software.
USGS National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
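Because a file of this size may not open in spreadsheet software, one option is to stream it row by row. The sketch below counts records per station without loading the whole file into memory. It assumes a single header row and the five-column order listed for this file (Agency, Station Number, Date, DV, DV code); the sample rows, the date format, and the file name in the closing note are hypothetical.

```python
import csv
import io
from collections import Counter


def count_records_per_station(lines):
    """Stream CSV rows one at a time and tally rows per station number."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the assumed header row
    counts = Counter()
    for row in reader:
        if len(row) != 5:
            continue  # skip blank or malformed rows
        agency, station, date, dv, dv_code = row
        counts[station] += 1
    return counts


# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "Agency,Station Number,Date,DV,DV code\n"
    "USGS,01010000,2018-10-01,1200,A\n"
    "USGS,01010000,2018-10-02,1150,A\n"
    "USGS,01011000,2018-10-01,830,A\n"
)
counts = count_records_per_station(sample)
print(dict(counts))  # {'01010000': 2, '01011000': 1}
```

To run against the downloaded file, the same function could be called on an open file handle, for example `with open("file2.csv", newline="") as f: count_records_per_station(f)`, where `file2.csv` is a placeholder name. Station numbers are kept as text so that leading zeros are preserved.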
Agency
An abbreviation identifying the agency from which the data were sourced.
U.S. Geological Survey
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
Station Number
The USGS gaging station number.
USGS
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
Date
The year, month, and day that daily value (DV) data were recorded.
USGS
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
DV
A numerical value describing a streamflow daily value (DV) datum in cubic feet per second.
USGS
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
DV code
Text code describing condition of a daily value (DV) datum. Please reference the National Water Information System (NWIS) code reference https://help.waterdata.usgs.gov/codes-and-parameters for detailed information.
USGS
National Water Information System (NWIS)
National Water Information System (NWIS) https://waterdata.usgs.gov/nwis/
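The abstract refers to streamflow monthly mean data, while this file holds daily values. A minimal sketch of aggregating daily values into monthly means is shown below. It assumes dates in YYYY-MM-DD form and a simple arithmetic mean of all daily values in each month, with no screening by DV code; how the monthly means used in the models were actually derived is not described in this record. The function name and sample values are hypothetical.

```python
from collections import defaultdict


def monthly_means(records):
    """Average daily values by (station, 'YYYY-MM').

    records: iterable of (station, date_string, dv) tuples, where
    date_string is assumed to look like '2018-10-01' and dv is in
    cubic feet per second.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for station, date, dv in records:
        key = (station, date[:7])
        totals[key][0] += float(dv)
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


# Hypothetical daily values, not taken from the data release.
daily = [
    ("01010000", "2018-09-30", "900"),
    ("01010000", "2018-10-01", "1200"),
    ("01010000", "2018-10-02", "1000"),
]
means = monthly_means(daily)
print(means[("01010000", "2018-10")])  # 1100.0
```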
Samuel H Austin
Northeast Region: VA AND WV WSC
Hydrologist
mailing and physical
1730 East Parham Road
Richmond
VA
23228
United States
804-261-2620
saustin@usgs.gov
This database, identified as "Terms, Statistics, and Performance Measures for Maximum Likelihood Logistic Regression Models Estimating Hydrological Drought Probabilities in the Northeastern United States (2019)", has been approved for release by the U.S. Geological Survey (USGS). Although this database has been subjected to rigorous review and is substantially complete, the USGS reserves the right to revise the data pursuant to further analysis and review. Furthermore, the database is released on condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from its authorized or unauthorized use.
20210903
U.S. Geological Survey - ScienceBase
Denver Federal Center, Building 810, Mail Stop 302
Denver
CO
80225
United States
1-888-275-8747
sciencebase@usgs.gov
