Josh Woda
Jason Finkelstein
John Williams
20220502
Interpolation statistics for the Ellicottville sourcewater study area in upstate New York
Data Release
Troy, NY
U.S. Geological Survey
https://doi.org/10.5066/P96R5K5R
Nicholas Corson-Dosch
Michael N. Fienen
Jason S. Finkelstein
Andrew T. Leaf
Jeremy T. White
Joshua C. Woda
John H. Williams
2022
Areas contributing recharge to priority wells in valley-fill aquifers in the Neversink River and Rondout Creek drainage basins, New York
publication
n/a
US Geological Survey
https://doi.org/10.3133/sir20215112
This dataset includes spreadsheets with statistical data (mean and median absolute error) used in deciding which interpolation method best fit the corresponding dataset. All statistical data were paired with a visual inspection of the interpolation prior to determining the final raster product. All spreadsheets were generated using an automated Python script (Jahn, 2020).
The purpose of this dataset was to show the statistical error for each interpolation method that was tested using the automated Python script (Jahn, 2020).
2020
2021
ground condition
None planned
-78.818591306343
-78.499987790731
42.417284443107
42.197913871538
USGS Thesaurus
Sourcewater
Valley Fill
Bedrock Mapping
Aquifer mapping
Glacial geology
USGS Metadata Identifier
USGS:6050cd69d34eb1203124bda1
Getty Thesaurus of Geographic Names
Cattaraugus County
New York
Ellicottville
none
none
Joshua C Woda
NORTHEAST REGION: NEW YORK WATER SCI CTR
Hydrologist
mailing and physical
425 Jordan Road
Troy
NY
12180
US
518-285-5606
jwoda@usgs.gov
No formal attribute accuracy tests were conducted
Data had varying degrees of accuracy depending on the interpolation method. However, a small amount of error for an interpolation method did not guarantee that the result made geologic sense. Interpolation artifacts or other irregularities could rule out an interpolation despite favorable statistics. Visual checks were performed to account for this.
Dataset is considered complete for the information presented, as described in the abstract. Users are advised to read the rest of the metadata record carefully for additional details.
No formal positional accuracy tests were conducted
No formal positional accuracy tests were conducted
Kalle Jahn
2020
arcpro-interpol-compare
Python Script
GitHub
GitHub
https://github.com/kallejahn/arcpro-interpol-compare
Python Script
20190701
20200113
complete
Interpolation Testing
A Jupyter Notebook intended to reduce the amount of time spent manually comparing results from interpolation tools in ArcGIS Pro.
The spreadsheets presented here (and ultimately the associated digital surfaces) were created using an array of point and contour inputs and common interpolation methods, including Topo to Raster, Natural Neighbors, Kriging, and Bayesian Kriging. Interpolation methods were calibrated and validated for each dataset using a stepwise statistical testing approach that has since been automated (Jahn, 2020). First, the Geostatistical Wizard tool within ArcGIS was used to generate a statistical summary of accuracy for every available interpolation model in ArcMap 10.7. Additionally, for each model, 10% of all data points were excluded from the initial interpolation. A raster was then generated from the remaining 90% of points. Using this newly generated raster, the absolute difference between the raster value and the observed value was calculated at every data point. Error statistics were also calculated for the 10% of points not used in the interpolation. Mean and median error for these withheld points indicated how well the interpolation method performed away from the input data (the 90% of points that were used for interpolation). The models with the smallest mean and median absolute difference were chosen, taking into consideration statistics, visual inspection, and knowledge of the local geology. Typically, the Topo to Raster or Natural Neighbors interpolations were chosen based on a combination of statistics and visual inspection.
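The 90/10 hold-out comparison described in this process step can be illustrated with a minimal sketch. This is not the Jahn (2020) script and does not call ArcGIS. It assumes synthetic point data and uses a simple inverse-distance-weighted estimator as a stand-in for the interpolation tools named above. The function names (split_points, idw_predict, abs_error_stats) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def split_points(points, test_fraction=0.1, seed=0):
    """Randomly withhold a fraction of points; return (training, testing)."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def idw_predict(train, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    Stand-in interpolator (an assumption), not one of the ArcGIS methods.
    """
    num = 0.0
    den = 0.0
    for px, py, pz in train:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 == 0.0:
            return pz  # location coincides with an input point
        w = 1.0 / d2 ** (power / 2.0)
        num += w * pz
        den += w
    return num / den


def abs_error_stats(train, points):
    """Mean and median absolute difference between estimates and observations."""
    errors = [abs(idw_predict(train, x, y) - z) for x, y, z in points]
    return statistics.mean(errors), statistics.median(errors)


# Synthetic (x, y, elevation) points; purely illustrative values.
rng = random.Random(42)
points = []
for _ in range(200):
    x, y = rng.uniform(0, 100), rng.uniform(0, 100)
    points.append((x, y, 0.5 * x + 0.25 * y))

train, test = split_points(points, test_fraction=0.1)

# Statistics analogous to the four error columns in the spreadsheets.
train_mean, train_median = abs_error_stats(train, train)
test_mean, test_median = abs_error_stats(train, test)

print("Mean Err - training pts:  ", train_mean)
print("Median Err - training pts:", train_median)
print("Mean Err - testing pts:   ", test_mean)
print("Median Err - testing pts: ", test_median)
```

Because the stand-in estimator reproduces its input points exactly, the training-point errors in this sketch are zero, and only the withheld testing points indicate how the surface performs away from the input data. Low statistics alone would still not establish that a surface makes geologic sense, so visual inspection remains part of the selection.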
2020
1.0E-5
1.0E-5
Decimal degrees
D_North_American_1983
NAD_1983
6378137.0
298.257222101
North American Vertical Datum of 1988
0.01
meter
Explicit elevation coordinate included with horizontal coordinates
Bedrock_Valley_interpolation
This spreadsheet contains interpolation testing statistics for the Ellicottville bedrock valley interpolation
The authors of this dataset conceived these values
model
This column contains the name of the interpolation model being tested. Models tested include: Topo to Raster, Natural Neighbors, multiple Kriging methods, and multiple Bayesian Kriging methods.
Producer defined
This column contains the name of the interpolation model being tested. Models tested include Topo to Raster, Natural Neighbors, multiple Kriging methods, and multiple Bayesian Kriging methods.
Mean Err - training pts
This column contains the mean deviation (absolute difference) from values associated with each of the individual training points (the 90% of points used in the generation of the test interpolation).
Producer defined
0.643032578
1.531578944
meters
Median Err - training pts
This column contains the median deviation (absolute difference) from values associated with each of the individual training points (the 90% of points used in the generation of the test interpolation).
Producer defined
0.039855957
0.633239746
meters
Mean Err - testing pts
This column contains the mean deviation (absolute difference) from values associated with each of the individual testing points (the 10% of points intentionally left out of the interpolation for testing purposes).
Producer defined
1.315269333
2.335140863
meters
Median Err - testing pts
This column contains the median deviation (absolute difference) from values associated with each of the individual testing points (the 10% of points intentionally left out of the interpolation for testing purposes).
Producer defined
0.470465041
1.106444502
meters
Notes
Assorted comments pertaining to the individual rows. Usually indicates which interpolation method was chosen for the final product.
Producer defined
Assorted comments pertaining to the individual rows. Usually indicates which interpolation method was chosen for the final product.
Bedrock_uplands_interpolation
This spreadsheet contains interpolation testing statistics for the Ellicottville bedrock upland interpolation
The authors of this dataset conceived these values
model
This column contains the name of the interpolation model being tested. Models tested include: Topo to Raster, Natural Neighbors, multiple Kriging methods, and multiple Bayesian Kriging methods.
Producer defined
This column contains the name of the interpolation model being tested. Models tested include Topo to Raster, Natural Neighbors, multiple Kriging methods, and multiple Bayesian Kriging methods.
Mean Err - training pts
This column contains the mean deviation (absolute difference) from values associated with each of the individual training points (the 90% of points used in the generation of the test interpolation).
Producer defined
0.019404373
0.107720314
meters
Median Err - training pts
This column contains the median deviation (absolute difference) from values associated with each of the individual training points (the 90% of points used in the generation of the test interpolation).
Producer defined
9.54E-08
0.052231216
meters
Mean Err - testing pts
This column contains the mean deviation (absolute difference) from values associated with each of the individual testing points (the 10% of points intentionally left out of the interpolation for testing purposes).
Producer defined
0.130140937
0.189120719
meters
Median Err - testing pts
This column contains the median deviation (absolute difference) from values associated with each of the individual testing points (the 10% of points intentionally left out of the interpolation for testing purposes).
Producer defined
9.54E-08
0.052293801
meters
Notes
Assorted comments pertaining to the individual rows. Usually indicates which interpolation method was chosen for the final product.
Producer defined
Assorted comments pertaining to the individual rows. Usually indicates which interpolation method was chosen for the final product.
ScienceBase
U.S. Geological Survey
mailing and physical
Denver Federal Center, Building 810, Mail Stop 302
Denver
CO
80225
USA
1-888-275-8747
sciencebase@usgs.gov
Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty. Not for navigational use. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
20240514
Josh Woda
U.S. Geological Survey
Hydrologist
mailing and physical
425 Jordan Road
Troy
New York
12180
The United States
518-285-5606
jwoda@USGS.gov
Content Standard for Digital Geospatial Metadata
FGDC-STD-001-1998