USGS Data Source

ISO 19115 Topic Category

biota

Other Subject Keywords

Pollock’s robust design, simulated dataset, multi-scale occupancy model, temporary emigration, site-occupancy model, imperfect detection probability

Place Keywords

MA, United States, Simulated dataset

Data for: Ignoring species availability biases occupancy estimates in single-scale occupancy models

We simulated over 28,000 datasets and saved their model outputs to answer the following three questions: (1) what is an adequate sampling design for the multi-scale occupancy model when there are a priori expectations of parameter estimates?, (2) what is an adequate sampling design when we have no expectations of parameter estimates?, and (3) what is the cost to occupancy estimates (in terms of bias, accuracy, precision, and coverage) if availability is not accounted for? Specifically, we simulated data under four scenarios: Scenario 1 (n = 10,000): species availability is constant across sites (but less than one); Scenario 2 (n = 9,358): species availability is heterogeneous across sites; Scenario 3 (n = 2,815): species availability is heterogeneous across years; and Scenario 4 (n = 5,942): species availability is correlated with detection probability. For each scenario except the first, we analyzed the data using four different estimators: (i) a constant multi-scale occupancy model, (ii) a multi-scale occupancy model with a random-effects term in the availability part of the model, (iii) a constant single-scale occupancy model, and (iv) a single-scale occupancy model with a random-effects term in the detection part of the model. The formulation of the random-effects terms in the models mirrored the way the data were simulated (e.g., if species availability was heterogeneous across sites, then a site random-effects term was included in the models). The first scenario was analyzed using models (i) and (iii) only. For simplicity, we refer to models (i) and (iii) as ‘constant’ or ‘fixed-effects’ models, and to models (ii) and (iv) as ‘random-effects’ models.
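The four performance measures named in question (3) are not defined in this record. As a rough illustration only, the sketch below uses definitions that are common in simulation studies: bias as the mean error, accuracy as root-mean-square error, precision as the standard deviation of the errors, and coverage as the share of interval estimates that contain the true value. The function name, argument names, and these definitions are assumptions made for illustration and may differ from what the authors computed.

```python
import math
import statistics


def summarize_estimator(truths, estimates, lowers, uppers):
    """Summarize estimator performance across simulated datasets.

    Hypothetical helper: each argument is a sequence with one entry per
    simulated dataset (true value, point estimate, interval bounds).
    """
    if not (len(truths) == len(estimates) == len(lowers) == len(uppers)):
        raise ValueError("all inputs must have the same length")
    if len(truths) < 2:
        raise ValueError("need at least two simulated datasets")

    errors = [est - tru for est, tru in zip(estimates, truths)]

    # Bias: average signed difference between estimate and truth.
    bias = statistics.fmean(errors)
    # Accuracy (assumed definition): root-mean-square error.
    rmse = math.sqrt(statistics.fmean([err ** 2 for err in errors]))
    # Precision (assumed definition): sample SD of the errors.
    precision_sd = statistics.stdev(errors)
    # Coverage: fraction of intervals that contain the true value.
    covered = [
        1.0 if lo <= tru <= hi else 0.0
        for lo, tru, hi in zip(lowers, truths, uppers)
    ]
    coverage = statistics.fmean(covered)

    return {
        "bias": bias,
        "rmse": rmse,
        "precision_sd": precision_sd,
        "coverage": coverage,
    }


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    result = summarize_estimator(
        truths=[0.5, 0.5, 0.5, 0.5],
        estimates=[0.4, 0.6, 0.5, 0.7],
        lowers=[0.3, 0.45, 0.35, 0.55],
        uppers=[0.55, 0.75, 0.65, 0.85],
    )
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Applying one such summary to a single-scale model and to a multi-scale model fitted to the same simulated datasets would give a side-by-side view of the cost of ignoring availability.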
Summaries of the simulated data and model estimates are located in four folders, each corresponding to a different simulated scenario:
Scenario 1 (n = 10,000): folder ModelOutput_Scen1_TwolevelSim; csv files holding data are named Results_TwoLevelAvail_2lev_x.csv
Scenario 2 (n = 9,358): folder ModelOutput_Scen2_HeteroSite; csv files holding data are named Results_TwoLevelAvail_Hetero_x.csv
Scenario 3 (n = 2,815): folder ModelOutput_Scen3_HeteroYear; csv files holding data are named Results_TwoLevelAvail_HeteroSeason_x.csv
Scenario 4 (n = 5,942): folder ModelOutput_Scen4_Cor; csv files holding data are named Results_TwoLevelAvail_Cor_x.csv
Each row in each csv file describes a different simulated dataset and includes its sampling design, true parameter values, and model estimates. The other files in each folder hold the entire model output (.rda files), the time taken for the model run to complete (time_..csv), and an indicator of whether the model run finished (nsim...csv). For more information on those files, we point the user to the code that generated them:
Scenario 1 (n = 10,000): Scen1_Constant.R
Scenario 2 (n = 9,358): Scen2_HeteroSite.R
Scenario 3 (n = 2,815): Scen3_HeteroYear.R
Scenario 4 (n = 5,942): Scen4_Corr.R
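As a quick sanity check after downloading, the row counts of the results files can be compared with the stated number of simulated datasets per scenario. The sketch below is a minimal example under several assumptions that this record does not confirm: that the trailing x in each file name is a placeholder for a file index (matched here with a wildcard), that each csv file has a single header row, and that the four folders sit directly under one local directory. The function name count_rows and the root path are hypothetical.

```python
import csv
from pathlib import Path

# Folder name -> (file-name pattern, stated number of simulated datasets).
# The "*" stands in for the "x" in the documented file names (an assumption).
SCENARIOS = {
    "ModelOutput_Scen1_TwolevelSim": ("Results_TwoLevelAvail_2lev_*.csv", 10000),
    "ModelOutput_Scen2_HeteroSite": ("Results_TwoLevelAvail_Hetero_*.csv", 9358),
    "ModelOutput_Scen3_HeteroYear": ("Results_TwoLevelAvail_HeteroSeason_*.csv", 2815),
    "ModelOutput_Scen4_Cor": ("Results_TwoLevelAvail_Cor_*.csv", 5942),
}


def count_rows(root):
    """Return {folder: (rows_found, rows_stated)} for each scenario folder."""
    counts = {}
    for folder, (pattern, stated) in SCENARIOS.items():
        folder_path = Path(root) / folder
        rows = 0
        if folder_path.is_dir():
            for csv_path in sorted(folder_path.glob(pattern)):
                with csv_path.open(newline="") as handle:
                    reader = csv.reader(handle)
                    next(reader, None)  # skip the header row (assumed present)
                    rows += sum(1 for _ in reader)
        counts[folder] = (rows, stated)
    return counts


if __name__ == "__main__":
    # "data_release" is a hypothetical local path to the unpacked download.
    for folder, (found, stated) in count_rows("data_release").items():
        print(f"{folder}: {found} rows found, {stated} stated")
```

A mismatch between the two numbers does not necessarily indicate a problem with the data; it may only mean that one of the assumptions above does not hold for the actual file layout.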

Get Data and Metadata

Author(s)	Graziella V. DiRenzo, David A. W., Evan H. C.
Publication Date	2022-01-24
Beginning Date of Data	2020-06-08
Ending Date of Data	2020-06-08
Data Contact	Graziella DiRenzo
DOI	This item doesn't have a registered DOI.
Citation	Check repository for data citation.
Metadata Contact	Graziella DiRenzo
Metadata Date	2021-06-16
Related Publication	There was no related primary publication associated with this data release.
Citations of these data	No citations of these data are known at this time.
Access	public
License	http://www.usa.gov/publicdomain/label/1.0/

Harvest Source: Individual Metadata Uploader
Harvest Date: 2025-02-13T05:04:31.305Z