<?xml version='1.0' encoding='UTF-8'?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Richard A Erickson</origin>
        <pubdate>20191003</pubdate>
        <title>Simulated eDNA Occurrence data and Stan summaries of data</title>
        <geoform>tabular digital data</geoform>
        <pubinfo>
          <pubplace>Online</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <onlink>https://doi.org/10.5066/P9WRFUDQ</onlink>
        <lworkcit>
          <citeinfo>
            <origin>Richard A Erickson</origin>
            <origin>Christopher M. Merkes</origin>
            <origin>Erica L. Mize</origin>
            <pubdate>20190901</pubdate>
            <title>Sampling designs for landscape-level eDNA monitoring programs using three-level occurrence models</title>
            <geoform>publication</geoform>
            <pubinfo>
              <pubplace>Integrated Environmental Assessment and Management</pubplace>
              <publish>Wiley</publish>
            </pubinfo>
            <onlink>https://doi.org/10.1002/ieam.4155</onlink>
          </citeinfo>
        </lworkcit>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Resource managers conduct landscape-level monitoring using environmental DNA (eDNA). These managers must contend with imperfect detection in samples and subsamples (i.e., molecular analyses). This imperfect detection affects their ability both to detect species and to estimate occurrence. Although occurrence (synonymously, occupancy) models can estimate these probabilities, most models and guidance for their application do not consider three levels. This simulated dataset assumes sites are occupied (probability psi = 1, Z = 1) and simulates sample (probability theta, A = 0, 1) and subsample (probability p, Y = 0, 1) occurrence probabilities and detections (1)/non-detections (0).</abstract>
      <purpose>These data were simulated to evaluate the ability of a statistical model to recover known parameter values.</purpose>
      <supplinf>Simulated data</supplinf>
    </descript>
    <timeperd>
      <timeinfo>
        <sngdate>
          <caldate>20171219</caldate>
        </sngdate>
      </timeinfo>
      <current>See Supplemental Info</current>
    </timeperd>
    <status>
      <progress>Complete</progress>
      <update>None planned</update>
    </status>
    <spdom>
      <bounding>
        <westbc>-180.0</westbc>
        <eastbc>180.0</eastbc>
        <northbc>90.0</northbc>
        <southbc>-90.0</southbc>
      </bounding>
      <descgeog>World</descgeog>
    </spdom>
    <keywords>
      <theme>
        <themekt>ISO 19115 Topic Category</themekt>
        <themekey>biota</themekey>
      </theme>
      <theme>
        <themekt>Marine Realms Information Bank (MRIB) keywords</themekt>
        <themekey>numerical modeling</themekey>
      </theme>
      <theme>
        <themekt>USGS Thesaurus</themekt>
        <themekey>environmental DNA</themekey>
      </theme>
      <theme>
        <themekt>USGS Metadata Identifier</themekt>
        <themekey>USGS:5d96239be4b0c4f70d110ebe</themekey>
      </theme>
    </keywords>
    <accconst>None.  Please see 'Distribution Info' for details.</accconst>
    <useconst>None.  Users are advised to read the data set's metadata thoroughly to understand appropriate use and data limitations.</useconst>
    <ptcontac>
      <cntinfo>
        <cntperp>
          <cntper>Richard A Erickson</cntper>
          <cntorg>U.S. Geological Survey, Midwest Region</cntorg>
        </cntperp>
        <cntpos>Fish Biologist</cntpos>
        <cntaddr>
          <addrtype>mailing address</addrtype>
          <address>2630 Fanta Reed Road</address>
          <city>La Crosse</city>
          <state>WI</state>
          <postal>54603</postal>
          <country>United States</country>
        </cntaddr>
        <cntvoice>608-781-6353</cntvoice>
        <cntemail>rerickson@usgs.gov</cntemail>
      </cntinfo>
    </ptcontac>
    <datacred>USFWS</datacred>
  </idinfo>
  <dataqual>
    <attracc>
      <attraccr>The code that produced the output files is undergoing release as a separate product.</attraccr>
    </attracc>
    <logic>No formal logical accuracy tests were conducted.</logic>
    <complete>Data set is considered complete for the information presented, as described in the abstract.  Users are advised to read the rest of the metadata record carefully for additional details.</complete>
    <posacc>
      <horizpa>
        <horizpar>A formal accuracy assessment of the horizontal positional information in the data set has not been conducted.</horizpar>
      </horizpa>
      <vertacc>
        <vertaccr>A formal accuracy assessment of the vertical positional information in the data set has either not been conducted, or is not applicable.</vertaccr>
      </vertacc>
    </posacc>
    <lineage>
      <procstep>
        <procdesc>These data were simulated in R using the following equations:
A_{i,j} | Z_i ~ Bernoulli(Z_i * theta_{i,j})
for sample detections and
Y_{i,j,k} | A_{i,j} ~ Bernoulli(A_{i,j} * p_{i,j})
for subsample detections. A, Z, Y, theta, and p are defined in the abstract; i indexes the site, j indexes the sample, and k indexes the subsample. In total, 800 parameter combinations were used, with 100 realizations simulated for each combination, for a total of 800 simulated data sets, one per parameter combination. The specific parameter combinations used are listed in parameterValues.csv.</procdesc>
        <procdate>20171219</procdate>
      </procstep>
    </lineage>
  </dataqual>
  <eainfo>
    <detailed>
      <enttyp>
        <enttypl>stan Summary data CSVs</enttypl>
        <enttypd>Comma Separated Value (CSV) file containing data. File names are numbered from 0 (following a programming language with zero-based indexing), whereas the index values are numbered from 1 (following a language with one-based indexing). Each parameter (p, theta, psi) has a “recovered” point estimate, Monte Carlo standard error, and lower and upper bounds of a 95% credibility interval. NA values indicate that the model could not be fit to a simulated dataset because the simulated data contained no detections.</enttypd>
        <enttypds>Producer defined</enttypds>
      </enttyp>
      <attr>
        <attrlabl>ParameterIndex</attrlabl>
        <attrdef>Index of the parameter combination used for the file; it equals the number in the file name plus 1. File names are numbered from 0 (following a programming language with zero-based indexing), whereas the index values are numbered from 1 (following a language with one-based indexing).</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <edom>
            <edomv>1</edomv>
            <edomvd/>
            <edomvds>Producer defined</edomvds>
          </edom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>pRecovered</attrlabl>
        <attrdef>Recovered point estimate for parameter p.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>thetaRecovered</attrlabl>
        <attrdef>Recovered point estimate for parameter theta.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>psiRecovered</attrlabl>
        <attrdef>Recovered point estimate for parameter psi.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>pRecoveredSE</attrlabl>
        <attrdef>Monte Carlo standard error for parameter p.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>thetaRecoveredSE</attrlabl>
        <attrdef>Monte Carlo standard error for parameter theta.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>psiRecoveredSE</attrlabl>
        <attrdef>Monte Carlo standard error for parameter psi.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>pRecoveredLower</attrlabl>
        <attrdef>Lower bound of the 95% credibility interval for p.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>thetaRecoveredLower</attrlabl>
        <attrdef>Lower bound of the 95% credibility interval for theta.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>psiRecoveredLower</attrlabl>
        <attrdef>Lower bound of the 95% credibility interval for psi.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>pRecoveredUpper</attrlabl>
        <attrdef>Upper bound of the 95% credibility interval for p.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>thetaRecoveredUpper</attrlabl>
        <attrdef>Upper bound of the 95% credibility interval for theta.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>psiRecoveredUpper</attrlabl>
        <attrdef>Upper bound of the 95% credibility interval for psi.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>0</rdommin>
            <rdommax>1</rdommax>
          </rdom>
        </attrdomv>
      </attr>
    </detailed>
    <detailed>
      <enttyp>
        <enttypl>simulated Data csv</enttypl>
        <enttypd>Comma Separated Value (CSV) file containing data. File names are numbered from 0 (following a programming language with zero-based indexing), whereas the index values are numbered from 1 (following a language with one-based indexing).</enttypd>
        <enttypds>Producer defined</enttypds>
      </enttyp>
      <attr>
        <attrlabl>parameterIndex</attrlabl>
        <attrdef>Index of the parameter combination used for the file; it equals the number in the file name plus 1. File names are numbered from 0 (following a programming language with zero-based indexing), whereas the index values are numbered from 1 (following a language with one-based indexing).</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <edom>
            <edomv>1</edomv>
            <edomvd/>
            <edomvds>Producer defined</edomvds>
          </edom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>Zindex</attrlabl>
        <attrdef>Zindex is an index for sites. Currently, this is fixed at 1, but it could change if more sites were simulated.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <edom>
            <edomv>1</edomv>
            <edomvd/>
            <edomvds>Producer defined</edomvds>
          </edom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>Aindex</attrlabl>
        <attrdef>Aindex is the sample index within each site. If there were more than one site in the simulated dataset, this value would not be unique.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>1</rdommin>
            <rdommax>5</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>Yindex</attrlabl>
        <attrdef>Yindex is the observation index and is unique for each row.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <rdom>
            <rdommin>1</rdommin>
            <rdommax>5</rdommax>
          </rdom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>Y_*</attrlabl>
        <attrdef>Simulated Y value for replicate *.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <udom>Simulated Y value for replicate *. Zeros are non-detects and ones are detects.</udom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>A_*</attrlabl>
        <attrdef>Simulated A value for replicate *.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <udom>Simulated A value for replicate *. Zeros are non-detects and ones are detects.</udom>
        </attrdomv>
      </attr>
      <attr>
        <attrlabl>Z_*</attrlabl>
        <attrdef>Simulated Z value for replicate *.</attrdef>
        <attrdefs>Producer defined</attrdefs>
        <attrdomv>
          <udom>Simulated Z value for replicate *. Zeros are non-detects and ones are detects.</udom>
        </attrdomv>
      </attr>
    </detailed>
  </eainfo>
  <distinfo>
    <distrib>
      <cntinfo>
        <cntorgp>
          <cntorg>U.S. Geological Survey - ScienceBase</cntorg>
        </cntorgp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>Denver Federal Center, Building 810, Mail Stop 302</address>
          <city>Denver</city>
          <state>CO</state>
          <postal>80225</postal>
          <country>USA</country>
        </cntaddr>
        <cntvoice>1-888-275-8747</cntvoice>
        <cntemail>sciencebase@usgs.gov</cntemail>
      </cntinfo>
    </distrib>
    <distliab>Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any such warranty.</distliab>
  </distinfo>
  <metainfo>
    <metd>20210523</metd>
    <metc>
      <cntinfo>
        <cntperp>
          <cntper>Richard A Erickson</cntper>
          <cntorg>U.S. Geological Survey, Midwest Region</cntorg>
        </cntperp>
        <cntpos>Fish Biologist</cntpos>
        <cntaddr>
          <addrtype>mailing address</addrtype>
          <address>2630 Fanta Reed Road</address>
          <city>La Crosse</city>
          <state>WI</state>
          <postal>54603</postal>
          <country>United States</country>
        </cntaddr>
        <cntvoice>608-781-6353</cntvoice>
        <cntemail>rerickson@usgs.gov</cntemail>
      </cntinfo>
    </metc>
    <metstdn>FGDC Biological Data Profile of the CSDGM</metstdn>
    <metstdv>FGDC-STD-001.1-1999</metstdv>
    <metuc>Record created using USGS Metadata Wizard tool. (https://github.com/usgs/fort-pymdwizard)</metuc>
  </metainfo>
</metadata>
