DNA Sequencing of Selected Bacterial Growths in Samples from the Madera/Chowchilla-Kings Domestic Aquifer Study unit, 2014

ANDREW L. SOLDAVINI CARMEN A. BURTON CHRISTINE J. LAWRENCE 20210622 DNA Sequencing of Selected Bacterial Growths in Samples from the Madera/Chowchilla-Kings Domestic Aquifer Study unit, 2014 csv spreadsheet, text file https://doi.org/10.5066/P9X7JH11

Jennifer L. Shelton Miranda S. Fram 2017 Groundwater-quality data for the Madera/Chowchilla–Kings shallow aquifer study unit, 2013–14: Results from the California GAMA Program publication Data Series 1019 n/a US Geological Survey https://doi.org/10.3133/ds1019

These data describe microbiological analyses of groundwater samples collected from 42 domestic drinking-water supply wells in the Central Valley of California. Samples were collected between January 2014 and April 2014 for the Groundwater Ambient Monitoring and Assessment (GAMA) program priority basin assessment of the shallow aquifers of the Madera, Chowchilla, and Kings (MACK) groundwater sub-basins. A total of 75 wells were sampled for the MACK study unit between August 2013 and April 2014. Samples for this dataset were vacuum filtered and plated on MI and mEI agars prior to incubation to promote colony growth. Colonies were tallied by their species into columns for various fecal indicator bacteria (FIBs): total coliforms (TCs), Escherichia coli (E. coli), and enterococci. Non-target growths were also counted and tallied. Six additional replicate samples were collected for quality assurance. Of the 579 total FIB colonies detected, 106 were selected for polymerase chain reaction (PCR) analysis with the goal of sequencing their DNA. Selected colonies consisted of both target and non-target growths and were taken from 14 samples collected at 13 different wells. DNA sequencing was successful for 34 of the 59 colonies submitted. Results for these analyses were reported in FASTA format, with the number of bases and their starting position indicated for each batch.
The purpose of this data release is to distribute supplementary microbiological data collected concurrently with water-quality data for the Groundwater Ambient Monitoring and Assessment Program Priority Basin Project Madera/Chowchilla–Kings (MACK) Shallow Aquifer Study Unit of 2014. Primary data for the MACK study unit included a broad array of inorganic and organic constituents and age-dating and geochemical tracers; these were released in Data Series 1019 (Shelton and Fram, 2017), which is cross-referenced to this dataset. This document provides the metadata necessary to understand the results included in a comma-delimited table (csv file). DNA sequences were produced by the methods listed in the process steps in the data quality section of this document.

20140128 20140409 ground condition Complete None planned -120.6000 -119.1000 37.2700 36.2200 San Joaquin Valley, CA

ISO 19115 Topic Category biota environment geoscientificInformation USGS Thesaurus biochemistry environmental DNA domestic well water use drinking water use groundwater water quality water sampling Groundwater Ambient Monitoring and Assessment Program GAMA USGS Metadata Identifier USGS:5e85e5ebe4b01d50927ea4f4 None none Common geographic areas California San Joaquin Madera Tulare Merced Fresno Kings

None. Please see 'Distribution Info' for details. None. Users are advised to read the data set's metadata thoroughly to understand appropriate use and data limitations.

Andrew L Soldavini U.S. Geological Survey, SOUTHWEST REGION Hydrologist mailing address

Ste 200

San Diego CA 92101-0821 US 619-225-6139 619-225-6101 asoldavini@usgs.gov

This study was done in cooperation with, and with funding from, the California State Water Resources Control Board. We especially thank the well owners who volunteered or allowed the U.S. Geological Survey to collect samples from their wells and be a part of this study, and our field crews, who worked hard to acquire these data.

Jennifer L. Shelton Miranda S. Fram 2017 Groundwater-quality data for the Madera/Chowchilla–Kings shallow aquifer study unit, 2013–14: Results from the California GAMA Program publication n/a US Geological Survey https://doi.org/10.3133/ds1019

CARMEN A. BURTON CHRISTINE J. LAWRENCE 2020 Identification of bacteria in groundwater used for domestic supply in the southeast San Joaquin Valley, California publication

No formal attribute accuracy tests were conducted. No formal logical accuracy tests were conducted. Data set is considered complete for the information presented, as described in the abstract. Users are advised to read the rest of the metadata record carefully for additional details. No formal positional accuracy tests were conducted. No formal positional accuracy tests were conducted.

F. Sanger S. Nicklen A. R. Coulson 19771201 DNA sequencing with chain-terminating inhibitors publication Proceedings of the National Academy of Sciences vol. 74, issue 12 n/a Proceedings of the National Academy of Sciences pp. 5463-5467 https://doi.org/10.1073/pnas.74.12.5463 Digital and/or Hardcopy 1977 ground condition Sanger et al. 1977 Description of methods used to produce DNA sequence data.

F. Sanger A.R. Coulson 197505 A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase publication Journal of Molecular Biology vol. 94, issue 3 n/a Elsevier BV pp. 441-448 https://doi.org/10.1016/0022-2836(75)90213-2 Digital and/or Hardcopy 1975 ground condition Sanger & Coulson 1975 Description of methods used to produce DNA sequence data.

CARMEN A.
BURTON CHRISTINE J. LAWRENCE 2020 Burton, C.A. and Lawrence, C.J., 2020 publication Digital and/or Hardcopy 2020 ground condition Burton, C.A. and Lawrence, C.J., 2020 All processes and methods of producing this dataset are described in this paper.

Jennifer L. Shelton Miranda S. Fram 2017 Groundwater-quality data for the Madera/Chowchilla–Kings shallow aquifer study unit, 2013–14: Results from the California GAMA Program publication n/a US Geological Survey https://doi.org/10.3133/ds1019 Digital and/or Hardcopy 20130822 20140410 ground condition Shelton, J.L., and Fram, M.S., 2017 Sampling methods, quality assurance for GAMA studies, and source of data collected during the same study.

A sterile pipette was used to take cells from target and non-target colonies that grew on MI agar plates, and the cells were added to PCR tubes containing 25 µL of PCR mix. A total of 106 colonies were sampled from 13 agar plates with colony growth. Multiple colonies were sampled from a single plate, but in the sequencing data given in this data release these are not considered replicates; they were pipetted from individual colonies that grew on the same agar plate. Replicate samples were collected in the field and processed on a separate agar plate. Colonies sampled from these agar plates are labelled as replicates. The Fecal Indicator Bacteria csv file in the other child item of this data release shows the sites at which replicate samples were collected for this data set. The amplified region is part of the 16S rRNA gene, which encodes the RNA component of the 30S small subunit of prokaryotic ribosomes; amplification used the 8F forward universal primer (5’-AGAGTTTGATCCTGGCTCAG-3’) and the 787R reverse universal primer (5’-CGACTACCAGGGTATCTAAT-3’). 2014

A Bio-Rad T100 Thermocycler was used to perform amplifications of the DNA by a multitiered process. Samples were first run through a denaturing process for 3 minutes at 95°C, then 30 cycles of 30 seconds at 94°C.
This was followed by an annealing step of 45 seconds at 55.2°C and lastly 1 minute at 72°C, with an extension of 10 minutes at 72°C. 2014

Lonza FlashGel was used to determine whether the amplification was successful. A distinct smudge or band was the indicator of success. Samples showing this were then purified with the UltraClean PCR Clean-Up Kit from Mo Bio Laboratories. 2014

Purified samples were then sent to GeneWiz Laboratories for sequencing by the Sanger method (Sanger & Coulson 1975; Sanger et al. 1977). The sequencing results were delivered in FASTA format and are given here in individual text files. Samples were sent in three shipments, and results were reported from GeneWiz Laboratories as separate batches. Batch numbers are indicated in the txt files. 2014

Quality assurance for this data set included positive and negative controls cultured on each lot of medium used; field, filter, and procedure blanks; and field replicates. Results and analysis of these tests can be found in the "Analysis of total coliforms, E. coli, and enterococci" section of Burton, C.A. and Lawrence, C.J., 2020. The positive controls for MI agar plates were E. coli and Klebsiella pneumoniae, and the positive control for mEI agar plates was Enterococcus faecalis. The negative control for MI agar plates was Pseudomonas aeruginosa, and the negative controls for mEI agar plates were E. coli and Streptococcus bovis. Out of eight control sets, all positive controls had colony growth and all negative controls had no colony growth. Six field blanks were collected using sterile phosphate buffer. Filter and procedure blanks were also performed at appropriate steps in the sample procedure, each by passing 50 mL of sterile phosphate buffer through a 0.45 µm gridded membrane filter: filter blanks were run before samples were plated, and procedure blanks were run between each division and plating of a sample on either MI or mEI agar. All study blanks resulted in no colony growth.
Six field replicate samples were collected, and successful sequences are indicated in the included csv file. More information on sampling methods and quality assurance sampling can be found in the initial MACK data release cited in this metadata file (Shelton, J.L., and Fram, M.S., 2017). 2014

The DNA sequencing csv file contains a list of domestic sites sampled for FIBs. The file is comma-delimited and can be read as a row-and-column table. Please see the "data dictionary" for column header descriptions. See data dictionary for header descriptions for included csv files.

GS ScienceBase U.S. Geological Survey mailing address

Denver Federal Center, Building 810, Mail Stop 302

Denver CO 80225 United States 1-888-275-8747 sciencebase@usgs.gov Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any such warranty. Digital Data https://doi.org/10.5066/xxxxxxxx None 20210622 Andrew L Soldavini U.S. Geological Survey, SOUTHWEST REGION Hydrologist mailing address

4165 Spruance Road

San Diego CA 92101-0821 US 619-225-6139 619-225-6101 asoldavini@usgs.gov FGDC Biological Data Profile of the Content Standard for Digital Geospatial Metadata FGDC-STD-001.1-1999
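
The process steps in this record state that sequencing results were delivered in FASTA format and are provided as individual text files. The following is a minimal sketch of how such a file might be read into memory with the Python standard library. It assumes only the general FASTA convention (a header line beginning with ">" followed by one or more lines of bases). The function name parse_fasta, the example headers, and the example bases are hypothetical and are not taken from the released files; the actual header layout of the batch files should be checked against the data dictionary.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a {header: sequence} mapping for FASTA-formatted text lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new record starts; the header is the text after ">".
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate FASTA header: {header!r}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header line")
        else:
            # Sequence lines may be wrapped; join them and normalize case.
            records[header] += line.upper()
    return records


# Hypothetical two-record example; headers and bases are invented for
# illustration and do not come from the data release.
example = """\
>colony_A_batch1
ACGTACGTAC
GGTTN
>colony_B_batch1
ttgacgta
"""

sequences = parse_fasta(example.splitlines())
for name, seq in sequences.items():
    # Report each record's header and its number of bases.
    print(name, len(seq))
```

To read one of the released text files instead of the inline example, the same function can be given an open file handle, for example `with open(path) as handle: sequences = parse_fasta(handle)`, where `path` is whichever batch file is of interest.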