Researchdata.se

DNA-based monitoring of bacterial and protist diversity in the Baltic Sea

https://doi.org/10.17044/SCILIFELAB.28673273

Here we share the code, the sequencing processing output, and the intermediate data files for the work on bacterial and protist diversity patterns in the Baltic Sea area, based on 16S and 18S metabarcoding as implemented two times for a year alongside the Swedish coastline monitoring programme.

This work is available as a preprint: Distinct bacterial and protist plankton diversity dynamics uncovered through DNA-based monitoring in the Baltic Sea area, Krzysztof T Jurdzinski, Meike AC Latz, Anders Torstensson, Sonia Brugel, Mikael Hedblom, Yue O O Hu, Markus Lindh, Agneta Andersson, Bengt Karlson, Anders F Andersson, bioRxiv 2024.08.14.607742; doi: https://doi.org/10.1101/2024.08.14.607742

Documentation files:
README.md - description of the files, including all the files within the zipped folders.
environment.yml - conda environment with the software/packages needed to run all the included scripts.
workflow.sh - a bash script defining the workflow.

Zipped folders with data processing documentation and intermediate files:
ampliseq_16S.zip - this directory includes the scripts used to run the nf-core/ampliseq (https://nf-co.re/ampliseq/2.7.0/) pipeline on the V3-V4 16S metabarcoding samples, as well as the output files needed for downstream analysis.
ampliseq_18S.zip - same as ampliseq_16S.zip, but for the V4 18S metabarcoding.
taxa_reannotation.zip - each subdirectory contains the results of taxonomic re-annotation of the metabarcoding results and the scripts used to obtain them. Both the 2015-2017 and the 2019-2020 datasets were re-annotated with the GTDB corrected for mislabeled sequences using SATIVA (16S) and with PR2 version 5.0.0 (18S). Both 16S datasets were also re-annotated using the SILVA database (version 138.1).
data_2015_2017.zip - these files correspond to the data for the samples from 2015 to 2017 (plus a storage test for some 2019 samples). This is new data, merged with the 2019-2020 dataset later in the pipeline.
merged_data.zip - this folder contains data merged across the 2015-2017 and the 2019-2020 datasets, based on the files from the folders data_2015_2017 and data_2019_2020.
GSHHG.zip - Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG) version 2.3.7 file needed to plot maps, as downloaded from the NOAA website (https://www.ngdc.noaa.gov/mgg/shorelines/gshhs.html).
Herlemann_et_al_2016.zip - data from the transect-based study by Herlemann et al., 2016 (https://doi.org/10.3389/fmicb.2016.01883).
read_downsampling.zip - this folder includes the scripts used to rarefy raw reads and the key output files. It is all based on 16S data.

Zipped folders with key R scripts:
processing_code.zip - R scripts used for multiple steps of intermediate data table processing.
analysis_figures_code.zip - R scripts used to analyze the data and generate the figures.
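Since most of the material is distributed as zipped folders, it can help to inspect the archives before extracting them. The following is a minimal sketch, not part of the published dataset: it assumes the archives have been downloaded into a local directory (the name "downloads" is hypothetical), and the helper names list_archive and summarize_downloads are illustrative only. README.md remains the authoritative description of the contents.

```python
import zipfile
from pathlib import Path

# Archive names as listed in the dataset description.
ARCHIVES = [
    "ampliseq_16S.zip",
    "ampliseq_18S.zip",
    "taxa_reannotation.zip",
    "data_2015_2017.zip",
    "merged_data.zip",
    "GSHHG.zip",
    "Herlemann_et_al_2016.zip",
    "read_downsampling.zip",
    "processing_code.zip",
    "analysis_figures_code.zip",
]


def list_archive(path):
    """Return the file names stored in one zip archive, without extracting it."""
    with zipfile.ZipFile(path) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


def summarize_downloads(download_dir):
    """Map each listed archive found in download_dir to its file names.

    Archives that have not been downloaded are skipped silently.
    """
    root = Path(download_dir)
    summary = {}
    for name in ARCHIVES:
        path = root / name
        if path.is_file():
            summary[name] = list_archive(path)
    return summary


if __name__ == "__main__":
    # "downloads" is a hypothetical local directory holding the zip files.
    for name, files in summarize_downloads("downloads").items():
        print(f"{name}: {len(files)} files")
```

Listing the contents first makes it easier to cross-check each archive against README.md before running any of the included scripts.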

scilifelab
Kungliga tekniska högskolan