Researchdata.se

Metagenomic NrdJm5 sequences placed in full phylogeny

https://doi.org/10.17045/STHLMUNI.7642343
To search for sequences from metagenomics projects, we downloaded all TARA Ocean ORFs (Eren 2017, https://doi.org/10.6084/m9.figshare.4902917.v1; Delmont 2018, https://doi.org/10.1038/s41564-018-0176-9), all ORFs from the Human Microbiome Project (2019-01-09; HMP 2012a, https://doi.org/10.1038/nature11234; HMP 2012b, https://doi.org/10.1038/nature11209), and the majority of bacterial MAGs and SAGs from IMG/MER (4910 MAGs, 2230 SAGs), plus 53 aquatic and soil metagenomes, in particular those with project names containing "virus", "phage", "therm" or "hot" (see img_sags.tsv, img_metag_samples.tsv and img_mags.tsv) (Markowitz 2008, https://doi.org/10.1093/nar/gkm869). In total, we downloaded 250,881,638 ORFs. We searched these sequences with HMM profiles designed for each clan in the phylogeny and found 181 sequences whose best match was the profile designed from the TV clan. These 181 sequences were aligned to the original alignment using Clustal Omega in profile mode (all.NrdJm5.co.profile.wa.masked.alnfaa; Sievers 2014, https://doi.org/10.1038/msb.2011.75) and placed phylogenetically in the phylogeny from https://doi.org/10.17045/sthlmuni.7117430.v2 with RAxML (Stamatakis 2014, https://doi.org/10.1093/bioinformatics/btu033). The resulting tree can be viewed with Dendroscope (Huson et al. 2007, https://doi.org/10.1186/1471-2105-8-460); placed sequences have "QUERY" prepended to their names.
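The "best match" selection described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tuple layout (sequence id, profile name, bit score), the function name best_profile_per_sequence, and the example hits are all hypothetical, and the sketch assumes that the search results have already been parsed into such tuples and that "best match" means the highest score.

```python
def best_profile_per_sequence(hits):
    """Return {sequence_id: profile_name} keeping the highest-scoring profile.

    `hits` is an iterable of (sequence_id, profile_name, bit_score) tuples.
    This layout is an assumption made for illustration only.
    """
    best = {}
    for seq_id, profile, score in hits:
        # Keep the hit only if it beats the best score seen so far
        if seq_id not in best or score > best[seq_id][1]:
            best[seq_id] = (profile, score)
    return {seq_id: profile for seq_id, (profile, _) in best.items()}


# Made-up hits: each ORF may match several clan profiles
hits = [
    ("orf_1", "TV", 310.5),
    ("orf_1", "other_clan", 120.0),
    ("orf_2", "other_clan", 95.2),
    ("orf_2", "TV", 80.1),
    ("orf_3", "TV", 45.0),
]

assignments = best_profile_per_sequence(hits)

# Keep only sequences whose best-scoring profile is the TV clan profile
tv_sequences = sorted(s for s, p in assignments.items() if p == "TV")
print(tv_sequences)
```

Here orf_2 is dropped even though it matches the TV profile, because another clan profile scores higher for it.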
