Researchdata.se

Targeted sequencing of 252 genes based on their relevance in lymphoid malignancies

https://doi.org/10.17044/SCILIFELAB.19721998
Dataset description

The data consist of CRAM files from capture-based gene panel sequencing (Twist Bioscience) of 252 genes selected for their relevance in lymphoid malignancies. The panel also included genome-wide backbone probes for copy-number analysis. The prepared libraries were sequenced in paired-end mode (2x150 bp) on the Illumina NovaSeq 6000 (Illumina Inc.).

BALSAMIC was used to analyze the FASTQ files and align them to the reference genome. Trimmed reads were mapped to the reference genome hg19 using BWA MEM v0.7.15. The resulting SAM files were converted to BAM files and sorted using samtools v1.6. Duplicate reads were marked using Picard tools MarkDuplicates v2.17.0. Finally, the BAM files were converted to CRAM files using samtools v1.6.

Note: CRAM is a sequencing read file format that is highly space efficient because it uses reference-based compression of sequence data. It offers both lossless and lossy modes of compression: https://www.ebi.ac.uk/ena/cram

Data Access Statement

The data are under restricted access and can be accessed upon request through the email address below. The targeted sequencing datasets are to be used only for research aimed at advancing the understanding of genetic factors in chronic lymphocytic leukemia. Applications aimed at method development, including bioinformatics, will not be considered an acceptable use of this dataset.
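The processing steps named in the dataset description (BWA MEM alignment, samtools sorting, Picard duplicate marking, CRAM conversion) can be sketched as a sequence of command lines. This is an illustrative sketch only, not the BALSAMIC workflow itself: the record does not state the exact parameters used, the read trimming step is omitted, and the file names, the `picard.jar` path, and the helper names `build_pipeline` and `render` are assumptions made for the example. The script only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List, Tuple

# Each step is (argv, stdout_path); stdout_path is "" when the tool
# writes its output file itself rather than to standard output.
Step = Tuple[List[str], str]


def build_pipeline(sample: str, ref: str, fq1: str, fq2: str) -> List[Step]:
    """Build the four commands described in the record (hypothetical file names)."""
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    cram = f"{sample}.cram"
    return [
        # 1. Map paired-end reads; bwa mem writes SAM to stdout.
        (["bwa", "mem", ref, fq1, fq2], sam),
        # 2. Convert SAM to a coordinate-sorted BAM.
        (["samtools", "sort", "-o", sorted_bam, sam], ""),
        # 3. Mark duplicate reads (picard.jar location is an assumption).
        (["java", "-jar", "picard.jar", "MarkDuplicates",
          f"I={sorted_bam}", f"O={dedup_bam}", f"M={metrics}"], ""),
        # 4. Convert BAM to CRAM; -T supplies the reference used for compression.
        (["samtools", "view", "-C", "-T", ref, "-o", cram, dedup_bam], ""),
    ]


def render(steps: List[Step]) -> List[str]:
    """Render each step as a shell-style command line for inspection."""
    lines = []
    for argv, out in steps:
        cmd = " ".join(shlex.quote(arg) for arg in argv)
        lines.append(f"{cmd} > {shlex.quote(out)}" if out else cmd)
    return lines


if __name__ == "__main__":
    steps = build_pipeline("sample1", "hg19.fa",
                           "sample1_R1.fastq.gz", "sample1_R2.fastq.gz")
    for line in render(steps):
        print(line)
```

Because CRAM compression is reference-based, the same hg19 FASTA passed with `-T` in the last step is generally needed again when the CRAM files are decoded.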


scilifelab
Karolinska Institutet