Bulk RNA sequencing of erythroblasts from a pair of SF3B1-mutated and SF3B1-wildtype induced pluripotent stem cell (iPSC) lines

https://doi.org/10.48723/3hs1-0v44
This dataset consists of bulk RNA sequencing data of MACS-separated GPA+ erythroblasts obtained from a pair of induced pluripotent stem cell (iPSC) lines with and without the SF3B1 mutation, generated from an MDS patient (Asimomitis G et al. 2022, Blood Advances). The objective of this data collection was to assess how the SF3B1 mutation changes the molecular profile of RNA splicing in erythropoiesis. The dataset includes minimally processed, visualisation-ready sequencing data in .bam format for both lines.

Processing: MDS patient iPSC line-derived hematopoietic stem and progenitor cells (HSPC) were cultured for 14 days in erythroid specification media (StemPro-34 SFM [Gibco] + 1% Pen/Strep [Cytiva], 2 mM L-glutamine [Sigma-Aldrich], 3.5 µM 1-Thioglycerol [Sigma-Aldrich], 1% Bovine Albumin Fraction V [Gibco], 150 µg/mL holo-transferrin [Sigma-Aldrich], 2 U/mL erythropoietin [Pfizer], 50 ng/mL Stem Cell Factor [PeproTech] and 50 ng/mL interleukin-3 [PeproTech]). At Day 14 of culture, mixed glycophorin A-positive (GPA+) erythroblast samples were isolated by MACS. Cells were lysed in RLT (Qiagen) + 40 mM dithiothreitol (Sigma-Aldrich), and RNA was extracted with the RNeasy Micro Kit (Qiagen) with RNase-free DNase treatment according to the manufacturer’s protocol. RNA integrity numbers (RIN) were estimated using Agilent RNA 6000 Pico Kits (Agilent Technologies, CA, USA). A minimum RIN value of 6.5 was considered adequate. RNA sequencing (RNAseq) libraries were prepared from total RNA using SMARTer Stranded Total RNA-Seq Kits v2 - Pico Input Mammalian (Takara Bio, Japan), including enzymatic ribosomal depletion steps. Libraries were sequenced on an Illumina NovaSeq 6000 S4 (Illumina, CA, USA) in a paired-end 150 bp configuration. Reads were pre-processed with Trim Galore v. 0.6.7 using Cutadapt v. 3.5, and BAM files were generated via two-pass alignment with STAR v. 2.7.9a against the GRCh38.p13 human genome assembly.
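The read pre-processing and two-pass alignment described above could be scripted roughly as in the sketch below, which only builds the command lines and does not run them. The record does not state the exact parameters used, so every option shown (thread count, sorted BAM output, decompression command) and every file or directory name is an assumption for illustration, not the depositors' actual settings.

```python
from typing import List, Sequence


def trim_galore_cmd(read1: str, read2: str, out_dir: str) -> List[str]:
    """Build a Trim Galore command for one paired-end lane (assumed options)."""
    return ["trim_galore", "--paired", "--output_dir", out_dir, read1, read2]


def star_two_pass_cmd(
    genome_dir: str,
    read1_files: Sequence[str],
    read2_files: Sequence[str],
    out_prefix: str,
    threads: int = 8,
    gzipped: bool = False,
) -> List[str]:
    """Build a STAR command using its built-in two-pass mode.

    Several lanes of the same sample are passed as comma-separated lists,
    mate 1 files first and mate 2 files second, in matching order.
    """
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", ",".join(read1_files), ",".join(read2_files),
        "--twopassMode", "Basic",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,
    ]
    if gzipped:
        # Only needed when the input FASTQ files are gzip-compressed.
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


# Hypothetical file names for one sample sequenced in two lanes.
example = star_two_pass_cmd(
    genome_dir="GRCh38.p13_star_index",
    read1_files=["mut_L1_R1.fastq", "mut_L2_R1.fastq"],
    read2_files=["mut_L1_R2.fastq", "mut_L2_R2.fastq"],
    out_prefix="mut_",
)
print(" ".join(example))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and a STAR index for the chosen assembly has been built.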
The dataset consists of 13 files:
- 2 .bam files, one for the SF3B1-mutant sample and one for the wildtype sample;
- 2 .bai BAM index files, one for each sample, to facilitate analysis of the .bam files;
- 8 .fastq raw data files, corresponding to a paired-end run of the two samples in two different lanes (2 x 2 x 2);
- 1 gene-collapsed read count matrix (.txt) summarising read counts for both samples.

The documentation file iPSCEB_FileList.txt contains a full list of the files in the dataset. The total size of the dataset is approximately 80 GB.
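After downloading, the file composition listed above can be checked with a short script. The sketch below counts files by extension and compares the counts with the stated breakdown. It assumes the data files carry the plain extensions named in the description (a compressed `.fastq.gz` file would need extra handling) and that the documentation file iPSCEB_FileList.txt is excluded from the 13. All file names in the example are hypothetical.

```python
from collections import Counter
from pathlib import PurePath
from typing import Dict, Iterable, List

# Expected composition as stated in the dataset description (13 files in total).
EXPECTED: Dict[str, int] = {".bam": 2, ".bai": 2, ".fastq": 8, ".txt": 1}


def expected_fastq_count(samples: int = 2, lanes: int = 2, mates: int = 2) -> int:
    """Paired-end run of each sample in each lane: samples x lanes x mates."""
    return samples * lanes * mates


def check_inventory(names: Iterable[str]) -> List[str]:
    """Return a list of human-readable mismatches (empty if all counts agree)."""
    counts = Counter(PurePath(name).suffix.lower() for name in names)
    problems = []
    for ext, expected in EXPECTED.items():
        found = counts.get(ext, 0)
        if found != expected:
            problems.append(f"{ext}: expected {expected}, found {found}")
    return problems


# Hypothetical file names, for illustration only.
files = [f"{s}.bam" for s in ("mut", "wt")]
files += [f"{s}.bam.bai" for s in ("mut", "wt")]
files += [
    f"{s}_L{lane}_R{mate}.fastq"
    for s in ("mut", "wt") for lane in (1, 2) for mate in (1, 2)
]
files.append("counts.txt")

print(check_inventory(files))
```

An empty list means the counts per extension match the description. Any mismatch is reported as a string naming the extension and the two counts.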

Method and outcome

Samples/material: Existing from scientific collection/biobank
