lncRNAs in hematopoiesis

Lineage tree

A substantial fraction of the genome is transcribed in a cell type-specific manner, producing long non-coding RNAs (lncRNAs) rather than protein-coding RNAs. We systematically characterized the transcriptome dynamics (both mRNA and lncRNA) during normal hematopoietic differentiation and in hematological malignancies. We found lncRNAs to be regulated during differentiation and misregulated in disease. This page is a resource for exploring lncRNA expression data and looking up the genomic locations of the lncRNAs.

Links to the raw data and the published work can be found in the Downloads tab.

For more information, please visit our publication: Delás MJ, Sabin LR, Dolzhenko E et al (2017) lncRNA requirements for mouse acute myeloid leukemia and normal differentiation. eLife 6: e25607

Select lncRNAs from the diagram below or search the table. Selected lncRNAs will appear in the Plot and Heatmap tabs.

Clicking the coordinates opens a genome browser session with the lncRNA catalog and RNAseq coverage from this publication, together with the ATACseq data from Lara-Astiaso, Weiner et al. (2014).

[Lineage tree diagram. Cell types: LT-HSC, CMP, GMP, MEP, Gr1, CLP, ProB, PreB, CD3-T, AML. lncRNA groups: HSC-enriched lncRNAs, Progenitor-enriched lncRNAs, Lymphoid-enriched lncRNAs, leukemia-enriched lncRNAs.]


Catalog construction

- RNAseq libraries were mapped with the STAR aligner (Dobin et al., 2013) against the mm10 mouse genome assembly using default parameters.
- Duplicate alignments were removed from the resulting BAM files with Picard (http://broadinstitute.github.io/picard).
- Transcriptome assembly was performed individually for each library with Cufflinks (Trapnell et al., 2010) using GENCODE Release M4 annotations. The assemblies were then filtered by removing all transcripts from non-core chromosomes (such as partially assembled contigs and the mitochondrial chromosome).
- Individual transcriptome assemblies were merged with Cuffmerge (Trapnell et al., 2010), run with default parameters.
- The resulting merged assembly was filtered by removing transcripts (a) consisting of a single exon or spanning fewer than 200 bp, (b) overlapping a coding exon in the same orientation, (c) having FPKM below 0.3, (d) having at least one exon supported by fewer than 40 reads in each library, (e) overlapping genes annotated as IG*, (f) having a coding probability estimated by CPAT below 0.5 (Wang et al., 2013).
- We required that the intron-exon structure of a transcript be supported in at least two libraries. The similarity of the intron-exon structures of two transcripts was measured using the Jaccard index of the genomic intervals defined by their introns. A Jaccard index cutoff of 0.2 was used. The length of the terminal exon often increases after multiple transcriptome assemblies are merged, and the Jaccard index of the introns gives us a measure that is agnostic to the length of the terminal exon.
- To calculate the number of fragments mapping to each gene in each library, overlapping catalog transcripts were merged together. When overlapping transcripts were assigned distinct gene names, the name of the "merged" gene consisted of the list of genes corresponding to each transcript.
- Fragments corresponding to each merged annotation were counted with the program htseq-count (Anders et al., 2015).
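The intron-based Jaccard index used in the catalog construction steps can be illustrated with a minimal sketch. This is not the code used for the published catalog: the function names, the half-open coordinate convention, and the choice to measure overlap in base pairs are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on one chromosome and strand.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_length(intervals: List[Interval]) -> int:
    """Number of base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Introns are the gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    return [(ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)]


def intron_jaccard(introns_a: List[Interval], introns_b: List[Interval]) -> float:
    """Jaccard index (intersection bp / union bp) of two intron sets."""
    union = covered_length(introns_a + introns_b)
    if union == 0:
        return 0.0  # neither transcript has introns
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = covered_length(introns_a) + covered_length(introns_b) - union
    return intersection / union


# Hypothetical transcripts: B matches A except for a longer terminal exon;
# C uses a different splice site for its second exon.
a = introns_from_exons([(100, 200), (300, 400), (500, 600)])
b = introns_from_exons([(100, 200), (300, 400), (500, 900)])
c = introns_from_exons([(100, 200), (350, 400), (500, 600)])

print(intron_jaccard(a, b))  # 1.0
print(intron_jaccard(a, c))  # 0.8
```

Because only introns enter the calculation, extending the terminal exon of transcript B leaves its similarity to A unchanged, which is the property the text relies on. Under this sketch, a pair of transcripts would be compared against the 0.2 cutoff using the returned value.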

The raw expression data can be downloaded from GEO: accession GSE90067

The SuperSeries linked to this publication also includes the high-throughput sequencing readout from an in vivo shRNA screen that identified functional lncRNAs in AML, as well as transcriptome analysis upon lncRNA knockdown: GEO GSE90072

For more information, please visit our publication: Delás MJ, Sabin LR, Dolzhenko E et al (2017) lncRNA requirements for mouse acute myeloid leukemia and normal differentiation. eLife 6: e25607.

For reagent requests, please contact the Hannon Lab. For further information on the lncRNA catalog, please contact the Smith Lab.