RNA Seq: Outlier gene counts
2.8 years ago
camppatrick ▴ 10


After using Minimap2 to align my mRNA (cDNA) transcripts to the GENCODE human genome and counting with featureCounts against the GENCODE GTF file, I get a ridiculously large count for the MALAT1 gene in every sample (up to 5% of all reads are assigned to this gene, ~5-25K non-normalised counts for it, sequenced on a Nanopore MinION).

My sample is bulk RNA from resected brain tissue. This gene is apparently upregulated in cancer, so it makes just enough sense to see a lot of counts. But this many counts? I do not know.

Do any of you have experience dealing with something like this, and can you give me some tips on how to decide whether I should exclude this gene or whether this is an error somewhere in my pipeline?

Many Thanks

RNA-Seq • 454 views
2.8 years ago
predeus ★ 1.9k

MALAT1 is a known super-expressed gene. You see tons of it in single-cell RNA-seq as well as in bulk.

It's not unusual to get genes like this, depending on the tissue. Sometimes you get lots of mitochondrial transcripts (e.g. in muscle), sometimes it's something else. Blood is dominated by globins (there are special kits for globin depletion).

I don't think there's an error in your pipeline; you should be able to analyse the dataset in the normal way.
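
If you want to see how dominant a gene like this is in each sample, a quick check is to compute every gene's share of the assigned counts and look at the top few. Below is a minimal sketch, assuming a standard featureCounts table (a `#` comment line, then a header with `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length`, followed by one count column per sample). The function names, sample names, `GENE_A`/`GENE_B` and all numbers in the toy table are made up for illustration. Note also that with a GENCODE GTF the `Geneid` column normally holds gene IDs rather than symbols, so you would look up MALAT1 by its ID in a real table.

```python
import csv
import io


def gene_fractions(handle, n_annotation_cols=6):
    """Return {sample: {gene_id: share of that sample's assigned counts}}.

    Assumes a featureCounts-style tab-separated table: optional '#'
    comment lines, a header row, then one row per gene.
    """
    # Skip comment lines such as the "# Program:featureCounts ..." line.
    rows = csv.reader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
    )
    header = next(rows)
    samples = header[n_annotation_cols:]
    counts = {sample: {} for sample in samples}
    for row in rows:
        gene = row[0]
        for sample, value in zip(samples, row[n_annotation_cols:]):
            # float() also copes with fractional counts, if present.
            counts[sample][gene] = float(value)

    fractions = {}
    for sample, per_gene in counts.items():
        total = sum(per_gene.values())
        fractions[sample] = {
            gene: (count / total if total else 0.0)
            for gene, count in per_gene.items()
        }
    return fractions


def top_genes(fractions, n=3):
    """Return the n genes with the largest share in each sample."""
    return {
        sample: sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:n]
        for sample, shares in fractions.items()
    }


# Toy table with invented numbers, only to show the expected layout.
toy_table = "\n".join([
    "# Program:featureCounts (toy example)",
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ts1.bam\ts2.bam",
    "MALAT1\tchr11\t1\t100\t+\t100\t50\t20",
    "GENE_A\tchr1\t1\t100\t+\t100\t900\t150",
    "GENE_B\tchr2\t1\t100\t-\t100\t50\t30",
])

fractions = gene_fractions(io.StringIO(toy_table))
for sample, ranked in top_genes(fractions, n=2).items():
    print(sample, [(gene, round(share, 3)) for gene, share in ranked])
```

On real data you would pass an open file handle instead of the `io.StringIO` toy table. If the same handful of genes (MALAT1, mitochondrial genes, globins and so on) tops the list in every sample, that is consistent with biology rather than with a pipeline error.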

