How to remove non-protein-coding genes from single-cell pre-mRNA seq?
3.9 years ago
fifty_fifty ▴ 60

I have output from cellranger with gene expression for each cell (pre-mRNA seq). I need to remove all non-coding genes from it. What is the best way to do this?

RNA-Seq scRNA cellranger • 2.1k views

What is pre-mRNA seq? Do you mean total RNA seq?

Pre-mRNA is the primary transcript, which contains both introns and exons.

I know, but how can you sequence only pre-mRNA? Capping and polyadenylation happen after transcription, but splicing already happens during transcription, so I'm genuinely curious, because I have never heard of pre-mRNA seq.

3.9 years ago
caggtaagtat ★ 1.9k

If you want to subset the BAM file, you can run:

samtools view -h -b -L genes_coordinates.bed in.bam > out.bam

where genes_coordinates.bed is a BED file with the genomic coordinates of each protein-coding gene. These coordinates can be downloaded, for example, from Ensembl (BioMart).
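
If you already have a GTF annotation, the BED file can also be built from it. Below is a minimal sketch, not a definitive implementation. It assumes a GENCODE- or Ensembl-style GTF in which gene records carry a `gene_type` or `gene_biotype` attribute; the function names and the sample lines are made up for illustration.

```python
import re

# Matches attribute pairs such as: gene_id "ENSG00000000001";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of quoted attributes."""
    return dict(ATTR_RE.findall(field))


def coding_genes_to_bed(gtf_lines):
    """Return BED6 lines for gene records annotated as protein_coding.

    Assumption: the biotype is stored as gene_type (GENCODE) or
    gene_biotype (Ensembl). GTF is 1-based inclusive, BED is 0-based
    half-open, so only the start coordinate is shifted by one.
    """
    bed = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = parse_attributes(cols[8])
        biotype = attrs.get("gene_type") or attrs.get("gene_biotype")
        if biotype != "protein_coding":
            continue
        start = int(cols[3]) - 1
        name = attrs.get("gene_id", ".")
        bed.append("\t".join([cols[0], str(start), cols[4], name, "0", cols[6]]))
    return bed


# Hypothetical in-memory example; in practice iterate over the opened GTF file.
SAMPLE_GTF = [
    "##description: toy annotation",
    'chr1\tHAVANA\tgene\t100\t500\t.\t+\t.\tgene_id "ENSG1"; gene_type "protein_coding"; gene_name "A";',
    'chr1\tHAVANA\tgene\t300\t900\t.\t-\t.\tgene_id "ENSG2"; gene_type "lncRNA"; gene_name "B";',
    'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "ENSG1"; gene_type "protein_coding";',
]
SAMPLE_BED = coding_genes_to_bed(SAMPLE_GTF)
```

The strand is written to the sixth BED column, but the region filter in the samtools command above only looks at coordinates.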

Does that respect overlapping genes on different strands?

No, this is independent of strand, so some reads that do not belong to mRNA would still be left, e.g. reads from an lncRNA transcribed from the opposite strand.

I would simply make the count matrix, get a GTF file, and keep only the genes annotated as protein_coding.
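
A rough sketch of that idea, under some assumptions: the GTF gene records have a `gene_type` or `gene_biotype` attribute, and the first column of cellranger's features.tsv holds the gene ID. Version suffixes such as `.4` are stripped from the IDs, which is a simplification that may not suit every annotation. The function names and toy data are hypothetical.

```python
import re

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
BIOTYPE_RE = re.compile(r'gene_(?:type|biotype) "([^"]+)"')


def protein_coding_ids(gtf_lines):
    """Collect gene IDs of GTF gene records annotated as protein_coding."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        gene_id = GENE_ID_RE.search(cols[8])
        biotype = BIOTYPE_RE.search(cols[8])
        if gene_id and biotype and biotype.group(1) == "protein_coding":
            # Assumption: drop the version suffix so IDs match the matrix.
            ids.add(gene_id.group(1).split(".")[0])
    return ids


def coding_row_indices(feature_lines, coding_ids):
    """Return 0-based row indices of features whose gene ID is protein-coding.

    feature_lines are the tab-separated lines of features.tsv
    (gene ID in the first column); row order matches the matrix rows.
    """
    keep = []
    for i, line in enumerate(feature_lines):
        gene_id = line.rstrip("\n").split("\t")[0].split(".")[0]
        if gene_id in coding_ids:
            keep.append(i)
    return keep


# Toy example with made-up IDs.
GTF = [
    'chr1\tHAVANA\tgene\t100\t500\t.\t+\t.\tgene_id "ENSG1.3"; gene_type "protein_coding";',
    'chr1\tHAVANA\tgene\t300\t900\t.\t-\t.\tgene_id "ENSG2.1"; gene_type "lncRNA";',
    'chr2\tHAVANA\tgene\t10\t90\t.\t+\t.\tgene_id "ENSG3.7"; gene_type "protein_coding";',
]
FEATURES = [
    "ENSG1\tA\tGene Expression",
    "ENSG2\tB\tGene Expression",
    "ENSG3\tC\tGene Expression",
]
KEEP = coding_row_indices(FEATURES, protein_coding_ids(GTF))
```

The resulting indices can then be used to subset the rows of the count matrix in whatever tool you load it with.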

How do I get a GTF file from the cellranger output?

You have to download it from one of the standard repositories, e.g. GENCODE or NCBI. It is a reference annotation file. Do you have a background in NGS analysis? No offense, but if you are stuck on removing some genes, then the single-cell analysis will be... "fun". But seriously, you should spend some time with the basics.

I do not really have a biological background, but I am learning, so I thought I could ask questions here to get a better understanding of things like that. Thank you for your suggestion, though.

Yeah, when the goal is a count matrix, that makes more sense.

Thank you. Is there any way to use the .h5 file instead?
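
In principle the same filtering should work on the .h5 file. The sketch below assumes the Cell Ranger v3+ HDF5 layout (a `matrix` group with `data`, `indices`, `indptr`, and `features/id`, stored column-compressed with one column per barcode) and that the third-party `h5py` package is installed. Both are assumptions to check against your own file. `read_10x_h5_arrays` and `filter_csc_rows` are hypothetical helper names.

```python
def filter_csc_rows(data, indices, indptr, keep_rows):
    """Keep only the given rows (genes) of a CSC sparse matrix.

    data/indices/indptr are the usual compressed-column arrays;
    kept rows are renumbered 0..n-1 in their original order.
    """
    new_index = {old: new for new, old in enumerate(sorted(keep_rows))}
    new_data, new_indices, new_indptr = [], [], [0]
    for col in range(len(indptr) - 1):
        for k in range(indptr[col], indptr[col + 1]):
            row = indices[k]
            if row in new_index:
                new_data.append(data[k])
                new_indices.append(new_index[row])
        new_indptr.append(len(new_data))
    return new_data, new_indices, new_indptr


def read_10x_h5_arrays(path):
    """Read gene IDs and CSC arrays from a 10x .h5 file.

    Assumes the v3+ layout described above; not exercised in the toy
    example below because it needs h5py and a real file.
    """
    import h5py  # third-party, imported lazily

    with h5py.File(path, "r") as f:
        m = f["matrix"]
        gene_ids = [g.decode() for g in m["features/id"][:]]
        return (gene_ids, m["data"][:].tolist(),
                m["indices"][:].tolist(), m["indptr"][:].tolist())


# Toy matrix: 3 genes x 2 cells.
# cell 0: gene0=1, gene2=3   cell 1: gene1=5, gene2=2
DATA, INDICES, INDPTR = [1, 3, 5, 2], [0, 2, 1, 2], [0, 2, 4]
FILTERED = filter_csc_rows(DATA, INDICES, INDPTR, keep_rows={0, 2})
```

Here `keep_rows` would be the row indices of the protein-coding genes, found by comparing the gene IDs from the file with the protein_coding entries of the GTF.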
