Question: Calculate RNAseq coverage of a transcript
7 days ago by
andrey.v.shubin50 wrote:

Hi Biostars community,

I have BAM files with RNA-seq results from ~1000 single-cell samples. For every transcript in each sample, I would like to calculate what percentage of the spliced transcript's length is covered by aligned reads. It would also be great to have this value for the coding part of each transcript only or, ideally, for every exon.

What would be the most straightforward way to do this?

Thanks a lot!

transcriptome rna-seq • 146 views
modified 4 days ago • written 7 days ago by andrey.v.shubin50

To rephrase: per sample, you want to count how many reads overlap with each exon? In that case: featureCounts. Calculating a percentage afterward shouldn't be too hard, e.g. using R or Python.

modified 7 days ago • written 7 days ago by WouterDeCoster24k

I agree with @WouterDeCoster: you can try featureCounts. You need an annotation file (GTF). In the R (Rsubread) version of featureCounts, set isGTFAnnotationFile = TRUE (to indicate that the annotation is a GTF file), GTF.featureType = "exon" (the feature type used for read summarization), and GTF.attrType = "exon_id" (the attribute used to group features).

Otherwise, to quantify transcript abundances from RNA-Seq data, Kallisto could be another way to do the trick (see the manual: https://pachterlab.github.io/kallisto/manual).

modified 7 days ago • written 7 days ago by Bastien Hervé50
4 days ago by
andrey.v.shubin50 wrote:

Thanks to WouterDeCoster and Bastien Herve for their answers!

I have found out that bedtools does exactly what I need:

bedtools coverage -hist -abam "your bam file" -b "your bed file" > "file with results"
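
For anyone who wants to turn the -hist output into one percentage per transcript, here is a minimal Python sketch. It assumes the BED file has one line per exon with the transcript ID in the 4th (name) column, so summing over exons gives the spliced length; the function name percent_covered and the name_col parameter are made up for illustration. It relies on the last four columns of each -hist line being depth, bases at that depth, feature length, and fraction, and it skips the summary lines that start with "all". Two caveats to check against your own bedtools version: from 2.24.0 on, coverage is reported for the features in the -a file (so the BED would go to -a and the BAM to -b), and the -split option may be needed so that introns spanned by spliced reads are not counted as covered.

```python
from collections import defaultdict


def percent_covered(hist_lines, name_col=3):
    """Percent of bases with depth > 0 per feature name, from `bedtools coverage -hist` lines.

    Assumption: name_col is the 0-based BED column holding the transcript
    (or exon) ID. If the BED has no such column, chrom:start-end is used.
    """
    covered = defaultdict(int)  # bases with depth >= 1, per name
    total = defaultdict(int)    # all bases (every depth bin), per name
    for line in hist_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines, the trailing "all" summary, and anything too short
        # to hold BED3 plus the four histogram columns.
        if len(fields) < 7 or fields[0] == "all":
            continue
        depth = int(fields[-4])
        n_bases = int(fields[-3])
        n_feature_cols = len(fields) - 4
        if name_col < n_feature_cols:
            name = fields[name_col]
        else:
            name = f"{fields[0]}:{fields[1]}-{fields[2]}"
        # The depth bins of one feature add up to its length, so summing
        # n_bases over all bins gives the total length without double counting.
        total[name] += n_bases
        if depth > 0:
            covered[name] += n_bases
    return {name: 100.0 * covered[name] / total[name] for name in total if total[name] > 0}


if __name__ == "__main__":
    import sys

    # Usage: python percent_covered.py "file with results"
    with open(sys.argv[1]) as handle:
        for name, pct in sorted(percent_covered(handle).items()):
            print(f"{name}\t{pct:.2f}")
```

Grouping by exon ID instead of transcript ID (or using a BED of CDS intervals only) would give the per-exon or coding-only values with the same script.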
