Remove intronic and intergenic reads from BAM files (RNA-seq)
11 weeks ago
lma • 0

Hello,

I need to remove reads of intronic and intergenic origin from my BAM files. I used the script split_bam.py (RSeQC) with the file genome_annotation.gft2bed.bed to filter out these reads. Although most intronic and intergenic reads were removed, some reads from these regions remain (see the Qualimap output below), and I'm not sure why. Any comments? Also, is there another way to remove these reads completely? How can I get a BED file that helps filter them out?
FYI, I removed rRNA reads from these samples in a previous step.

Sample 1:
Before:
    exonic =  26,416,935 (90.23%),
    intronic = 1,082,069 (3.7%),
    intergenic = 1,779,920 (6.08%),
    overlapping exon = 1,036,788 (3.54%).
After:
    exonic =  26,416,653 (95.54%),
    intronic = 680,971 (2.46%),
    intergenic = 551,391 (1.99%),
    overlapping exon = 918,605 (3.32%).

Sample 2:
Before:
    exonic =  1,069,866 (30.54%),
    intronic = 139,044 (3.97%),
    intergenic = 2,294,436 (65.49%),
    overlapping exon = 201,608 (5.75%).
After:
    exonic =  1,069,850 (76.3%),
    intronic = 53,733 (3.83%),
    intergenic = 278,517 (19.86%),
    overlapping exon = 128,756 (9.18%).
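
To make the before/after comparison easier to read, here is a minimal sketch that computes what share of each category was removed, using only the counts posted above. The names `counts`, `percent_removed` and `summarize` are made up for this illustration and are not part of RSeQC or Qualimap.

```python
# Read counts copied from the Qualimap output above, as (before, after) pairs.
counts = {
    "Sample 1": {
        "exonic": (26_416_935, 26_416_653),
        "intronic": (1_082_069, 680_971),
        "intergenic": (1_779_920, 551_391),
    },
    "Sample 2": {
        "exonic": (1_069_866, 1_069_850),
        "intronic": (139_044, 53_733),
        "intergenic": (2_294_436, 278_517),
    },
}


def percent_removed(before, after):
    """Share of the 'before' reads that are gone after filtering, in percent."""
    return 100.0 * (before - after) / before


def summarize(counts):
    """Return {sample: {category: percent removed, rounded to 2 decimals}}."""
    return {
        sample: {
            category: round(percent_removed(before, after), 2)
            for category, (before, after) in categories.items()
        }
        for sample, categories in counts.items()
    }


summary = summarize(counts)
for sample, categories in summary.items():
    for category, pct in categories.items():
        print(f"{sample} {category}: {pct}% removed")
```

By this calculation the filter removed about 37% of intronic and about 69% of intergenic reads in Sample 1, and about 61% and 88% respectively in Sample 2, while the exonic counts barely changed.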

Thanks!

intron bam intergenic rnaseq rseqc • 274 views

Why do you want to remove intronic and intergenic reads?


I need to remove them to do polyA trimming and reduce mapping time.
