Deleted: remove intronic and intergenic reads from bam files (rnaseq)
2.8 years ago
lma • 0

Hello,

I need to remove reads of intronic and intergenic origin from my bam files. I used the script split_bam.py (rseqc) with the file genome_annotation.gft2bed.bed to filter out these reads. Although most intronic and intergenic reads were removed, some reads from these regions remain (see the qualimap output below). I'm not sure why they are still there; any comments? Also, is there another way to remove these reads completely? How can I get a bed file that helps filter them out?
FYI, I removed rRNA reads from these samples in a previous step.
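
On the bed file question, here is a minimal sketch of one way an exon-only BED could be derived from a GTF annotation. It is an illustration under stated assumptions, not a tested recipe: the function name `exon_bed_from_gtf` and the toy annotation lines are hypothetical, the GTF is assumed to be tab-separated with `exon` in the feature column, and strand is ignored. The output is a plain 3-column BED; whether split_bam.py accepts that layout (rather than a 12-column gene model BED) is not something verified here.

```python
from collections import defaultdict


def exon_bed_from_gtf(gtf_lines):
    """Return merged exon intervals as (chrom, start, end) BED tuples.

    Hypothetical helper: assumes standard 9-column, tab-separated GTF lines.
    """
    by_chrom = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep exon features only
        # GTF coordinates are 1-based inclusive; BED is 0-based half-open.
        by_chrom[fields[0]].append((int(fields[3]) - 1, int(fields[4])))

    bed = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            # Merge overlapping or directly adjacent exons into one interval.
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        bed.extend((chrom, s, e) for s, e in merged)
    return bed


# Toy annotation (made up for illustration, not real gene models).
toy_gtf = [
    "chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t150\t300\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t500\t600\t.\t+\t.\tgene_id \"g1\";",
]

for chrom, start, end in exon_bed_from_gtf(toy_gtf):
    print(f"{chrom}\t{start}\t{end}")
```

Merging overlapping exons first keeps the BED small and avoids counting the same region twice when several transcripts share an exon.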

Sample 1:
Before:
    exonic =  26,416,935 (90.23%),
    intronic = 1,082,069 (3.7%),
    intergenic = 1,779,920 (6.08%),
    overlapping exon = 1,036,788 (3.54%).
After:
    exonic =  26,416,653 (95.54%),
    intronic = 680,971 (2.46%),
    intergenic = 551,391 (1.99%),
    overlapping exon = 918,605 (3.32%).

Sample 2:
Before:
    exonic =  1,069,866 (30.54%),
    intronic = 139,044 (3.97%),
    intergenic = 2,294,436 (65.49%),
    overlapping exon = 201,608 (5.75%).
After:
    exonic =  1,069,850 (76.3%),
    intronic = 53,733 (3.83%),
    intergenic = 278,517 (19.86%),
    overlapping exon = 128,756 (9.18%).
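
To make the before/after comparison easier to read, the counts above can be turned into the number and share of reads removed per category. This is only a small arithmetic sketch: the numbers are copied from the qualimap output above, the helper names `removed_fraction` and `summarize` are made up for illustration, and the "overlapping exon" line is left out because the other three categories already sum to 100%.

```python
# Read counts (before, after) copied from the qualimap output above.
counts = {
    "Sample 1": {
        "exonic": (26_416_935, 26_416_653),
        "intronic": (1_082_069, 680_971),
        "intergenic": (1_779_920, 551_391),
    },
    "Sample 2": {
        "exonic": (1_069_866, 1_069_850),
        "intronic": (139_044, 53_733),
        "intergenic": (2_294_436, 278_517),
    },
}


def removed_fraction(before, after):
    """Fraction of reads in a category that the filtering step removed."""
    return (before - after) / before


def summarize(counts):
    """Return (sample, category, reads removed, percent removed) rows."""
    rows = []
    for sample, categories in counts.items():
        for category, (before, after) in categories.items():
            pct = round(100 * removed_fraction(before, after), 2)
            rows.append((sample, category, before - after, pct))
    return rows


for sample, category, removed, pct in summarize(counts):
    print(f"{sample}\t{category}\t{removed:,} removed ({pct}%)")
```

Looking at it this way shows that the exonic counts are almost unchanged, while only part of the intronic and intergenic reads were dropped in both samples.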

Thanks!

intron bam intergenic rnaseq rseqc • 837 views