Extract reads that are contained entirely within a give region
1
0
Entering edit mode
3.6 years ago

This seems like a very simple task to do and I'd assume that samtools view would be able to do this. Basically all I want to do is to return all reads that a entirely contained within a given region.

samtools view bamfile "chr1:100-1000" returns any read the overlaps that region. For example a read may start before 100 or extend beyond 1000. I only what those that map entirely inside that region.

RNA-Seq samtools • 626 views
ADD COMMENT
1
Entering edit mode
3.6 years ago

Use bedops and convert2bed (bam2bed) with a process substitution that specifies your region of interest:

$ bam2bed < reads.bam | bedops -e 100% - <(echo -e "chr1\t100\t1000") > answer.bed

Using bedops -e 100% asks for reads from read.bam that are contained entirely within chr1:100-1000.

Any reads that are found are written to answer.bed.

For RNA-seq data, you might wish to use the --split operand with bam2bed. See the documentation for more information.

References:

ADD COMMENT

Login before adding your answer.

Traffic: 2643 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6