Finding The Reads Mapped Outside The Target Region
2
0
Entering edit mode
11.0 years ago
ramsom • 0

Hi,

I need some help in solving this puzzle which probably is trivial for most of the members/users.

I am working with NGS data from a targeted region. when I looked at the total number of reads that mapped to the target regions it accounts for only 40-50% of the total mapped reads. so I was wondering about where the rest of the reads mapped on to genome but I don't know how to check this. I would really appreciate if any of you know how to find the reads and regions that are out side of the target sequence.

Thanks for reading. Ram

target • 7.9k views
ADD COMMENT
0
Entering edit mode

Hi ramsom. More information would be better if you want people to help you. Tell us about what you are trying to do, the organism, what program you used to map, the format you are working with, and any other pertinent information that will help you get a useful answer. Cheers

ADD REPLY
4
Entering edit mode
11.0 years ago

If you just want to know which reads map outside your target regions, you can use BEDOPS bedops and its --not-element-of operator, along with BEDOPS bam2bed and sort-bed.

For example, if your reads are in BAM format and your regions-of-interest are in UCSC BED (of uncertain sort order), then you can do something like the following to get a fast answer:

$ sort-bed unsortedRegions.bed > sortedRegions.bed
$ bam2bed < reads.bam \
    | bedops --not-element-of -100% - sortedRegions.bed \
    > answer.bed

The answer.bed file contains reads that entirely do not overlap regions in sortedRegions.bed.

There's no need to specify the bounds of a specific genome, I think, as your reads should already have the coordinates you need to tell you where they are.

ADD COMMENT
0
Entering edit mode

great! thanks lot. wondering if I can specify read mapping quality with bedops to find out how many of good quality reads are mapped to else where in the genome.

ADD REPLY
0
Entering edit mode

Not with bedops, no. But you could use the -q option with samtools to filter reads.bam by some quality threshold. You'd then run that filtered BAM file through the above-mentioned process to get an answer.

ADD REPLY
0
Entering edit mode

Hi Alex, I was wondering if there is a way to specify --not-element-of operator to add +100bp in addition to the 100% of the regions in my sortedRegions.bed file. I could simply make another bed file by -&+ 100bp to the start and stop locations, but thought I will check with you to see if there is a easy way around. Thanks again for your help.

ADD REPLY
1
Entering edit mode

Take a look at the bedops --range L:R and --range S options to look at padding your elements asymmetrically or symmetrically. Use this in conjunction with the --everything operator to just output padded elements.

For example, you could specify bedops --range 100 --everything on the regions. Using a positive value to this argument increases the size of all elements 200 bases: 100 upstream and 100 downstream (bordered by 0 on the start coordinate). You can then use the padded result in the downstream bedops --not-element-of operation:

$ sort-bed unsortedRegions.bed \
    | bedops --range 100 --everything - \
    > paddedSortedRegions.bed

$ bam2bed < reads.bam \
    | bedops --not-element-of -100% - paddedSortedRegions.bed \
    > answer.bed
ADD REPLY
3
Entering edit mode
11.0 years ago

use bedtools "complementBed" to get the complement of your targets: http://code.google.com/p/bedtools/wiki/UsageAdvanced

$ complementBed -i probes.500bp.bed -g hg18.genome > probes.500bp.complement.bed

use the new BED file file and samtools view -L complement.bed the.bam to extract the reads in the complement.bed.

ADD COMMENT
5
Entering edit mode

can't you just use bedtools intersect -v -abam the.bam -b probes.bed ?

ADD REPLY
0
Entering edit mode

yes ! I didn't know the "-v "option :-)

ADD REPLY

Login before adding your answer.

Traffic: 2578 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6