Question: Is it correct to find unique reads with the flag -F 0x104?
miquinhap (Brazil) wrote, 3.1 years ago:

Hello,

I have read a lot of topics about how to find uniquely mapped reads. I mapped my PE RNA-Seq data to the genome with Bowtie2 and BWA under different conditions (-a, -k, --local for Bowtie2; BWA-MEM and BWA-sampe for BWA) to see which one works better. For Bowtie2, I saw that some people filter the alignments with the flag -F 0x100 and others with MAPQ 30. For BWA, most people filter the alignments using MAPQ 1.

However, I was doing some tests and I noticed that I can't use only the flag -F 0x100, because unmapped reads pass this filter too (they do not have the 0x100 bit set). I concluded that to get only the unique reads I should use -F 0x104, which also excludes unmapped reads (0x4). Does anyone agree with me, or has anyone noticed the same?

For example:

$ samtools view -bf 0x4 file.bam > unmapped.bam

$ samtools view -c unmapped.bam

15678472

$ samtools view -cF 0x100 unmapped.bam

15678472
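The counts above follow from how -F works: a read is dropped only if at least one bit of the mask is set in its FLAG. A minimal bash sketch of that test, assuming only the SAM FLAG bit definitions (0x4 = unmapped, 0x100 = secondary alignment); kept_by_F is a hypothetical helper written for illustration, not a samtools command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: mimics the -F test on a single FLAG value.
# A read is kept only when none of the mask bits are set in its FLAG.
kept_by_F() {
    local flag=$1 mask=$2
    if (( (flag & mask) == 0 )); then echo kept; else echo dropped; fi
}

# 0x104 is simply the two bits combined.
printf '0x100 | 0x4 = 0x%x\n' $(( 0x100 | 0x4 ))   # prints: 0x100 | 0x4 = 0x104

kept_by_F 4   0x100   # unmapped read, -F 0x100       -> kept
kept_by_F 4   0x104   # unmapped read, -F 0x104       -> dropped
kept_by_F 256 0x104   # secondary alignment           -> dropped
kept_by_F 99  0x104   # mapped primary mate (FLAG 99) -> kept
```

So -F 0x100 alone leaves every unmapped read in the output, which matches the identical counts shown above.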

I also filtered my reads with -q 30, but I realized that I got more unique reads using the flag -F 0x104 than with -q 30. Could this be due to false positives? And what is the best way to filter the reads from these two aligners? I intend to use the flag -F 0x104 for both.
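One likely reason for the difference: -F 0x104 keeps every mapped primary alignment, including the primary record of a read that maps to several places (typically with a low MAPQ), while -q 30 additionally drops low-MAPQ alignments. The sketch below reproduces both filters on SAM text with awk so the two counts can be compared side by side. The function names and the four toy records are made up for illustration; on a real file the corresponding commands would be samtools view -c -F 0x104 file.bam and samtools view -c -F 0x104 -q 30 file.bam.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count SAM records on stdin that are mapped (0x4 unset),
# primary (0x100 unset) and have MAPQ >= the optional first argument (default 0).
count_sam() {
    local min_mapq=${1:-0}
    awk -v q="$min_mapq" -F'\t' '
        /^@/                   { next }   # skip header lines
        int($2 / 4) % 2 == 1   { next }   # 0x4 set: unmapped
        int($2 / 256) % 2 == 1 { next }   # 0x100 set: secondary alignment
        $5 >= q                { n++ }    # column 5 is MAPQ
        END                    { print n + 0 }
    '
}

# Toy records (invented): r1 maps uniquely, r2 is a multi-mapper with a
# primary and a secondary record, r3 is unmapped.
toy_sam() {
    printf 'r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r2\t0\tchr1\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r2\t256\tchr2\t300\t1\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
}

toy_sam | count_sam      # like -F 0x104        -> 2 (r1 and the primary of r2)
toy_sam | count_sam 30   # like -F 0x104 -q 30  -> 1 (r1 only)
```

In this toy case the multi-mapped read r2 survives -F 0x104, so that flag filter on its own selects mapped primary alignments rather than strictly unique ones.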

Thanks, Michele

rna-seq bwa mapq bowtie2 flag • 1.6k views
David Fredman (University of Bergen, Norway) wrote, 3.0 years ago:

 

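To retain only uniquely mapped reads from a BWA alignment, one can filter on the BWA XT tag, where the value U stands for unique (XT:A:U). Note that this tag is written by BWA aln/samse/sampe, not by BWA-MEM. I did not find a simple way to do this with samtools or bamtools, so grep to the rescue: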

samtools view reads.bam | grep 'XT:A:U' | \
samtools view -bS -T referenceSequence.fa - > reads.uniqueMap.bam

You could diff the SAM output of this command against the output of your samtools filter to check whether the two agree.
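A slightly stricter variant of the same idea, as a sketch: it keeps the SAM header (so -T is not needed) and matches XT:A:U only in the optional tag columns (12 onwards), so a read name or other field that happens to contain the string is not picked up. keep_xt_unique is a hypothetical helper name, and the usage line assumes a reasonably recent samtools that auto-detects SAM input on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read SAM on stdin, write the header plus every
# alignment carrying the optional tag XT:A:U to stdout.
keep_xt_unique() {
    awk -F'\t' '
        /^@/ { print; next }              # pass header lines through unchanged
        {
            for (i = 12; i <= NF; i++)    # optional tags start at column 12
                if ($i == "XT:A:U") { print; next }
        }
    '
}

# Usage (assumes samtools is installed and reads.bam carries XT tags):
#   samtools view -h reads.bam | keep_xt_unique | samtools view -b - > reads.uniqueMap.bam
```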

Ian (University of Manchester, UK) wrote, 3.1 years ago:

If your aim is to perform RNA-seq analysis, perhaps you should be using TopHat or RNA-STAR? A certain amount of mapping ambiguity is expected in RNA-seq analysis. A colleague of mine uses uniquely mapped reads when looking for novel transcripts. RNA-STAR can be run to explicitly find uniquely mapped reads.

Something else to consider is whether your final set of mapped reads contains only properly paired reads, which can be achieved using:

samtools view -f 2

 


I don't use TopHat because I'm studying trypanosomatids and they don't have introns. As for RNA-STAR, I'm going to have a look. I saw some comparison studies saying that the best one for PE is BWA-MEM. But I still don't know the best way to filter my alignments. I'm still not convinced about it, but I'm using the flag -F 0x104.

Reply written 3.0 years ago by miquinhap