High percentage of discordant reads
3.8 years ago
Domu • 0

I am working with Pseudomonas aeruginosa PA14 genomes. I aligned my Illumina reads to the reference genome using bowtie2, and when I look at the stats I get something like the following:

450261 reads; of these:
  450261 (100.00%) were paired; of these:
    191037 (42.43%) aligned concordantly 0 times
    255656 (56.78%) aligned concordantly exactly 1 time
    3568 (0.79%) aligned concordantly >1 times
    ----
    191037 pairs aligned concordantly 0 times; of these:
      183325 (95.96%) aligned discordantly 1 time
    ----
    7712 pairs aligned 0 times concordantly or discordantly; of these:
      15424 mates make up the pairs; of these:
        5469 (35.46%) aligned 0 times
        3976 (25.78%) aligned exactly 1 time
        5979 (38.76%) aligned >1 times
99.39% overall alignment rate

I have read that concordant and discordant alignments are used to detect indels; however, roughly 40% of read pairs aligning discordantly seems very high. Can someone tell me what is going on?

Thanks in advance!
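
For reference, the share of all pairs that aligned discordantly can be worked out directly from the summary above. This is a minimal Python sketch, assuming the summary text is pasted into a string; the function names `parse_bowtie2_summary` and `discordant_fraction` are made up for illustration and are not part of bowtie2.

```python
import re

# The bowtie2 summary from the post, pasted as plain text.
SUMMARY = """\
450261 reads; of these:
  450261 (100.00%) were paired; of these:
    191037 (42.43%) aligned concordantly 0 times
    255656 (56.78%) aligned concordantly exactly 1 time
    3568 (0.79%) aligned concordantly >1 times
    ----
    191037 pairs aligned concordantly 0 times; of these:
      183325 (95.96%) aligned discordantly 1 time
"""

# Hypothetical helper: one regex per summary line of interest.
PATTERNS = {
    "total_pairs": r"(\d+) \(\S+\) were paired",
    "concordant_0": r"(\d+) \(\S+\) aligned concordantly 0 times",
    "concordant_1": r"(\d+) \(\S+\) aligned concordantly exactly 1 time",
    "concordant_multi": r"(\d+) \(\S+\) aligned concordantly >1 times",
    "discordant_1": r"(\d+) \(\S+\) aligned discordantly 1 time",
}


def parse_bowtie2_summary(text):
    """Return a dict of pair counts found in a paired-end bowtie2 summary."""
    counts = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"could not find '{key}' in summary")
        counts[key] = int(match.group(1))
    return counts


def discordant_fraction(counts):
    """Fraction of ALL pairs (not just non-concordant ones) aligned discordantly."""
    return counts["discordant_1"] / counts["total_pairs"]


counts = parse_bowtie2_summary(SUMMARY)
print(f"discordant pairs: {discordant_fraction(counts):.1%} of all pairs")
```

Note that the 95.96% printed by bowtie2 is relative to the 191037 pairs that failed to align concordantly, whereas the sketch divides by all 450261 pairs.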


Did you trim the paired-end data files independently? If so, they are likely out of sync, which will lead to highly discordant alignments.
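
One quick way to check for this is to compare read names record by record. The following is a rough Python sketch, assuming well-formed 4-line FASTQ records and mate names that are identical once any trailing `/1` or `/2` is removed. The helper names (`read_names`, `first_mismatch`, `open_fastq`) are hypothetical.

```python
import gzip
from itertools import zip_longest


def read_names(lines):
    """Yield the read name from every FASTQ record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue  # skip sequence, '+', and quality lines
        fields = line.split()
        if not fields:
            continue  # tolerate a stray blank line at the end of the file
        name = fields[0][1:]  # drop the leading '@'
        if name.endswith(("/1", "/2")):
            name = name[:-2]  # old-style mate suffix
        yield name


def first_mismatch(lines1, lines2):
    """Return (record_index, name1, name2) for the first out-of-sync record, else None.

    A missing mate (one file shorter than the other) shows up as a None name.
    """
    pairs = zip_longest(read_names(lines1), read_names(lines2))
    for idx, (n1, n2) in enumerate(pairs):
        if n1 != n2:
            return idx, n1, n2
    return None


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Example usage (file names are placeholders):
# with open_fastq("file_R1.fq.gz") as r1, open_fastq("file_R2.fq.gz") as r2:
#     print(first_mismatch(r1, r2))  # None means the files are in sync
```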


No, I trimmed them using Trimmomatic's paired-end mode.


If you are willing, try aligning with bbmap.sh from the BBMap suite and see what the stats say.

bbmap.sh -Xmx10g in1=file_R1.fq.gz in2=file_R2.fq.gz ref=PA14_genome.fa

should be all you need. Stats are written to STDERR.
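
Since the stats go to STDERR, it can be handy to save them to a file for later comparison with the bowtie2 numbers. This is a minimal Python sketch, assuming bbmap.sh is on your PATH; the function name `run_and_save_stderr` and the log file name are hypothetical.

```python
import subprocess


def run_and_save_stderr(cmd, log_path):
    """Run a command, write whatever it prints to STDERR into log_path,
    and return (exit_code, stderr_text)."""
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    with open(log_path, "w") as fh:
        fh.write(result.stderr)
    return result.returncode, result.stderr


# Example usage with the command above (file names are placeholders):
# code, stats = run_and_save_stderr(
#     ["bbmap.sh", "-Xmx10g", "in1=file_R1.fq.gz", "in2=file_R2.fq.gz",
#      "ref=PA14_genome.fa"],
#     "bbmap_stats.txt",
# )
# print(stats)
```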


Hey, thanks! I will look into this :)
