I am currently working with Artificial FASTQ Generator, a program that generates artificial FASTQ files from a reference genome. Here is a link to the description of the program and here is the manual. According to the manual, the program generates paired-end reads (and it does produce two FASTQ files). After aligning the artificial reads (the two FASTQ files generated by Artificial FASTQ Generator) to the reference genome as paired-end reads with Bowtie 2, I got the following result:
    892589 (100.00%) were paired; of these:
      892585 (100.00%) aligned concordantly 0 times
      4 (0.00%) aligned concordantly exactly 1 time
      0 (0.00%) aligned concordantly >1 times
      ----
      892585 pairs aligned concordantly 0 times; of these:
        870486 (97.52%) aligned discordantly 1 time
      ----
      22099 pairs aligned 0 times concordantly or discordantly; of these:
        44198 mates make up the pairs; of these:
          7154 (16.19%) aligned 0 times
          15716 (35.56%) aligned exactly 1 time
          21328 (48.26%) aligned >1 times
    99.60% overall alignment rate
After aligning both FASTQ files as single-end reads, I got the following:
    1785178 (100.00%) were unpaired; of these:
      7154 (0.40%) aligned 0 times
      1755694 (98.35%) aligned exactly 1 time
      22330 (1.25%) aligned >1 times
    99.60% overall alignment rate
What I do not understand is why these reads align well as single-end reads but almost never align concordantly as paired-end reads, as I expected they would. Is anybody familiar with both programs who can help explain this?
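As a sanity check on the two summaries, the arithmetic can be sketched as below. The counts are copied from the Bowtie 2 output above; the variable and function names are just illustrative. It suggests that the same 7154 reads fail to align in both runs, so the individual reads seem to map fine and the difference appears to lie only in how the pairs are classified.

```python
# Counts copied from the two Bowtie 2 summaries in this post.
pairs_total = 892589
concordant_once = 4
discordant_once = 870486
mates_unaligned = 7154
mates_aligned_once = 15716
mates_aligned_multi = 21328
single_total = 1785178
single_unaligned = 7154

# Pairs that aligned neither concordantly nor discordantly.
pairs_neither = pairs_total - concordant_once - discordant_once
# Each such pair contributes two mates that are then aligned on their own.
mates_in_neither = 2 * pairs_neither


def overall_rate(total_reads, unaligned_reads):
    """Percentage of reads that aligned at least once."""
    return 100.0 * (total_reads - unaligned_reads) / total_reads


# In paired-end mode every pair counts as two reads.
paired_rate = overall_rate(2 * pairs_total, mates_unaligned)
single_rate = overall_rate(single_total, single_unaligned)

print(pairs_neither, mates_in_neither)
print(round(paired_rate, 2), round(single_rate, 2))
```

Both modes come out at the same overall alignment rate, which is consistent with the pairing constraints (mate orientation or fragment length) being the reason so few pairs are reported as concordant.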
Here are the outputs:
SAM output of Bowtie 2 (single-end alignment)
Here is the output of the paired-end alignment.
SAM output of Bowtie 2 (paired-end alignment)
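One way to narrow this down might be to tally how the mates are oriented in the paired-end SAM file. The following is only a minimal sketch: the function name `tally_pairs` is made up for illustration, the flag bits are taken from the SAM specification, and the 500 bp threshold assumes Bowtie 2 was run with its default `-X 500` (it also expects forward/reverse mates by default, `--fr`). It looks only at the first mate of each pair and ignores secondary and supplementary records.

```python
from collections import Counter


def tally_pairs(sam_lines, max_fragment=500):
    """Count mate orientations among pairs where both mates aligned.

    sam_lines: iterable of SAM text lines (header lines are skipped).
    max_fragment: assumed maximum fragment length (Bowtie 2 default -X is 500).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        # Keep paired (0x1), first-in-pair (0x40) records that are neither
        # secondary (0x100) nor supplementary (0x800).
        if not flag & 0x1 or not flag & 0x40 or flag & 0x900:
            continue
        # 0x4: this read is unmapped, 0x8: its mate is unmapped.
        if flag & 0x4 or flag & 0x8:
            counts["mate_unaligned"] += 1
            continue
        if fields[6] != "=":
            counts["different_reference"] += 1
            continue
        pos, pnext, tlen = int(fields[3]), int(fields[7]), int(fields[8])
        read_rev = bool(flag & 0x10)  # this read on the reverse strand
        mate_rev = bool(flag & 0x20)  # mate on the reverse strand
        if read_rev == mate_rev:
            orientation = "same_strand"
        else:
            # Find the leftmost positions of the forward and reverse mates.
            fwd_pos, rev_pos = (pnext, pos) if read_rev else (pos, pnext)
            orientation = "FR" if fwd_pos <= rev_pos else "RF"
        counts[orientation] += 1
        if abs(tlen) > max_fragment:
            counts["longer_than_max_fragment"] += 1
    return counts
```

It could be run as `tally_pairs(open("paired.sam"))`, where `paired.sam` is a placeholder for the paired-end output file. If most pairs fall under `same_strand` or `RF`, the orientation options (`--ff`, `--rf`) would be worth checking. If most fall under `longer_than_max_fragment`, raising `-X` might be what is needed.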