I would like to create a consensus sequence of a bacterium starting from the fastq files generated with Illumina. I proceeded as follows:
1. Remove poor quality sequences with trimmomatic (SLIDINGWINDOW:4:30) 2. Align with BWA MEM2 (piped with `samtools view -Sb` and `samtools sort`) 3. Indexed the alignment with `samtools index` 4. Deduplicated the sequence with sambamba markdup 5. Indexed again the alignment with `samtools index` 6. Sorted with `sambamba sort` 7. Finally created the consensus with `samtools consensus -o <outputfile>.fa -f fasta -a -l 80 <input_aligned-sorted-deduplicated-sortedAgain>.bam
However, a first bad omen is that the header of the consensus is that of the reference genome, which I did not use anywhere else in the process. The second omen is that the dotplot between the consensus and the reference shows 100% identity, whereas there should be some mutations in my sequences.
I think that, along the line, the pipe has simply pasted the reference sequence as the consensus.
Where did I get it wrong and how can I produce a consensus sequence from fastq files?