I have whole-genome sequences of 300 bacterial isolates and I want to do a gene presence/absence analysis. After quality and adapter trimming, I generated BAM alignment files against the reference genome for all isolates. One option for looking at gene presence/absence is a pangenome analysis, but I believe that requires whole-genome FASTA files. How can a single FASTA file with the consensus sequence be generated from SAM/BAM files so that it can be used in a pangenome analysis tool like Roary? I am new to NGS analysis and would be very happy to try other approaches. Thank you for reading.
This doesn’t seem like the most sensible way to do this.
I would do the following: