I am very new to bioinformatics and command-line tools in genomics. I want to use Samtools to identify mutations in a genome. I have the tumor genome (as a BAM file), the control genome (as a BAM file), and the reference genome (in FASTA format). How can I use all three to find mutations in the genome and get information such as the location and genotype of each mutation? Thanks in advance.
Samtools has the module you're looking for: it's called mpileup. It takes a reference genome in FASTA format and one or more mapping results in BAM format, and it creates a VCF file with the putative variants of the BAM files provided, relative to the reference genome.
The VCF format is quite complex, so I strongly suggest you read the format specification carefully (spend a day getting comfortable with it).
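To get a feel for the format, here is a minimal Python sketch that pulls the location and per-sample genotype out of VCF records. It relies only on the standard layout (eight fixed columns, then FORMAT, then one column per sample, with GT listed in FORMAT). The function names, the sample names "control" and "tumor", and the two records are made up for illustration. Comparing genotypes this way is a naive first look, not a substitute for a dedicated somatic caller.

```python
def parse_vcf(lines):
    """Yield one dict per VCF record with its location and per-sample genotypes."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        gt_index = fields[8].split(":").index("GT")  # position of GT within FORMAT
        genotypes = {
            name: column.split(":")[gt_index]
            for name, column in zip(samples, fields[9:])
        }
        yield {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
               "genotypes": genotypes}


def differs_from_control(records, tumor="tumor", control="control"):
    """Keep records whose tumor genotype is not the same as the control genotype."""
    return [r for r in records if r["genotypes"][tumor] != r["genotypes"][control]]


# Hypothetical records, written with explicit tabs so the columns are unambiguous.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "control", "tumor"]),
    "\t".join(["chr1", "12345", ".", "A", "G", "50", "PASS", ".",
               "GT:DP", "0/0:30", "0/1:28"]),
    "\t".join(["chr2", "777", ".", "C", "T", "60", "PASS", ".",
               "GT:DP", "0/1:25", "0/1:31"]),
]

candidates = differs_from_control(parse_vcf(EXAMPLE_VCF))
for rec in candidates:
    print(rec["chrom"], rec["pos"], rec["ref"], rec["alt"], rec["genotypes"])
```

In a real file the sample names in the header come from your BAM files, so check the #CHROM line before relying on any particular names.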
After samtools mpileup the work is not done: you still have to call the variants. For that, I would suggest bcftools call. After calling, you can use the many values reported in the VCF file to filter down to the variants you really need.
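The two steps (pileup, then calling) can be sketched as a small Python wrapper. This is only a sketch under assumptions: the file names are placeholders, and it uses bcftools mpileup for the pileup step because, as far as I know, recent Samtools releases have moved VCF/BCF output out of samtools mpileup and into bcftools. Check which versions you have installed before copying it.

```python
import shlex
import subprocess


def build_commands(reference, bams, out_vcf):
    """Return the argument lists for the pileup step and the calling step."""
    # -Ou: uncompressed BCF for piping, -f: reference FASTA
    pileup = ["bcftools", "mpileup", "-Ou", "-f", reference, *bams]
    # -m: multiallelic caller, -v: variant sites only, -Ov: plain-text VCF output
    call = ["bcftools", "call", "-m", "-v", "-Ov", "-o", out_vcf]
    return pileup, call


def run_pipeline(reference, bams, out_vcf):
    """Run 'bcftools mpileup | bcftools call'; requires bcftools on the PATH."""
    pileup, call = build_commands(reference, bams, out_vcf)
    p1 = subprocess.Popen(pileup, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(call, stdin=p1.stdout)
    p1.stdout.close()  # let the pileup step see a closed pipe if calling exits early
    p2.wait()
    p1.wait()
    if p1.returncode != 0 or p2.returncode != 0:
        raise RuntimeError("bcftools pipeline failed")


# Placeholder file names; printing the pipeline lets you inspect it before running.
pileup_cmd, call_cmd = build_commands(
    "reference.fa", ["control.bam", "tumor.bam"], "calls.vcf"
)
print(shlex.join(pileup_cmd), "|", shlex.join(call_cmd))
```

Passing both BAM files in one run puts the control and the tumor side by side as sample columns in the same VCF, which makes it easier to filter for sites where the two genotypes differ.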