I have two sets of BAM files. One was generated by our lab (1 sample = 1 BAM); the other was downloaded online (again, 1 sample = 1 BAM). In the downloaded samples the chromosomes are labelled chr1, chr2, chr3, etc. In our lab samples they are labelled 1, 2, 3, etc. I want to generate a single VCF file of variants across all samples.
I'm using bcftools:
bcftools mpileup -Ov -f ref.fasta -b samples.txt | bcftools call -mv -o bamMge.vcf
However, I get no calls and the repeated error: [E::faidx_adjust_position] The sequence "1" was not found
I have two questions:
- Is my strategy (i.e. bcftools mpileup) correct? Do I need to incorporate extra steps? (NB: I also have matching gVCF files.)
- How can I either (i) alter the chromosome labelling of one of the subsets of BAM files, or (ii) use a mapping file to match chromosome labels during the mpileup run?
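
For option (i), one possible approach is to rewrite the @SQ lines of the BAM header so that the lab samples use the same names as the reference. The following is only a rough sketch under stated assumptions: the function name add_chr_prefix and the file names lab_sample.bam, new_header.sam and lab_sample.chr.bam are hypothetical, and it assumes both sets of BAMs were aligned to the same assembly, so that only the contig names differ (not the sequences or lengths). It also assumes a plain "chr" prefix is the only difference; the mitochondrial contig (for example MT versus chrM) would likely need separate handling.

```shell
#!/bin/sh
# Sketch: add a "chr" prefix to contig names in a SAM header read from stdin.
# Only the SN: field of @SQ lines is touched; names that already start with
# "chr" and all other header lines (@HD, @RG, @PG, ...) pass through unchanged.
add_chr_prefix() {
  awk 'BEGIN { FS = OFS = "\t" }
       $1 == "@SQ" {
         for (i = 2; i <= NF; i++) {
           if ($i ~ /^SN:/ && $i !~ /^SN:chr/) {
             sub(/^SN:/, "SN:chr", $i)
           }
         }
       }
       { print }'
}

# Intended use with samtools (file names are hypothetical, commands left
# commented out so this block runs without samtools installed):
#   samtools view -H lab_sample.bam | add_chr_prefix > new_header.sam
#   samtools reheader new_header.sam lab_sample.bam > lab_sample.chr.bam
#   samtools index lab_sample.chr.bam
```

Because reheadering changes only the header, the names in the rewritten header must appear in the same order and with the same lengths as before. Whether to add or strip the prefix depends on which naming the reference FASTA uses; the error above suggests ref.fasta does not contain a sequence named "1".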