Question: Majority consensus from bam file
1
gravatar for erneheay
14 months ago by
erneheay10
erneheay10 wrote:

Hi,

I am trying to obtain a strict majority consensus from a bam file, i.e. Using only the reads aligned and NOT the reference used to align. So for example, in the case1, because the ref is completely covered by the reads, the consensus will have the same length than the reference, without gaps:

ex1

But for the 2nd case, I expect to have N's or a shorter consensus. In this case, using the reference to fill the gaps would be ok.

ex2

I have tried with samtools/bcftools and gatk but it is not working properly, i.e. it uses variations from the reads with low frequencies.

Many thanks,

E.

Edit1.

The samtools commands:

bwa mem -B 20 -A 3 -O 30 -E 3  -t 4 ref.fa R1.fastq R2.fastq > file.sam
samtools view -bT ref.fa file.sam | samtools sort > file.bam
samtools index file.bam
samtools mpileup -uf ref.fa file.bam | bcftools call -mv -Oz -o output.vcf
cat ref.fa | bcftools consensus output.vcf > output.fa

The gatk ones (from the bwa/samtools bam output):

java -jar picard.jar MarkDuplicates INPUT=file.bam O=file.dup.bam M=file.metrics
gatk HaplotypeCaller --reference ref.fa --input file.dup.bam --output gatk.vcf
gatk FastaAlternateReferenceMaker -R ref.fa --variant gatk.vcf -o gatk.fa
alignment consensus • 977 views
ADD COMMENTlink modified 14 months ago by trausch1.4k • written 14 months ago by erneheay10
1

Hello,

I have tried with samtools/bcftools and gatk but it is not working properly

please be more specific of what have you done. What were the exact command?

Usually you first have to do variant calling to obtain a vcf file. After that you can use bcftools consensus to get the consensus sequence.

fin swimmer

ADD REPLYlink written 14 months ago by finswimmer13k

Thanks finswimmer, question edited. bcftools consensus output is not a strict majority consensus, I think.

ADD REPLYlink written 14 months ago by erneheay10

Hello again,

now I understand it better. What you could try is:

  • get depth at each position of the reference using samtools depth
  • extract those regions where the depth is 0 (or any other value you like) and make a bed file out of it
  • mask those regions using bedtools maskfasta
  • use bcftools consensus on that new reference to incorporate your variants

fin swimmer

ADD REPLYlink written 14 months ago by finswimmer13k
1
gravatar for trausch
14 months ago by
trausch1.4k
Germany
trausch1.4k wrote:

This should work using Alfred

./alfred consensus -f bam -t ont -p chr1:218992200 <ont_pacbio_illumina.bam>

The alignment scoring depends on the sequencing technology. If you have phased SNPs and long reads you should split the BAMs into haplotypes prior to the consensus computation.

ADD COMMENTlink written 14 months ago by trausch1.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1293 users visited in the last hour