Question: Generating consensus sequence from bam file
11 weeks ago, chparada wrote:

Hi,

I am trying to generate a consensus sequence from a BAM file obtained by mapping SRA reads to a reference genome.

I used the following commands:

bwa mem ref.fasta SRR_1.fastq SRR_2.fastq > bwa.sam
samtools view -b -F 4 bwa.sam > bwa_aligned.bam
samtools index bwa_aligned.bam
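
One thing to watch in the commands above: `samtools index` expects a coordinate-sorted BAM, while `bwa mem` writes alignments in read order. Here is a minimal sketch of the same steps with a sort added, assuming samtools 1.x is on the PATH. The helper name `sort_and_index` and the output file name are made up for illustration.

```shell
# Hypothetical helper (not from the original post):
# keep mapped reads, coordinate-sort them, then index the result.
sort_and_index() {
    in_sam="$1"    # e.g. bwa.sam
    out_bam="$2"   # e.g. bwa_aligned.sorted.bam
    # -F 4 drops unmapped reads; "-" makes samtools sort read from stdin
    samtools view -b -F 4 "$in_sam" | samtools sort -o "$out_bam" - &&
    samtools index "$out_bam"
}

# Usage: sort_and_index bwa.sam bwa_aligned.sorted.bam
```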

I am not sure how to generate the consensus sequence that I have in mind. In case I haven't explained this well, I made a diagram:

===========================================================>ref.fasta
- -- ---- ----      ----- --- --- -      --    ------ ----- 
------ --- ---     -------- --- --       ---- -      --- -->SRR_reads_mapping



==============  +  ================  +   ==================> consensus_sequence.fasta

Please let me know if you have any advice on this.

Cheers!!!

Tags: bwa, samtools, fasta, genome

What do you mean by consensus sequence? The most frequent nucleotide at each position? The variants? Plenty of tools can report the bases observed at each position; try bcftools mpileup, for instance.

Reply written 11 weeks ago by Asaf

Thank you for your answers.

I meant obtaining the most frequent nucleotide at each position. From the mapped reads, I want to be able to obtain a consensus sequence in a single FASTA file. I am going to try bcftools mpileup.
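
For this goal, one common route with current bcftools is to call variants from the pileup and then apply them to the reference. Below is a minimal sketch, assuming bcftools 1.x and a sorted, indexed BAM. The helper name `bam_to_consensus` and the intermediate VCF name are hypothetical. Two caveats: `bcftools call` chooses genotypes with a statistical model rather than a plain majority vote, and positions without coverage keep the reference base unless they are masked separately.

```shell
# Hypothetical helper: reference + sorted BAM -> consensus FASTA.
bam_to_consensus() {
    ref="$1"; bam="$2"; out_fa="$3"
    vcf="${out_fa%.fa}.calls.vcf.gz"   # intermediate file, name is illustrative
    # pileup -> variant calls (variant sites only, bgzipped VCF)
    bcftools mpileup -f "$ref" "$bam" | bcftools call -mv -Oz -o "$vcf" &&
    # bcftools consensus needs an indexed VCF
    bcftools index "$vcf" &&
    # apply the called variants to the reference
    bcftools consensus -f "$ref" -o "$out_fa" "$vcf"
}

# Usage: bam_to_consensus ref.fasta bwa_aligned.sorted.bam consensus_sequence.fa
```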

Reply written 11 weeks ago by chparada
11 weeks ago, nsmi8446 wrote:

You could call variants from your .bam file (using whatever variant-calling software you like: GATK, freebayes, etc.) and then use vcf-consensus (http://vcftools.sourceforge.net/perl_module.html#vcf-consensus) to build your consensus sequence. The code below should work:

cat ref.fa | vcf-consensus file.vcf.gz > out.fa
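
Note that vcf-consensus reads a bgzip-compressed, tabix-indexed VCF, so plain VCF output from a variant caller needs preparing first. Here is a minimal sketch, assuming `bgzip` and `tabix` (from htslib) and `vcf-consensus` (from VCFtools) are installed. The helper name `vcf_to_consensus` is hypothetical. If the VCF is already bgzipped, only the `tabix` step is needed.

```shell
# Hypothetical helper: reference + uncompressed VCF -> consensus FASTA.
vcf_to_consensus() {
    ref="$1"; vcf="$2"; out_fa="$3"
    # compress with bgzip (plain gzip will not work with tabix)
    bgzip -c "$vcf" > "$vcf.gz" &&
    # build the .tbi index that vcf-consensus relies on
    tabix -p vcf "$vcf.gz" &&
    # the reference is read from stdin, as in the one-liner above
    vcf-consensus "$vcf.gz" < "$ref" > "$out_fa"
}

# Usage: vcf_to_consensus ref.fa file.vcf out.fa
```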


I agree with nsmi8446. This is a nice, concise way to solve your problem.

Reply written 11 weeks ago by cfos4698130
18 days ago, waldeyr wrote:

samtools mpileup -uf my_reference.fna my_file.bam | bcftools view -cg - | vcfutils.pl vcf2fq > my_consensus.fq
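
The command above uses the older samtools/bcftools 0.1.x syntax. In bcftools 1.x, `bcftools view -cg` was replaced by `bcftools call -c`, and pileup generation is available as `bcftools mpileup`. Below is a sketch of what should be the equivalent pipeline for current versions, assuming bcftools 1.x and its bundled `vcfutils.pl` are on the PATH. The helper name `bam_to_consensus_fq` is hypothetical.

```shell
# Hypothetical helper: reference + sorted BAM -> consensus FASTQ.
bam_to_consensus_fq() {
    ref="$1"; bam="$2"; out_fq="$3"
    # pileup -> consensus caller (-c) -> FASTQ with per-base qualities
    bcftools mpileup -f "$ref" "$bam" |
        bcftools call -c |
        vcfutils.pl vcf2fq > "$out_fq"
}

# Usage: bam_to_consensus_fq my_reference.fna my_file.bam my_consensus.fq
```

The result is FASTQ, so a FASTQ-to-FASTA conversion step would still be needed to get a single FASTA file.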

18 days ago, trausch (Germany) wrote:
Alfred has a consensus mode that extracts all reads at a given alignment position and then runs a multiple sequence alignment computation with consensus generation. It's primarily for long reads but I think it also works for short reads.

alfred consensus -t ill -f bam -p chr4:500500 input.bam

13 days ago, chen (OpenGene) wrote:

Try gencore: https://github.com/OpenGene/gencore

It generates consensus reads to reduce sequencing noise and remove duplicates.

13 days ago, guillaume.rbt (France) wrote:

You can use GATK FastaAlternateReferenceMaker to generate the consensus sequence from a set of SNP calls: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_fasta_FastaAlternateReferenceMaker.php
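
A minimal sketch of how that tool is typically invoked in GATK 3.x, which is the version the link points to. The helper name `vcf_to_alt_reference` and the jar path argument are assumptions for illustration. GATK also expects the reference to have `.fai` and `.dict` index files alongside it.

```shell
# Hypothetical helper: reference + VCF -> alternate reference FASTA (GATK 3.x).
vcf_to_alt_reference() {
    gatk_jar="$1"   # path to a local GenomeAnalysisTK.jar (assumption)
    ref="$2"; vcf="$3"; out_fa="$4"
    java -jar "$gatk_jar" -T FastaAlternateReferenceMaker \
        -R "$ref" -V "$vcf" -o "$out_fa"
}

# Usage: vcf_to_alt_reference GenomeAnalysisTK.jar ref.fasta calls.vcf consensus.fasta
```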
