Consensus sequence with characters apart from ATGC
1
0
Entering edit mode
5.6 years ago

Dear All,

I have aligned the reads to the reference genome using Bowtie2. The generated sam was then converted to bam and sorted. The sorted file using the following command i tried to generate consensus sequence:

samtools mpileup -uf ref.fa aligned_sorted.bam | bcftools call -c | vcfutils.pl vcf2fq > aligned.fasta

However, a fasta file is generated. Viewing the tail of the file i get the follwoing:

HEHHHHHEHEEHEBBEBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBEEEEEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHH
HHHHHHHHHHHHHHHEEEHHHEHHCCC@CCC@CEEEEEEEEEEECEEEEEEEEEEHHHHH
EHHHHHHHHHHHHHHHHHHHHHHHEHHKKKKKKKKKKKHEEEHECEEEEEEEECCEEEEE
EEEEEEEEEEEEEEEEHHHHHHHEKHHKKHKNNNNNQQQQNQQQNQNQQQQQQQTQTTQQ
TQQTQQQQQQNQQHNTTTTTTTTTTTQQQQNQQQQQQQQQQQQQQQQNQQQQQQQQQTQQ
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQNNNNQQQQKKKKK
KKKKKKKHHHHHHHHHHHHHHHHEEEEEEEEEEEEEEEEEEEEEEEBBBBBBBBBBBBEE
EEEEEEEEEEEEEEEEEEEEEEEEEEEECCCCCCCCCFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFCCCCCCCC

What does this mean. I expected the file to have charcaters 'ATGC'

Thanks in advance..

Regards, sam

next-gen • 1.9k views
ADD COMMENT
0
Entering edit mode

Is this base quality ?

Your last command is vcfutils.pl vcf2fq, so you get a fastq file

ADD REPLY
0
Entering edit mode

Thanks Bastien for your reply. But why is base quality printed in a consensus fasta file. I would like to use this fasta file for further analysis.

ADD REPLY
0
Entering edit mode
5.6 years ago

Hello,

if you like to have a consensu fasta you have to use bcftools consensus.

Furthermore which version of bcftools/samtools are you using. Since v1.9 samtools mpileup is deprecated and bcftools mpileup should be used instead, see:

Note that using "samtools mpileup" to generate BCF or VCF files is now
deprecated.  To output these formats, please use "bcftools mpileup" instead

fin swimmer

ADD COMMENT
2
Entering edit mode

Thanks finswimmer. i am using samtools v1.8 and bcftools v1.8. So should i use the following command to generate the consensus sequence:

bcftools mpileup -Ou -f reference.fa alignments.bam | bcftools call -mv -Oz -o calls.vcf.gz
tabix calls.vcf.gz

cat reference.fa | bcftools consensus calls.vcf.gz > consensus.fa

Thanks!

ADD REPLY
0
Entering edit mode

Or do you have the better way to do the same?

ADD REPLY
0
Entering edit mode

Looks fine to me.

You could think about to output bcftools call to compressed bcf and indexing with bcftools index.

I prefer to use the -f option instead of piping the reference to bcftools consensus.

fin swimmer

ADD REPLY
0
Entering edit mode

Thanks Fin for your help!

ADD REPLY

Login before adding your answer.

Traffic: 2339 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6