Question: trouble understanding what a warning means while building a .bcf file with samtools
gravatar for ricfoz
2.6 years ago by
National School of Antropology and History, Mexico city, Mexico
ricfoz30 wrote:

Hello there,

I am trying to transform a .bam file with the lenght of a genic region into .fasta format, as i've been reading, y may have to transform it into .vcf file to see the variant calls, and then create a consensus using bcf tools.

the pipeline seems very straightforward when transforming into vcf format, first i need to run mpileup from samtools to the original .bam file, then call command with bcftools, to call the variants.

Now, my problem comes when i run the mpileup, i ran the next argument:

samtools mpileup -g -f chr19ref.fa filename.bam > desiredfile.bcf

it actually gave me a desidedfile.bcf , but on the terminal, it gave me loads of lines saying:

[E: :faidx_fetch_seq] The sequence "6" not found

this message filled the screen, and at the end i got a .bcf file, which when converting to .vcf gave an empty document, i guess there's an issue with the conversion to .bcf, but i can't find out what it is, it has something to do with the warning message i describe, can someone help me with that?


bcftools mpileup samtools .bcf .vcf • 1.0k views
ADD COMMENTlink modified 2.6 years ago by Devon Ryan95k • written 2.6 years ago by ricfoz30

Hello everybody, i already solved the problem that i had. as Pierre writes lines bellow, my problem was that the column on my .bam file described the chromosome number as "6", and the .fasta reference file i provided had the "chr6" notation, thus ,sending the error described in my original post. I tried to realize the hack i was adviced, but as it seems, it is not possible (as far as i know), to create a symbolic link running a virtual machine, which was my case, so i dug out into some forums and found the next solution, which allows you to write whatever script you want in the column of the chromosome number. In this case i just added the preffix "chr", which was lacking:

samtools view -H strand.bam | sed -e 's/SN:SN:chr/' | samtools reheader - your_troubled_file_name.bam > output_file_name.bam

i hope it helps you out people with similar problems.


ADD REPLYlink written 2.6 years ago by ricfoz30
gravatar for Pierre Lindenbaum
2.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

in your sam header I would say that there is a dictionary (lines starting with @SQ) about the chromosome "6" while your chr19ref.fa uses a different notation ("chr6").

ADD COMMENTlink written 2.6 years ago by Pierre Lindenbaum129k

that makes a lot of sense ! ... when i started working with "samtools view" i had trouble running it because i tried writing the region like: "chr6:xxx-xxx+1", and it just ran until i changed the notation to: " 6:xxx-xxx+1"

how can i solve that? ... changing headers on the sam file would be difficult i guess, don't know, any suggestions to analogize the notations on both files?

ADD REPLYlink written 2.6 years ago by ricfoz30

an ugly hack : create a symbolic link to an alternate reference sequence:

ADD REPLYlink written 2.6 years ago by Pierre Lindenbaum129k

Ugly hack for ugly trouble... fair enough!

I‘ll try that as soon as i reach my pc


ADD REPLYlink written 2.6 years ago by ricfoz30
gravatar for Devon Ryan
2.6 years ago by
Devon Ryan95k
Freiburg, Germany
Devon Ryan95k wrote:

It's highly likely that your BAM file does not match your fasta file. In particular, it looks like your bam file has a chromosome 6, but I'm guessing that your fasta file only has a chromosome chr19.

ADD COMMENTlink written 2.6 years ago by Devon Ryan95k

hello, it is something like that ... but i can't sort out how to make the samtools work correctly, i have built my original .bam myself by running the "samtools view" command, then i downloaded a fasta file of the chr6 of the hg19 assembly, which i have in my working folder.

When i run my script, i give "samtools mpileup" the reference genome (which is the one i described lines ago), and it still gives the error.

for sake of clarity, in the script i described in the original post, i wrote chr19, when i should've written hg19.

To me, it seems like when creating the bcf, the .bam file is lacking some info or something, maybe i have give a certain flag when "cutting" the desired genomic region when using "samtools view Region:initial-final" ?

ADD REPLYlink written 2.6 years ago by ricfoz30
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 767 users visited in the last hour