trouble understanding what a warning means while building a .bcf file with samtools
2
0
Entering edit mode
6.4 years ago
ricfoz ▴ 100

Hello there,

I am trying to transform a .bam file with the length of a genic region into .fasta format, as I've been reading, y may have to transform it into .vcf file to see the variant calls, and then create a consensus using bcf tools.

the pipeline seems very straightforward when transforming into vcf format, first I need to run mpileup from samtools to the original .bam file, then call command with bcftools, to call the variants.

Now, my problem comes when I run the mpileup, I ran the next argument:

samtools mpileup -g -f chr19ref.fa filename.bam > desiredfile.bcf

it actually gave me a desiredfile.bcf, but on the terminal, it gave me loads of lines saying:

[E: :faidx_fetch_seq] The sequence "6" not found 

this message filled the screen, and at the end I got a .bcf file, which when converting to .vcf gave an empty document, I guess there's an issue with the conversion to .bcf, but I can't find out what it is, it has something to do with the warning message I describe, can someone help me with that?

Greetings

mpileup samtools vcf bcf bcftools • 2.2k views
ADD COMMENT
0
Entering edit mode

Hello everybody, i already solved the problem that i had. as Pierre writes lines bellow, my problem was that the column on my .bam file described the chromosome number as "6", and the .fasta reference file i provided had the "chr6" notation, thus ,sending the error described in my original post. I tried to realize the hack i was adviced, but as it seems, it is not possible (as far as i know), to create a symbolic link running a virtual machine, which was my case, so i dug out into some forums and found the next solution, which allows you to write whatever script you want in the column of the chromosome number. In this case i just added the preffix "chr", which was lacking:

samtools view -H strand.bam | sed -e 's/SN:SN:chr/' | samtools reheader - your_troubled_file_name.bam > output_file_name.bam

i hope it helps you out people with similar problems.

Cheers.

ADD REPLY
1
Entering edit mode
6.4 years ago

in your sam header I would say that there is a dictionary (lines starting with @SQ) about the chromosome "6" while your chr19ref.fa uses a different notation ("chr6").

ADD COMMENT
0
Entering edit mode

that makes a lot of sense ! ... when i started working with "samtools view" i had trouble running it because i tried writing the region like: "chr6:xxx-xxx+1", and it just ran until i changed the notation to: " 6:xxx-xxx+1"

how can i solve that? ... changing headers on the sam file would be difficult i guess, don't know, any suggestions to analogize the notations on both files?

ADD REPLY
1
Entering edit mode

an ugly hack : create a symbolic link to an alternate reference sequence: http://plindenbaum.blogspot.fr/2011/10/reference-genome-with-or-without-chr.html

ADD REPLY
0
Entering edit mode

Ugly hack for ugly trouble... fair enough!

I‘ll try that as soon as i reach my pc

Cheers!

ADD REPLY
1
Entering edit mode
6.4 years ago

It's highly likely that your BAM file does not match your fasta file. In particular, it looks like your bam file has a chromosome 6, but I'm guessing that your fasta file only has a chromosome chr19.

ADD COMMENT
0
Entering edit mode

hello, it is something like that ... but i can't sort out how to make the samtools work correctly, i have built my original .bam myself by running the "samtools view" command, then i downloaded a fasta file of the chr6 of the hg19 assembly, which i have in my working folder.

When i run my script, i give "samtools mpileup" the reference genome (which is the one i described lines ago), and it still gives the error.

for sake of clarity, in the script i described in the original post, i wrote chr19, when i should've written hg19.

To me, it seems like when creating the bcf, the .bam file is lacking some info or something, maybe i have give a certain flag when "cutting" the desired genomic region when using "samtools view Region:initial-final" ?

ADD REPLY

Login before adding your answer.

Traffic: 3176 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6