Question: bcftools mpileup no output (empty calls.bcf file)
0
gravatar for RobertUt
4 days ago by
RobertUt0
RobertUt0 wrote:

Hi everyone,

I am trying to make a consensus from a aligned bam file. This is my workflow:

bcftools mpileup -Ou -f reference_HG19.fa aligned.bam | bcftools call -mv -Ob -o calls.bcf

bcftools index calls.bcf

cat reference_HG19.fa | bcftools consensus calls.bcf > consensus.fa

However, the created consensus.fa file turned out to be identical with the reference_HG19.fa file. I looked at calls.bcf file with bcftools view, and it seems the following columns are all empty:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT aligned.bam

When I look at calls.bcf with samtools view I get the error "Aborted".

Does anyone have an idea how to solve this issues? Thank you very much!

Robert

ADD COMMENTlink modified 8 hours ago • written 4 days ago by RobertUt0

Hello and welcome to biostars RobertUt ,

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
code_formatting

Thank you.

Are you sure that you will have any variants in your aligned.bam?

When I look at calls.bcf with samtools view I get the error "Aborted".

AFAIK samtools view cannot read bcf files. bcftools view would be what you are looking for.

fin swimmer

ADD REPLYlink written 4 days ago by finswimmer7.9k

Dear fin,

Thank's for your suggestions.

I just checked the bam file with samtools tview and found plenty of variants.

By the way, the only warning/error I get is about ploidy and a chromosome, which is missing in my bam file but present in the reference fasta.

Any other suggestions?

ADD REPLYlink written 4 days ago by RobertUt0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

ADD REPLYlink written 4 days ago by genomax59k

How did you produce your bam file? Please show us the exact command used.

ADD REPLYlink written 4 days ago by finswimmer7.9k

The bam file wasn't generated in our on lab but was downloaded from:

https://www.ebi.ac.uk/ena/data/view/PRJEB3371

It's a whole genome sequencing file, generated with Illumina HiSeq 2000. I am also suspecting the bam file to be the problem, because the fasta worked fine in previous commands. Chromosome notation is consistent between bam and fasta.

ADD REPLYlink written 4 days ago by RobertUt0

To which genome was the BAM aligned (check header)? Which genome ref are you using?

Just to humour me, could you try:

bcftools mpileup -Oz -f reference_HG19.fa aligned.bam | bcftools call -mv -Ob -o calls.bcf

...and:

bcftools mpileup -f reference_HG19.fa aligned.bam | head
ADD REPLYlink written 4 days ago by Kevin Blighe33k

Hello,

you have to make sure that the bam files are coordinate sorted before doing variant calling. At least the first sample isn't.

fin swimmer

ADD REPLYlink written 3 days ago by finswimmer7.9k

Hi fin,

the bam file I'm using is coordinate sorted.

Robert

ADD REPLYlink written 1 day ago by RobertUt0
0
gravatar for RobertUt
8 hours ago by
RobertUt0
RobertUt0 wrote:

We finally found the solution: The downloaded bam was generated by paired end sequencing. For some unknown reason bcftools considers these read pairs to be "anomalous". By using the -A input option one prevents bcftools from skipping anomalous read pairs. Now it's working as it is supposed to.

ADD COMMENTlink written 8 hours ago by RobertUt0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1492 users visited in the last hour