Question: emtpy bcf file
0
gravatar for Illinu
3 months ago by
Illinu80
Belgium
Illinu80 wrote:

I am aligning genomic reads to a set of 6 genes all saved in one file in fasta format. The sam and bam files look ok but the mpileup step returns empty bcf files, with only the header until:

 #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT sample1-thal.nodup.sorted.bam

This is the whole code used:

for f in /reads/*1P.fastq.gz
 do
  name=${f%%*/}
  sname=${name%%.*}
  bname=$(basename $sname)
  bwa mem thal $sname.1P.fastq.gz $sname.2P.fastq.gz > $bname-thal.sam
  samtools view -bt thal.fasta.fai $bname-thal.sam > $bname-thal.bam
  samtools sort $bname-thal.bam > $bname-thal.sorted.bam
  samtools rmdup $bname-thal.sorted.bam $bname-thal.nodup.bam
  samtools sort $bname-thal.nodup.bam > $bname-thal.nodup.sorted.bam
  samtools mpileup -uf thal.fasta $bname-thal.nodup.sorted.bam > $bname-thal.bcf
  bcftools call -mv -Ov $bname-thal.bcf > $bname-thal.vcf
done

I used this code with a different set of genes and it worked, what can be going wrong?

this is what I see with samtools tview

https://postimg.org/image/gli9yz5q7/

mpileup samtools alignment • 132 views
ADD COMMENTlink modified 3 months ago • written 3 months ago by Illinu80

two comments: * use defensive bash programming to test if something failed : http://www.davidpashley.com/articles/writing-robust-shell-scripts/ * use linux pipeline to avoid those numerous intermediate files.

ADD REPLYlink written 3 months ago by Pierre Lindenbaum94k

Thanks @Pierre, I normally run the pipeline for some commands, some of those intermediate files I need for other tasks. But in the script above I am testing this way to see at what step it fails. So far up to the *.nodup.sorted.bam it works well, that's how I know that it is the bcf file that is not being processed. So the alignment is working, not the variant calling

ADD REPLYlink modified 3 months ago • written 3 months ago by Illinu80
0
gravatar for Illinu
3 months ago by
Illinu80
Belgium
Illinu80 wrote:

This was corrected at the mpileup step with -A option

  samtools mpileup -ug -A -f thal.fasta $bname-thal.nodup.sorted.bam > $bname-thal.bcf

-A, --count-orphans Do not skip anomalous read pairs in variant calling. Maybe this is because the sequences I used as reference are short and the pairs from the sequencing come from a large fragment thereofre living many mapping reads as orphans.

ADD COMMENTlink written 3 months ago by Illinu80
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 431 users visited in the last hour