Question: emtpy bcf file
0
gravatar for Illinu
14 months ago by
Illinu90
Belgium
Illinu90 wrote:

I am aligning genomic reads to a set of 6 genes all saved in one file in fasta format. The sam and bam files look ok but the mpileup step returns empty bcf files, with only the header until:

 #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT sample1-thal.nodup.sorted.bam

This is the whole code used:

for f in /reads/*1P.fastq.gz
 do
  name=${f%%*/}
  sname=${name%%.*}
  bname=$(basename $sname)
  bwa mem thal $sname.1P.fastq.gz $sname.2P.fastq.gz > $bname-thal.sam
  samtools view -bt thal.fasta.fai $bname-thal.sam > $bname-thal.bam
  samtools sort $bname-thal.bam > $bname-thal.sorted.bam
  samtools rmdup $bname-thal.sorted.bam $bname-thal.nodup.bam
  samtools sort $bname-thal.nodup.bam > $bname-thal.nodup.sorted.bam
  samtools mpileup -uf thal.fasta $bname-thal.nodup.sorted.bam > $bname-thal.bcf
  bcftools call -mv -Ov $bname-thal.bcf > $bname-thal.vcf
done

I used this code with a different set of genes and it worked, what can be going wrong?

this is what I see with samtools tview

https://postimg.org/image/gli9yz5q7/

mpileup samtools alignment • 343 views
ADD COMMENTlink modified 14 months ago • written 14 months ago by Illinu90

two comments: * use defensive bash programming to test if something failed : http://www.davidpashley.com/articles/writing-robust-shell-scripts/ * use linux pipeline to avoid those numerous intermediate files.

ADD REPLYlink written 14 months ago by Pierre Lindenbaum106k

Thanks @Pierre, I normally run the pipeline for some commands, some of those intermediate files I need for other tasks. But in the script above I am testing this way to see at what step it fails. So far up to the *.nodup.sorted.bam it works well, that's how I know that it is the bcf file that is not being processed. So the alignment is working, not the variant calling

ADD REPLYlink modified 14 months ago • written 14 months ago by Illinu90
0
gravatar for Illinu
14 months ago by
Illinu90
Belgium
Illinu90 wrote:

This was corrected at the mpileup step with -A option

  samtools mpileup -ug -A -f thal.fasta $bname-thal.nodup.sorted.bam > $bname-thal.bcf

-A, --count-orphans Do not skip anomalous read pairs in variant calling. Maybe this is because the sequences I used as reference are short and the pairs from the sequencing come from a large fragment thereofre living many mapping reads as orphans.

ADD COMMENTlink written 14 months ago by Illinu90
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1174 users visited in the last hour