Question: Difference in output VCF between running on a local PC and supercomputer cluster
2.3 years ago, zhou_1228 wrote:

I ran a SNP-calling pipeline on both a local PC and a supercomputer cluster. However, the VCF file generated on the supercomputer lacks the positive mutation that we can identify in the VCF file produced on the local PC. I wonder whether we need to adjust some parameters or add some options to our commands when running on the supercomputer so that the mutation is reported. I would truly appreciate your answers. Below are the shell commands I used:

1. Index the genome

bwa index -p GlycinMaxbwaidx -a bwtsw GlycinMax.fasta

2. Align the reads

bwa aln -t 4  GlycinMaxbwaidx P1-WD04-3L-F.fastq > P1-WD04-F.bwa
bwa aln -t 4  GlycinMaxbwaidx P1-WD04-3L-R.fastq > P1-WD04-R.bwa

3. Build the SAM file

bwa sampe GlycinMaxbwaidx P1-WD04-F.bwa P1-WD04-R.bwa P1-WD04-3L-F.fastq P1-WD04-3L-R.fastq > P1-WD04.sam

4. Convert SAM to BAM

samtools view -S -b P1-WD04.sam > P1-WD04.bam

5. Sort and index the BAM

samtools sort  P1-WD04.bam -o  P1-WD04.sorted.bam
samtools index P1-WD04.sorted.bam

6. Variant calling

freebayes -f GlycinMax.fasta -p 24 --use-best-n-alleles 4 --pooled-discrete P1-WD04.sorted.bam >P1-WD04.vcf

Are you using the same versions of the tools?

Reply written 2.3 years ago by Pierre Lindenbaum
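
One way to make the version comparison concrete is to record what each tool reports on both machines and diff the two files. This is only a sketch: the helper name `report_versions` and the output file names are made up for illustration. It assumes that `bwa` prints a `Version:` line in its usage text and that `samtools` and `freebayes` accept `--version`.

```shell
# report_versions is a hypothetical helper, not part of any of the tools.
# It prints one line per tool with the version it reports, or "not found".
report_versions() {
    for tool in bwa samtools freebayes; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not found"
            continue
        fi
        case "$tool" in
            # bwa has no --version flag; its usage text contains "Version: ..."
            # and it exits non-zero when run without arguments, hence "|| true".
            bwa) ver=$("$tool" 2>&1 | grep -i '^Version' | head -n 1) || true ;;
            *)   ver=$("$tool" --version 2>&1 | head -n 1) || true ;;
        esac
        echo "$tool: $ver"
    done
}

# Usage (run on each machine, then compare the two files):
#   report_versions > versions.$(hostname).txt
#   diff versions.localpc.txt versions.cluster.txt
```

On a cluster it is worth running this inside the batch job itself, since the tools on the compute nodes may not be the ones found on the login node.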

Hi Pierre, thanks for your reply. Yes, I checked the versions of all three tools (BWA, samtools, and freebayes) in my SNP-calling pipeline, and they are the same on the local PC and the supercomputer cluster. However, I noticed that the final VCF files differ in size: 19.9 MB on the supercomputer and 23.2 MB on the local PC.

Reply written 2.3 years ago by zhou_1228
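
File size alone says little about which calls differ. A rough way to list the variant sites present in one VCF but missing from the other is sketched below. The helper names `vcf_sites` and `sites_only_in_first` are illustrative, and the comparison assumes uncompressed VCF files and looks only at CHROM, POS, REF and ALT, ignoring genotypes and qualities.

```shell
# vcf_sites FILE: print CHROM, POS, REF, ALT for every record, sorted.
# (VCF columns are CHROM POS ID REF ALT ..., so fields 1, 2, 4 and 5.)
vcf_sites() {
    grep -v '^#' "$1" | cut -f1,2,4,5 | LC_ALL=C sort
}

# sites_only_in_first A.vcf B.vcf: print sites found in A but not in B.
sites_only_in_first() {
    sites_a=$(mktemp)
    sites_b=$(mktemp)
    vcf_sites "$1" > "$sites_a"
    vcf_sites "$2" > "$sites_b"
    # comm -23 keeps only the lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$sites_a" "$sites_b"
    rm -f "$sites_a" "$sites_b"
}

# Usage (file names are placeholders for the two copies of the output):
#   sites_only_in_first local/P1-WD04.vcf cluster/P1-WD04.vcf
```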

You can pipe bwa into samtools sort ...

Reply written 2.3 years ago by Pierre Lindenbaum
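
The piping suggestion could look roughly like the sketch below, which covers steps 3 to 5 of the question without writing the intermediate SAM and unsorted BAM files. It assumes a samtools release (1.3 or later) whose `sort` reads SAM from standard input and accepts `-o`. The wrapper name `sampe_to_sorted_bam` and its argument order are made up for illustration.

```shell
# sampe_to_sorted_bam INDEX_PREFIX F.sai R.sai F.fastq R.fastq OUT.bam
# Hypothetical wrapper: streams `bwa sampe` output straight into
# `samtools sort` (the trailing "-" means "read from stdin"), then
# indexes the sorted BAM if the pipeline succeeded.
sampe_to_sorted_bam() {
    idx=$1; sai_f=$2; sai_r=$3; fq_f=$4; fq_r=$5; out_bam=$6
    bwa sampe "$idx" "$sai_f" "$sai_r" "$fq_f" "$fq_r" \
        | samtools sort -o "$out_bam" - \
        && samtools index "$out_bam"
}

# Usage with the file names from the question:
#   sampe_to_sorted_bam GlycinMaxbwaidx P1-WD04-F.bwa P1-WD04-R.bwa \
#       P1-WD04-3L-F.fastq P1-WD04-3L-R.fastq P1-WD04.sorted.bam
```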

Are your reads shorter than 70 bp? If not, bwa mem is preferred over bwa aln.

Reply written 2.3 years ago by h.mon
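
For reads of about 70 bp or longer, steps 2 to 5 of the question could be collapsed into a single `bwa mem` call, as in the sketch below. `bwa mem` takes the index prefix followed by one or two FASTQ files and writes SAM to standard output. The wrapper name `mem_to_sorted_bam` and its argument order are illustrative only.

```shell
# mem_to_sorted_bam THREADS INDEX_PREFIX F.fastq R.fastq OUT.bam
# Hypothetical wrapper: aligns paired-end reads with `bwa mem`, sorts the
# SAM stream with `samtools sort`, and indexes the result on success.
mem_to_sorted_bam() {
    threads=$1; idx=$2; fq_f=$3; fq_r=$4; out_bam=$5
    bwa mem -t "$threads" "$idx" "$fq_f" "$fq_r" \
        | samtools sort -o "$out_bam" - \
        && samtools index "$out_bam"
}

# Usage with the file names from the question:
#   mem_to_sorted_bam 4 GlycinMaxbwaidx \
#       P1-WD04-3L-F.fastq P1-WD04-3L-R.fastq P1-WD04.sorted.bam
```

Switching aligners changes the alignments themselves, so the same change would need to be made on both machines before the two VCF files are compared again.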
2.3 years ago, swbarnes2 (United States) wrote:

For starters, is the VCF the first intermediate file where the supercomputer results differ from the local PC?

And as Pierre says:

Software version differences are the likely answer. Also, bwa samse | samtools sort works fine, because samtools sort now accepts SAM input.

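
To find the first step at which the two machines diverge, one option is to checksum each intermediate file on both sides and compare the reports. The sketch below hashes only the alignment records, because header lines such as `@PG` can legitimately differ between machines. The helper names are made up, and `md5sum` is assumed to be available (on some systems the equivalent command is `md5`).

```shell
# sam_records_md5 FILE.sam: checksum of the alignment records only.
# Header lines start with "@"; a read name cannot contain "@" per the SAM spec.
sam_records_md5() {
    grep -v '^@' "$1" | md5sum | cut -d' ' -f1
}

# bam_records_md5 FILE.bam: same idea for BAM. `samtools view` without -h
# prints the records as SAM text and omits the header.
bam_records_md5() {
    samtools view "$1" | md5sum | cut -d' ' -f1
}

# Usage: run on each machine, then diff the two reports.
#   {
#       echo "sam    $(sam_records_md5 P1-WD04.sam)"
#       echo "bam    $(bam_records_md5 P1-WD04.bam)"
#       echo "sorted $(bam_records_md5 P1-WD04.sorted.bam)"
#   } > checksums.$(hostname).txt
```

The binary `bwa aln` outputs (`P1-WD04-F.bwa`, `P1-WD04-R.bwa`) and the input FASTQ and reference files can be passed to `md5sum` directly. A mismatch at an early step points to the inputs or the aligner rather than to freebayes.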