Variant Calling in fishes
3 months ago
Hayler Edu ▴ 30



I'm a student trying to do variant calling on fish sequences (Ictalurus punctatus). For the human genome I usually call variants with GATK, but first I have to recalibrate the base quality scores with GATK's BaseRecalibrator tool. For the human genome, BaseRecalibrator requires two databases of known variants (dbsnp_138.hg19.vcf, Mills_and_1000G_gold_standard.indels.hg19.sites.vcf).

For the Ictalurus punctatus sequences, which databases do I need for recalibration with BaseRecalibrator?

3 months ago

I suppose there aren't any.

The name Mills_and_1000G_gold_standard.indels.hg19.sites.vcf suggests that the data is based on the 1000 Genomes Project, and I presume that the channel catfish has not been sequenced as comprehensively. At least Ensembl has no variant database for that organism. Maybe there is a specialized database somewhere, but I fear that you will have to do the recalibration based on your own sequencing data.

You can do this e.g. with BBTools approximately like so:

Quality score recalibration (be sure to trim adapters first):
callvariants.sh in=mapped.sam out=initial.vcf ref=ref.fa ploidy=X
calctruequality.sh in=mapped.sam
bbduk.sh in=mapped.sam out=recal.sam recalibrate
callvariants.sh in=recal.sam out=final.vcf ref=ref.fa ploidy=X


After unpacking BBTools, you will find more info in the ./docs/guides folder. Make sure that you choose the correct ploidy, since some fishes are polyploid and this catfish species even seems to have a variable ploidy. Ideally, the ploidy should be determined for the exact specimen under investigation, if there are still tissue samples left in the freezer.

If in doubt, I would search the literature for variant calling in fishes. At least for Danio rerio and economically important fishes like salmon, there should be variant data out there. As far as I know, the Atlantic salmon also has a variable ploidy and is therefore probably a suitable template organism.



I'm writing to you because I have a problem using bbmap.sh :(

When I try to execute the script I get this error: /home/hayler99/miniconda3/envs/bbtools/bin/bbmap.sh: line 347: java: command not found

I tried setting the PATH in my .bashrc, but it didn't work, so I don't know what to do.

This is my script: bbmap.sh ref=GCF_001660625.2_IpCoco_1.2_genomic.fna -I pimm1_Pg594_il.fastq.gz -O pimm1_Pg594_il.sam

I hope you can help me :)



Thanks and a happy new year to you, too.

You need to check that running "which java" returns the path to the executable. I suppose you have a Java distribution installed, so it is indeed a problem with $PATH. Are you using Bash? If you are on macOS, the default shell is the Z shell (zsh), and you would need to put the modified $PATH in ~/.zshrc or ~/.zprofile instead of ~/.bashrc.
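
A minimal sketch of that check, in case it helps. The helper name check_on_path is made up for illustration, and the commented-out conda line assumes that you want Java inside the same conda environment as BBTools and that the openjdk package from conda-forge suits your setup:

```shell
# Hypothetical helper: report whether a command is reachable through $PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

# Show which login shell is configured, so you know which startup file applies
# (~/.bashrc for Bash, ~/.zshrc or ~/.zprofile for zsh).
echo "Login shell: ${SHELL:-unknown}"

# bbmap.sh fails with "java: command not found" when this check fails.
check_on_path java || echo "Install a Java runtime or add its bin directory to PATH"

# Assumption: one way to get Java into the active conda environment.
# conda install -c conda-forge openjdk
```

If java is reported as missing even though it is installed, open a new terminal (or source the startup file) after editing $PATH, since changes to the startup file do not affect shells that are already running.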