Variant calling for more than 60 samples using HaplotypeCaller
6.0 years ago

Hi, I want to do variant calling for more than 60 samples using HaplotypeCaller. Should I run it on multiple samples at once, using this command:

 java -Xmx16g -jar GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar -R Equus_caballus.EquCab2.dna.toplevel.fa -T HaplotypeCaller  -I sample-1.bam -I sample-2  -I sample-3  ......-I sample-60 -ERC GVCF  -o output.vcf.gz

I want to know: is this a true method?

next-gen SNP snp • 2.1k views
ADD COMMENT

What do you mean by "true method"? Also, please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.


ADD REPLY

Thank you for your attention. I mean: can I do variant calling for all 60 samples together, or should I do it separately for each sample? When I do it separately for each sample, I get only two genotypes per SNP (0/1 or 1/1) in the g.vcf files.

ADD REPLY

Hi siyavash_damdar,

Please follow up on all your previous questions and provide feedback.

If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.

Note that future questions might be closed if you do not provide feedback on older threads.

Cheers,
Wouter

ADD REPLY

Hi, thank you for the useful comment. Yes, of course, I did it. Best, Siavash

ADD REPLY
6.0 years ago
Ram 43k

You can call sample by sample and joint genotype - that way, you shouldn't lose cross-sample genotype information. Or, you could call all samples per region, where each region can be a chromosome or smaller so you reduce computational burden.

You'd just need different GATK tools to Combine/Cat Variants based on the division you choose.

You could even combine both if you'd like to. The choice is best made based on time and computational resources available to you.
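The first approach (call sample by sample, then joint genotype) can be sketched roughly as below. This is only a dry run that prints GATK 3.x commands rather than running them. The sample names, output file names, and the `hc_cmd`/`joint_cmd` helpers are hypothetical; the jar and reference paths are copied from the question, and `GenotypeGVCFs` is assumed to be the joint genotyping tool meant here.

```shell
#!/bin/sh
# Dry run: prints GATK 3.x commands instead of executing them.
GATK_JAR="GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar"
REF="Equus_caballus.EquCab2.dna.toplevel.fa"

# Step 1: one HaplotypeCaller run per sample, each emitting its own GVCF.
hc_cmd() {
  echo "java -Xmx16g -jar $GATK_JAR -T HaplotypeCaller -R $REF -I ${1}.bam -ERC GVCF -o ${1}.g.vcf.gz"
}

# Step 2: one GenotypeGVCFs run that genotypes all per-sample GVCFs together.
joint_cmd() {
  cmd="java -Xmx16g -jar $GATK_JAR -T GenotypeGVCFs -R $REF"
  for s in "$@"; do
    cmd="$cmd --variant ${s}.g.vcf.gz"
  done
  echo "$cmd -o joint.vcf.gz"
}

# Hypothetical sample names; extend the list to all 60 samples.
set -- sample-1 sample-2 sample-3
for s in "$@"; do
  hc_cmd "$s"
done
joint_cmd "$@"
```

Since each per-sample command is independent, the step 1 commands can be run in parallel or on different machines before the single joint genotyping step.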

ADD COMMENT

So you mean my command is not a true way to call variants for all samples?

java -Xmx16g -jar GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar -R Equus_caballus.EquCab2.dna.toplevel.fa -T HaplotypeCaller  -I sample-1.bam -I sample-2  -I sample-3  ......-I sample-60 -ERC GVCF  -o output.vcf.gz
ADD REPLY

If you're looking for someone to give you the "right" command, sorry, I'm not that person. I've outlined possible approaches, and it is up to you to choose one and implement it.

What you're doing above is using neither of those approaches - you're calling all samples across all regions in a single thread with 16G RAM. That's fine as long as you're ready to wait a long time and risk everything on one thread.
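For contrast, a per-region split of an all-samples run might look roughly like the dry run below, which only prints commands. The region names, BAM names, output names, and the `region_cmd`/`cat_cmd` helpers are hypothetical. The sketch leaves out `-ERC GVCF` because, as far as I know, GATK 3.x accepts GVCF mode only for single-sample input; that omission is my assumption, not something stated above. `CatVariants` is assumed to be the "Cat Variants" tool meant for stitching regions back together.

```shell
#!/bin/sh
# Dry run: prints GATK 3.x commands instead of executing them.
GATK_JAR="GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar"
REF="Equus_caballus.EquCab2.dna.toplevel.fa"

# Hypothetical inputs: three BAMs and three region (chromosome) names.
BAM_ARGS="-I sample-1.bam -I sample-2.bam -I sample-3.bam"
REGIONS="1 2 X"

# One multi-sample HaplotypeCaller run restricted to a single region via -L.
region_cmd() {
  echo "java -Xmx16g -jar $GATK_JAR -T HaplotypeCaller -R $REF $BAM_ARGS -L ${1} -o region_${1}.vcf"
}

# Concatenate the per-region VCFs, listed in reference order, with CatVariants.
cat_cmd() {
  cmd="java -cp $GATK_JAR org.broadinstitute.gatk.tools.CatVariants -R $REF"
  for r in "$@"; do
    cmd="$cmd -V region_${r}.vcf"
  done
  echo "$cmd -out all_regions.vcf -assumeSorted"
}

for r in $REGIONS; do
  region_cmd "$r"
done
cat_cmd $REGIONS
```

Each region command can run as its own job, so a failure costs one region rather than the whole run.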

ADD REPLY

What are you thinking if I increase my memory and threads (64 GB RAM and 16 threads)?

ADD REPLY

what are you thinking

I think it will be faster. GATK has multiple levels of parallelism, multithreading is just one of them.
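As a rough illustration of the multithreading level only: in GATK 3.x, HaplotypeCaller takes the engine option `-nct` (CPU threads per data thread), and the Java heap is set with `-Xmx`. The dry run below just prints such a command. The `hc_threaded_cmd` helper and the sample name are hypothetical, and how much speed-up 16 threads actually give is not something this sketch claims.

```shell
#!/bin/sh
# Dry run: prints a GATK 3.x command instead of executing it.
GATK_JAR="GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar"
REF="Equus_caballus.EquCab2.dna.toplevel.fa"

# Arguments: $1 = Java heap in GB, $2 = CPU threads (-nct), $3 = sample name.
hc_threaded_cmd() {
  echo "java -Xmx${1}g -jar $GATK_JAR -T HaplotypeCaller -R $REF -I ${3}.bam -ERC GVCF -nct ${2} -o ${3}.g.vcf.gz"
}

# 64 GB heap and 16 threads, as proposed in the comment above.
hc_threaded_cmd 64 16 sample-1
```

The other levels, running several samples or regions as separate jobs, sit on top of this and do not depend on `-nct`.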

ADD REPLY

So, according to your comments, I have three scenarios:
1. Call sample by sample, then joint genotype.
2. Call all samples per region.
3. Call all samples (all regions) using multithreading.
Are there any differences in the results (genotypes) between these scenarios?

ADD REPLY

We're getting into a rabbit hole now, where the principal question was something else and the discussion devolves into a one-on-one between OP and an answerer. This is not good and I cannot encourage it.

If you're really curious about this, search for posts that address this question. Better yet, Google it. The better you are at Google-ing stuff, the faster you can solve your problems.

ADD REPLY

Sorry if I bothered you. Thanks.

ADD REPLY

It's not a bother, it's not personal. It's just not something to be encouraged too much, as the discussion goes from being something useful to a lot of people to a niche conversation useful just to one person.

ADD REPLY
