Question: How to make interval list for GBS sequencing?
0
gravatar for rimgubaev
6 months ago by
rimgubaev140
Russia/Moscow/Skoltech
rimgubaev140 wrote:

Hello everyone! I am pretty new to GATK. And now I face the following problem: I got g.vcf files for my samples produced by HaplotypeCaller and then I want to consolidate GVCFs using GenomicsDBImport which requires interval list and then proceed to GenotypeGVCFs in order to make joint VCF. How can I make the interval list? My object is rapeseed (Brassica oleracea). Many thanks in advance.

gbs gatk4 • 313 views
ADD COMMENTlink modified 6 months ago • written 6 months ago by rimgubaev140
1

If I understand you correctly, you are analyzing your Genotyping-by-sequencing data with GATK? I think it might be prudent to use a reduced-representation analysis method such as STACKS (http://catchenlab.life.illinois.edu/stacks/) or ipyrad(https://ipyrad.readthedocs.io/) for this purpose.

ADD REPLYlink written 6 months ago by jean.elbers1.3k

Thanks, I will try! And why GATK could be inappropriate for genotyping-by-sequencing technique on your opinion?

ADD REPLYlink written 6 months ago by rimgubaev140

I don't think it would be inappropriate for variant-calling per se, but I wonder about the down-stream analysis leading up to GATK variant calling. Others with more knowledge could pipe in if you make a separate thread or quickly Google pros and cons of reduced-representation analysis pipelines versus say the good-ol' GATK best practices. Presumably you have a reference genome that you like to get this far, so I would recommend the reference-based GBS analysis pipeline in STACKS.

ADD REPLYlink written 6 months ago by jean.elbers1.3k
1
gravatar for rimgubaev
6 months ago by
rimgubaev140
Russia/Moscow/Skoltech
rimgubaev140 wrote:

I tried to solve the problem in 2 ways:

1) Using CombineGVCFs which produced a combined vcf file:

samples=$(find . | sed 's/.\///' | grep -E 'g.vcf$' | sed 's/^/--variant /') # place sample paths into variable
path/to/gatk --java-options "-Xmx36G" CombineGVCFs \
       $(echo $samples) \
       -O path/to/combined.vcf \
       -R path/to/ref.fa

2) Using GenomicsDBImport (after making bed file for my reference) which produced a my_database folder:

samtools faidx ref.fna # make .fai file 
awk '{print $1 "\t0\t" $2}' ref.fna.fai > ref.fna.bed # make .bed file

samples=$(find . | sed 's/.\///' | grep -E 'g.vcf$' | sed 's/^/--variant /') # place sample paths into variable
path/to/gatk --java-options "-Xmx36G" GenomicsDBImport \
          $(echo $samples)\
          --genomicsdb-workspace-path my_database \
          --intervals path/to/ref.fna.bed

After that, I run GenotypeGVCFs using "my_database" or "combined.vcf"

 path/to/gatk --java-options "-Xmx36G" GenotypeGVCFs \
           -R path/to/ref.fna \
           -V gendb://path/to/my_database OR -V path/to/combined.vcf \
           -O path/to/genotypes.vcf

In both cases, I got the same number of SNPs in genotypes.vcf.

ADD COMMENTlink modified 6 months ago • written 6 months ago by rimgubaev140

Does this answer your question then?

ADD REPLYlink written 6 months ago by genomax72k

Yes, it is exactly what I have looking for.

ADD REPLYlink written 6 months ago by rimgubaev140
1
gravatar for Pierre Lindenbaum
6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

which requires interval list

for which option ? -L ? it is well described in https://software.broadinstitute.org/gatk/documentation/article?id=11009

ADD COMMENTlink written 6 months ago by Pierre Lindenbaum123k

Thanks, I've already read this article and still do not understand where I can download interval list for the rapeseed? Or is it appropriate to produce interval list by myself using reference fasta and bed tools and then take it as an input?

ADD REPLYlink written 6 months ago by rimgubaev140
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1600 users visited in the last hour