Question: Interval list for GATK tool HaplotypeCaller
4 months ago by
jeremy0 wrote:

Hello, I am currently using GATK's HaplotypeCaller to do variant discovery on some RNA-seq data. This is a very long-running process, so I have been looking at ways to speed it up. I have read that you can pass an interval list to HaplotypeCaller to improve performance, and that a VCF can be used as the interval list. I am wondering whether it is appropriate to use a reference VCF such as Ensembl's GRCH38.vcf file, as this would give intervals for genes, and my variant discovery will only be looking within genes since that is the nature of RNA-seq data.

Does this make sense? I can't find much in the docs about what kind of interval list to use for RNA-seq data; they mostly cover whole-genome or targeted exome data. If that VCF is not appropriate, which interval list should I use, or how do I create one, so that HaplotypeCaller runs faster on this RNA-seq data?

snp rna-seq • 269 views
modified 4 months ago • written 4 months ago by jeremy0

Yes, you can specify an interval list: just use the -L option and point it to a BED file containing your intervals!

written 4 months ago by brunobsouzaa350
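
For reference, here is a minimal sketch of what the -L suggestion above amounts to, written in Python only to assemble and print the command line. The file names (ref.fasta, sample.bam, genes.bed, sample.vcf.gz) and the helper name build_haplotypecaller_cmd are hypothetical; -R, -I, -L and -O are the usual GATK4 arguments for reference, input, intervals and output. The command is printed rather than executed, so nothing here depends on GATK being installed.

```python
import shlex


def build_haplotypecaller_cmd(reference, bam, intervals, output):
    """Return the argument list for a HaplotypeCaller run limited to `intervals`.

    All four paths are placeholders supplied by the caller (hypothetical names).
    """
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference,   # reference FASTA
        "-I", bam,         # aligned RNA-seq reads
        "-L", intervals,   # restrict calling to these regions (e.g. a BED file)
        "-O", output,      # output VCF
    ]


cmd = build_haplotypecaller_cmd("ref.fasta", "sample.bam", "genes.bed", "sample.vcf.gz")
# Print a shell-safe version of the command instead of running it.
print(shlex.join(cmd))
```

Contig names in the BED file have to match those in the reference (for example chr1 versus 1), otherwise GATK will reject the intervals.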
4 months ago by
abedkurdi1030 wrote:

As far as I know, intervals for HaplotypeCaller are given as BED files (genomic positions). You use them to specify the regions in which to call variants.

written 4 months ago by abedkurdi1030

However, my question is specific to RNA-seq: are the intervals just going to be every gene, for example a BED file covering the entire transcriptome?

written 4 months ago by jeremy0

Yes, for every gene, or only for your genes of interest.

written 4 months ago by abedkurdi1030
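
One way to get such a per-gene BED file is to pull the gene records out of a GTF annotation. The sketch below is an assumption about how this could be done, not something stated in the thread: the function name gtf_to_bed and the example records are made up, and it only handles the standard nine tab-separated GTF columns. GTF coordinates are 1-based and inclusive while BED is 0-based and half-open, so the start is shifted down by one and the end is kept.

```python
def gtf_to_bed(gtf_lines, feature="gene"):
    """Yield BED6 tuples (chrom, start, end, name, score, strand) for one feature type."""
    for line in gtf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        # Use the gene_id attribute as the BED name when present.
        name = "."
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                name = attr.split(" ", 1)[1].strip('"')
                break
        # GTF is 1-based inclusive; BED is 0-based half-open.
        yield (chrom, start - 1, end, name, "0", strand)


# Hypothetical records, for illustration only.
EXAMPLE_GTF = [
    "#!genome-build example\n",
    'chr1\tdemo\tgene\t1000\t2000\t.\t+\t.\tgene_id "GENE_A"; gene_name "A";\n',
    'chr1\tdemo\texon\t1000\t1200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "T1";\n',
]

bed_records = list(gtf_to_bed(EXAMPLE_GTF))
for record in bed_records:
    print("\t".join(str(value) for value in record))
```

Passing feature="exon" instead would restrict the intervals to exons, which may be closer to where RNA-seq reads actually fall; which choice is better depends on the annotation and the data.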