Question: using GATK UnifiedGenotyper on a single chromosome
stefano.iantorno (United Kingdom) wrote 4.6 years ago:

Hello

I am trying to run GATK UnifiedGenotyper on a single chromosome from a 30+ chromosome genome.

I have tried giving GATK the BAM file (containing reads for all chromosomes plus some scaffolds) together with a FASTA file containing only the one chromosome I want variants for. However, I receive the following error:

ERROR MESSAGE: Badly formed genome loc: Contig Scaffold143 given as location, but this contig isn't present in the Fasta sequence dictionary

Scaffold143 is one of the unassembled scaffolds.

I assume the error arises because the BAM file contains reads for that scaffold. How do I solve this? I stumbled upon the -L argument for UnifiedGenotyper, but I could not find any documentation for it on the GATK website. Does this argument define the boundaries of a region in which to call variants? If so, it could be a way to work around the problem caused by giving GATK only the FASTA file of the chromosome I'm interested in.

java gatk genome • 2.5k views

If you want to call variants for just one chromosome, you can use the "-L" ("intervals of interest") parameter in GATK. You can give it an input file containing <chr>:<start>-<stop> for your chromosome of interest. That way GATK UnifiedGenotyper will call variants only for that chromosome.
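As a rough sketch of how such an interval file could be produced, the snippet below reads a samtools-style FASTA index (.fai) and writes a single <chr>:<start>-<stop> line spanning the whole contig. The helper names (whole_contig_interval, write_interval_file) and any file or contig names you pass in are placeholders of mine, not something from the original post. It also assumes the index was built from the full reference that the reads were mapped to.

```python
def whole_contig_interval(fai_path, contig):
    """Return '<contig>:1-<length>' using the lengths listed in a .fai index."""
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # .fai columns: name, length, offset, bases per line, bytes per line
            if fields and fields[0] == contig:
                return f"{contig}:1-{int(fields[1])}"
    raise KeyError(f"contig {contig!r} not found in {fai_path}")


def write_interval_file(fai_path, contig, out_path):
    """Write a one-line interval file covering the whole contig (hypothetical helper)."""
    with open(out_path, "w") as out:
        out.write(whole_contig_interval(fai_path, contig) + "\n")
    return out_path
```

GATK picks a parser for interval files based on the file extension, so it is worth checking the -L documentation for your GATK version to see which extension it expects for this <chr>:<start>-<stop> format.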

written 4.6 years ago by Ashutosh Pandey
Paweł Sztromwasser (University of Bergen, Norway) wrote 4.6 years ago:

You are right. If the BAM file contains reads mapped to a contig that is not present in your (filtered) reference, GATK will complain. As you also noticed, you can use the -L parameter of GATK. The documentation is here:

https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_engine_CommandLineGATK.html#--intervals

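Remember to provide as the reference the same file that was used for mapping (all chromosomes and scaffolds), so that every contig named in the BAM file is present in the reference's sequence dictionary. -L then restricts the calling to your chromosome.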
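To make that concrete, here is a minimal sketch that assembles the corresponding command line, assuming a GATK 3-style invocation (java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper). The jar path, the file names and the contig name "chr1" are placeholders. The helper unified_genotyper_command is my own illustration, not part of GATK.

```python
import shlex


def unified_genotyper_command(reference, bam, interval, out_vcf,
                              gatk_jar="GenomeAnalysisTK.jar"):
    """Build the argument list for a GATK 3-style UnifiedGenotyper run.

    reference must be the full FASTA used for mapping, not a single-chromosome
    subset; interval is what restricts calling to one chromosome.
    """
    return [
        "java", "-jar", gatk_jar,
        "-T", "UnifiedGenotyper",
        "-R", reference,   # full reference: all chromosomes and scaffolds
        "-I", bam,         # BAM with reads for every contig
        "-L", interval,    # e.g. a contig name, chr:start-stop, or an interval file
        "-o", out_vcf,
    ]


if __name__ == "__main__":
    # Placeholder file names; substitute your own.
    cmd = unified_genotyper_command("full_genome.fasta", "sample.bam",
                                    "chr1", "chr1.vcf")
    print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be passed to subprocess.run(cmd, check=True) directly, which avoids shell quoting problems if contig or file names contain unusual characters.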

 
