Issues generating the annotated .eff.vcf file: "Sanity check: This should never happen!"
2.5 years ago
cae453 • 0

Hello,

I am having a hard time deciphering the error I get when annotating a VCF file with snpEff. I used the command:

java -Xmx8g -jar snpEff.jar Tair10.1 /globalhome/cae453/HPC/sample6.vcf > /globalhome/cae453/HPC/sample6.eff.vcf

The output is:

java.lang.RuntimeException: Sanity check: This should never happen!
        at org.snpeff.interval.Gene.circularClone(Gene.java:195)
        at org.snpeff.interval.Genes.createCircularGenes(Genes.java:53)
        at org.snpeff.snpEffect.SnpEffectPredictor.buildForest(SnpEffectPredictor.java:146)
        at org.snpeff.SnpEff.loadDb(SnpEff.java:617)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:940)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:923)
        at org.snpeff.SnpEff.run(SnpEff.java:1188)
        at org.snpeff.SnpEff.main(SnpEff.java:168)

In the end, the sample6.eff.vcf file is empty.

Before running the annotation, I built the genome database (Tair10.1) with the following command:

java -jar snpEff.jar build -gtf22 -v Tair10.1

The database apparently builds without errors:

Remove empty chromosomes:

        Marking as 'coding' from CDS information:
        Done: 48147 transcripts marked
#-----------------------------------------------
# Genome name                : 'Arabidopsis_thaliana'
# Genome version             : 'Tair10.1'
# Genome ID                  : 'Tair10.1[0]'
# Has protein coding info    : true
# Has Tr. Support Level info : true
# Genes                      : 38311
# Protein coding genes       : 27444
#-----------------------------------------------
# Transcripts                : 59994
# Avg. transcripts per gene  : 1.57
# TSL transcripts            : 0
#-----------------------------------------------
# Checked transcripts        :
#               AA sequences :      0 ( 0.00% )
#              DNA sequences :      0 ( 0.00% )
#-----------------------------------------------
# Protein coding transcripts : 48147
#              Length errors :     35 ( 0.07% )
#  STOP codons in CDS errors :     33 ( 0.07% )
#         START codon errors :     63 ( 0.13% )
#        STOP codon warnings :     38 ( 0.08% )
#              UTR sequences :  44634 ( 74.40% )
#               Total Errors :    100 ( 0.21% )
#-----------------------------------------------
# Cds                        : 286237
# Exons                      : 324728
# Exons with sequence        : 324728
# Exons without sequence     : 0
# Avg. exons per transcript  : 5.41
# WARNING                    : No mitochondrion chromosome found
#-----------------------------------------------
# Number of chromosomes      : 7
# Chromosomes                : Format 'chromo_name size codon_table'
#               'NC_003070.9'   30427671        Standard
#               'NC_003076.8'   26975502        Standard
#               'NC_003074.8'   23459830        Standard
#               'NC_003071.7'   19698289        Standard
#               'NC_003075.7'   18585056        Standard
#               'NC_037304.1'   367808  Standard
#               'NC_000932.1'   154478  Standard
#-----------------------------------------------
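
Since the stack trace passes through Genes.createCircularGenes and Gene.circularClone, one thing that may be worth ruling out is annotation records whose coordinates do not fit the chromosome sizes listed in the build summary. Whether such records are what triggers the exception is an assumption, not something established in this post. The sketch below is a hypothetical helper (the names CHROM_SIZES and out_of_bounds_features are made up for illustration) that scans a 9-column GTF and reports features that start before position 1, end past the chromosome length, have a start greater than their end, or sit on a chromosome the database does not list.

```python
# Hypothetical diagnostic, not part of snpEff.
# Chromosome sizes are copied from the build summary in this post.
CHROM_SIZES = {
    "NC_003070.9": 30427671,
    "NC_003076.8": 26975502,
    "NC_003074.8": 23459830,
    "NC_003071.7": 19698289,
    "NC_003075.7": 18585056,
    "NC_037304.1": 367808,
    "NC_000932.1": 154478,
}


def out_of_bounds_features(gtf_lines, chrom_sizes):
    """Yield (chrom, feature, start, end, reason) for suspicious GTF records."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed 9-column GTF record
        chrom, feature = fields[0], fields[2]
        start, end = int(fields[3]), int(fields[4])  # GTF is 1-based, inclusive
        if chrom not in chrom_sizes:
            reason = "unknown chromosome"
        elif start < 1:
            reason = "start before position 1"
        elif end > chrom_sizes[chrom]:
            reason = "end beyond chromosome length"
        elif start > end:
            reason = "start greater than end"
        else:
            continue  # record fits inside its chromosome
        yield chrom, feature, start, end, reason
```

To use it, open the GTF that was given to the build step and iterate over out_of_bounds_features(handle, CHROM_SIZES). An empty result would suggest the coordinates are not the issue. Any hits on the two organelle chromosomes (NC_037304.1 and NC_000932.1) would be the first records to inspect.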

I also tried building the database from a GFF file, but that produced more warning messages and more errors in the built database, all associated with start and stop codons. When I use that database to annotate the VCF file, the output is not empty, but it contains many WARNING_TRANSCRIPT_NO_START_CODON entries.
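
To put a number on "many", the warnings in the annotated VCF can be tallied. The sketch below is a hypothetical helper (count_ann_warnings is a made-up name). It assumes the output uses the ANN INFO field with the usual 16 pipe-separated subfields, where the last subfield holds errors, warnings and info messages joined by "&". If the file was written in the older EFF format, this will not apply as written.

```python
from collections import Counter


def count_ann_warnings(vcf_lines):
    """Count messages in the last ANN subfield (errors / warnings / info)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip VCF header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # no INFO column
        for entry in fields[7].split(";"):
            if not entry.startswith("ANN="):
                continue
            # One comma-separated annotation per allele/transcript pair
            for annotation in entry[len("ANN="):].split(","):
                subfields = annotation.split("|")
                if len(subfields) < 16:
                    continue  # assumption: 16-subfield ANN layout
                for message in subfields[15].strip().split("&"):
                    if message:
                        counts[message] += 1
    return counts
```

Running this over the lines of the annotated file and printing counts.most_common() gives the number of annotations carrying each message, which can then be compared with the START codon error count in the build summary.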

At this point I don't know how to fix the error that leaves the annotated VCF file empty. I would really appreciate any help you can provide.

Thank you!

Carlos Erazo
