SnpEff START/STOP codon not found error
3.8 years ago
thomasbersez ▴ 50

Hi!

I'm using SnpEff to annotate SNPs called by VarScan. Since I'm working on cassava, I built a database locally from the cassava genome and its annotation in GFF format. When I inspect the database, SnpEff reports:

java -jar snpEff.jar dump snpEff_db | less

...

#-----------------------------------------------
# Genome name                : 'snpEff_db'
# Genome version             : 'snpEff_db'
# Genome ID                  : 'snpEff_db[0]'
# Has protein coding info    : true
# Has Tr. Support Level info : true
# Genes                      : 80115
# Protein coding genes       : 72330
#-----------------------------------------------
# Transcripts                : 95959
# Avg. transcripts per gene  : 1.20
# TSL transcripts            : 0
#-----------------------------------------------
# Checked transcripts        : 
#               AA sequences :      0 ( 0.00% )
#              DNA sequences :      0 ( 0.00% )
#-----------------------------------------------
# Protein coding transcripts : 87317
#              Length errors :    545 ( 0.62% )
#  STOP codons in CDS errors :      0 ( 0.00% )
#         START codon errors :  87310 ( 99.99% )
#        STOP codon warnings :  86772 ( 99.38% )
#              UTR sequences :  40611 ( 42.32% )
#               Total Errors :  87310 ( 99.99% )
#-----------------------------------------------
# Cds                        : 275258
# Exons                      : 335821
# Exons with sequence        : 335821
# Exons without sequence     : 0
# Avg. exons per transcript  : 3.50
# WARNING                    : No mitochondrion chromosome found
#-----------------------------------------------

Almost all protein-coding genes are missing a START codon, a STOP codon, or both! I have also tried the GTF version of the annotation. Since some annotations are incomplete or corrupted, I manually inspected CDS positions in IGV and did not see any errors. Has anyone already faced this issue? How did you fix it?
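For anyone who wants to double-check CDS coordinates outside IGV, here is a minimal Python sketch of that kind of check. It assumes you already have the chromosome sequence as a string and the CDS segments of one transcript as 1-based inclusive (start, end) pairs, as in GFF. The helper names (revcomp, spliced_cds, check_codons) are made up for illustration, and this is not SnpEff's own check, only a rough approximation using the standard genetic code.

```python
# Hypothetical helpers: a rough stand-in for a START/STOP codon sanity check.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code only
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def spliced_cds(chrom_seq, segments, strand):
    """Join CDS segments given as 1-based inclusive (start, end) pairs.

    Segments are concatenated in genomic order; for a '-' strand feature
    the joined sequence is then reverse-complemented.
    """
    joined = "".join(chrom_seq[start - 1:end] for start, end in sorted(segments))
    return revcomp(joined) if strand == "-" else joined


def check_codons(cds):
    """Report whether the CDS starts with ATG, ends with a stop, and is a multiple of 3."""
    cds = cds.upper()
    return {
        "start_ok": cds[:3] == "ATG",
        "stop_ok": cds[-3:] in STOP_CODONS,
        "length_ok": len(cds) % 3 == 0,
    }


if __name__ == "__main__":
    # Toy chromosome with a two-segment CDS on the '+' strand.
    chrom = "CCATGGCNNNNTTAACC"
    cds = spliced_cds(chrom, [(3, 7), (12, 15)], "+")
    print(cds, check_codons(cds))
```

If nearly every transcript fails a check like this, the problem is more likely systematic (strand, coordinates, or sequence naming) than a handful of bad gene models.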

Thanks for your help,

SNP snpEff software error • 1.3k views

Which genome? Are the genome and the GFF from the same build?


I use GCF_001659605.1_Manihot_esculenta_v6_genomic.fna and GCF_001659605.1_Manihot_esculenta_v6_genomic.gff, both from the NCBI Genome database. Yes, both are from the same build.
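One quick way to confirm that the two files really agree is to compare the sequence IDs in column 1 of the GFF with the FASTA header names. The Python sketch below does that. The function names are hypothetical, and it assumes the FASTA ID is the first whitespace-delimited token after ">", which is a common convention but not necessarily how every tool reads headers.

```python
# Hypothetical helpers: compare GFF seqids (column 1) with FASTA header IDs.

def fasta_ids(lines):
    """Collect sequence IDs: the first whitespace-delimited token after '>'."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff_seqids(lines):
    """Collect the seqid column, skipping comments and any embedded ##FASTA section."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.split("\t", 1)[0])
    return ids


def seqids_missing_from_fasta(gff_lines, fasta_lines):
    """Return, sorted, the GFF seqids that have no matching FASTA record."""
    return sorted(gff_seqids(gff_lines) - fasta_ids(fasta_lines))


if __name__ == "__main__":
    # File names taken from the post above; adjust paths as needed.
    with open("GCF_001659605.1_Manihot_esculenta_v6_genomic.gff") as gff, \
         open("GCF_001659605.1_Manihot_esculenta_v6_genomic.fna") as fna:
        print(seqids_missing_from_fasta(gff, fna))
```

An empty list means every annotated sequence has a FASTA record under the same name. It does not prove the coordinates match, only that the naming is consistent.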

Thanks for the answer!


Thomas, any progress figuring this out? I get the same errors with a different species. IGV looks fine.


Sadly no. Since using SnpEff was not mandatory for my project, I moved to VEP... Not a solution at all, I know!


Can you try building the database from the GenBank file instead?
