Question: Warning errors in snpEff annotation results
3 months ago, maricom wrote:

Hi,

I annotated variants in a bacterial genome using snpEff, but when I looked at snpEff_summary.html, there were 944 warnings highlighted in yellow. The variants were called using GATK HaplotypeCaller.

When I looked at the warnings, almost all of them were WARNING_TRANSCRIPT_NO_START_CODON. However, when I checked both the reference genome and the CDS sequences, the start codons were present.
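For anyone who wants to repeat this check, here is a minimal Python sketch of testing whether a CDS interval begins with a start codon in the reference sequence. It assumes the genome is already loaded as a plain string and that coordinates are 1-based and inclusive, as in GTF. The helper names and the start-codon set (taken from NCBI translation table 11) are assumptions for illustration; the start codons snpEff accepts are defined by the codon table in snpEff.config and may differ, and this does not reproduce snpEff's own check.

```python
# Assumption: start codons of NCBI translation table 11 (bacterial).
# snpEff reads its own list from the codon table in snpEff.config.
START_CODONS = {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def first_codon(genome, start, end, strand):
    """Return the first codon of a CDS given 1-based inclusive coordinates."""
    cds = genome[start - 1:end]
    if strand == "-":
        # On the minus strand the CDS starts at the 'end' coordinate.
        cds = reverse_complement(cds)
    return cds[:3].upper()


def has_start_codon(genome, start, end, strand):
    """True if the CDS begins with one of the assumed start codons."""
    return first_codon(genome, start, end, strand) in START_CODONS
```

Running `has_start_codon` over every CDS line of the GTF file, and comparing the failures with the transcripts named in the warnings, would show whether the coordinates given to snpEff point at the codons that were checked by hand.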

I have no idea why they were tagged as warnings.

If anyone has any idea, that would help me a lot.

Thank you.

I annotated using this command:

 java -jar snpEff.jar -c snpEff.config -i vcf -o vcf bacteria1 SNPs_counted_using_HaplotypeCaller.vcf 1> res.vcf

One of the results I got:

bacteria1   99501   .   C   A   697.6   PASS    AC=1;AF=0.500;AN=2;BaseQRankSum=2.555;DP=187;ExcessHet=3.0103;FS=0.784;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=3.73;ReadPosRankSum=0.989;SOR=0.762;ANN=A|upstream_gene_variant|MODIFIER|D9_0073|GENE_D9_0073|transcript|TRANSCRIPT_D9_0073|protein_coding||c.-4774G>T|||||4774|,A|upstream_gene_variant|MODIFIER|D9_0074|GENE_D9_0074|transcript|TRANSCRIPT_D9_0074|protein_coding||c.-4536G>T|||||4536|,A|upstream_gene_variant|MODIFIER|D9_0078|GENE_D9_0078|transcript|TRANSCRIPT_D9_0078|protein_coding||c.-768G>T|||||768|,A|upstream_gene_variant|MODIFIER|D9_0079|null|transcript|D9_0079|protein_coding||c.-698C>A|||||609|WARNING_TRANSCRIPT_NO_START_CODON,A|upstream_gene_variant|MODIFIER|D9_0080|GENE_D9_0080|transcript|TRANSCRIPT_D9_0080|protein_coding||c.-2444C>A|||||2444|,A|upstream_gene_variant|MODIFIER|D9_0081|GENE_D9_0081|transcript|TRANSCRIPT_D9_0081|protein_coding||c.-3194C>A|||||3194|,A|upstream_gene_variant|MODIFIER|D9_0082|GENE_D9_0082|transcript|TRANSCRIPT_D9_0082|protein_coding||c.-4759C>A|||||4759|,A|downstream_gene_variant|MODIFIER|D9_0075|GENE_D9_0075|transcript|TRANSCRIPT_D9_0075|protein_coding||c.*3987C>A|||||3987|,A|downstream_gene_variant|MODIFIER|D9_0076|GENE_D9_0076|transcript|TRANSCRIPT_D9_0076|protein_coding||c.*3083C>A|||||3083|,A|downstream_gene_variant|MODIFIER|D9_0077|GENE_D9_0077|transcript|TRANSCRIPT_D9_0077|protein_coding||c.*2167C>A|||||2167|,A|intergenic_region|MODIFIER|D9_0078-D9_0079|GENE_D9_0078-null|intergenic_region|GENE_D9_0078-null|||n.99501C>A||||||  GT:AD:DP:GQ:PL  0/1:157,30:187:99:705,0,5786
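To see which transcripts carry a warning in a record like the one above, the ANN value can be split into its pipe-separated subfields. The sketch below assumes the standard 16-field ANN layout, with the errors/warnings/info message in the last field; the short field labels and function names are my own and are not part of snpEff.

```python
# Labels for the 16 pipe-separated ANN subfields (names chosen for this sketch).
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "biotype", "rank", "hgvs_c",
    "hgvs_p", "cdna_pos", "cds_pos", "aa_pos", "distance", "messages",
]


def parse_ann(info):
    """Return one dict per ANN entry found in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            return [dict(zip(ANN_FIELDS, e.split("|"))) for e in entries]
    return []


def warned_transcripts(info):
    """Return (feature_id, message) pairs for entries with a non-empty message."""
    return [
        (ann["feature_id"], ann["messages"])
        for ann in parse_ann(info)
        if ann.get("messages")
    ]
```

Applied to the INFO column of the record above, `warned_transcripts` should report only D9_0079, which is also the one entry whose gene ID is `null` and whose transcript ID lacks the `TRANSCRIPT_` prefix.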

I created my GTF file like this:

seqname     source   feature start end   score strand frame attribute
bacteria1   bacteria1   CDS 101     1507    .   +   0   gene id "D9_0001";
bacteria1   bacteria1   CDS 1569    2666    .   +   0   gene id "D9_0002";
bacteria1   bacteria1   CDS 2663    4378    .   +   0   gene id "D9_0003";

I created my own database and added this to snpEff.config:

bacteria1.genome :bacteria1
bacteria1.chromosomes : bacteria1
bacteria1.bacteria1.codonTable : Bacterial_and_Plant_Plastid

Perhaps also contact the author, Pablo Cingolani (pcingola@users.sourceforge.net), and cross-reference this thread. Be sure to read "Asking for help" first so that you provide the necessary information.

written 3 months ago by SMK

Hi SMK, Thank you for your advice! I've sent the question to him, too.

written 3 months ago by maricom