snpEff generating lots of WARNING_REF_DOES_NOT_MATCH_GENOME
9.4 years ago
hawkcharles ▴ 10

So I'm trying to use snpEff to annotate the effects of variants, on an organism without much public data, and I'm getting a lot of warnings that the reference does not match the genome. I've tried both freeBayes and GATK to call variants and get these warnings from snpEff in either case, despite using the same genome reference.

In the case of freeBayes, I'm running it on BAM files we made from our own sequencing data, like so:

freebayes --fasta-reference organism_123.fa */*Realigned.bam > variants.fb.vcf

and then snpEff:

java -jar snpEff.jar -v organism.123 variants.fb.vcf > fb.eff.vcf

The database organism.123 was one I generated with snpEff from a .gff file since there isn't a db available publicly. The .gff is gzipped as data/organism.123/genes.gff.gz and a copy of organism_123.fa was gzipped as data/genomes/organism.123.fa.gz. I made the db with:

java -jar snpEff.jar build -gff3 -v organism.123

I get thousands of the ref-does-not-match-genome warnings in snpEff's output, along with an order of magnitude more no-start-codon warnings. The latter could mean my .gff is bad somehow, but that couldn't cause the former warnings, could it? FreeBayes never even saw that file. Anything obvious I'm doing wrong here?

It might be informative to see what snpEff thinks the reference is in these cases, but I don't see that in the annotations it produces.
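
One way to rule out the variant-calling side is to confirm that the REF column of the VCF agrees with the FASTA that was handed to the caller. Below is a minimal sketch in shell and awk; `check_ref` is a hypothetical helper name, and it assumes a genome small enough to hold in memory, sequence names taken as the FASTA header up to the first whitespace, and a case-insensitive comparison.

```shell
# check_ref FASTA VCF
# Hypothetical helper: prints every VCF record whose REF allele differs from
# the bases found at the same position in the FASTA.
# Assumption: the whole FASTA fits in memory (fine for small genomes only).
check_ref() {
    awk -F'\t' '
        NR == FNR {                                  # first file: the FASTA
            if (substr($0, 1, 1) == ">") {
                split(substr($0, 2), hdr, /[ \t]/)   # name = header up to first whitespace
                name = hdr[1]
            } else {
                seq[name] = seq[name] toupper($0)    # append this sequence line
            }
            next
        }
        /^#/ { next }                                # second file: skip VCF header lines
        {
            got = substr(seq[$1], $2, length($4))    # VCF POS is 1-based, like awk substr
            if (got != toupper($4))
                print $1 ":" $2 " REF=" $4 " FASTA=" got
        }
    ' "$1" "$2"
}

# Usage with the files from this post (not run here):
#   check_ref organism_123.fa variants.fb.vcf
```

If this prints nothing, the VCF and the caller's FASTA agree, which points at the copy of the genome that snpEff was built from instead.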

9.4 years ago
hawkcharles ▴ 10

Okay, I'm an idiot. I had reflexively used tar cvf to "gzip" the .gff and .fa files rather than gzip itself. Without the z flag, tar writes an uncompressed archive, so each file started with a TAR header instead of being a gzip stream, and this was, understandably, confusing snpEff.

And for seeing what snpEff thinks the reference is, I discovered the very useful dump database command:

java -jar snpEff.jar dump organism.123 | less
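
For anyone who wants to catch the same tar mix-up early, here is a minimal sketch that checks whether a file really is gzip by looking at its first two bytes (gzip streams start with 1f 8b). The helper name `is_gzip` and the throwaway file names are made up for illustration.

```shell
# Hypothetical helper: succeeds only if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "1f8b" ]
}

# Illustration with throwaway files in a temp directory.
tmp=$(mktemp -d)
printf '>chr1\nACGT\n' > "$tmp/genome.fa"

tar cf "$tmp/wrong.fa.gz" -C "$tmp" genome.fa    # a tar archive with a misleading .gz name
gzip -c "$tmp/genome.fa" > "$tmp/right.fa.gz"    # a real gzip stream

for f in "$tmp/wrong.fa.gz" "$tmp/right.fa.gz"; do
    if is_gzip "$f"; then
        echo "$(basename "$f"): gzip"
    else
        echo "$(basename "$f"): NOT gzip"
    fi
done
```

Running gzip -t on a file is another quick test, and gzip -dc data/genomes/organism.123.fa.gz | cmp - organism_123.fa should stay silent if the packaged genome is byte-identical to the original. Once the files are recreated with gzip -c, the build command from the question needs to be rerun.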

Yes, that dump command is very handy; I should have thought of it. It has helped me out on some occasions.

Thanks for letting us know about the solution.
