I need help with running snpEff analysis for wheat IWGSCv1.0.
2.2 years ago

I downloaded the GFF3 file (https://urgi.versailles.inra.fr/download/iwgsc/IWGSC_RefSeq_Annotations/v1.0/) and the FASTA file (https://urgi.versailles.inra.fr/download/iwgsc/IWGSC_RefSeq_Assemblies/v1.0/). I am running the snpEff analysis in the Cygwin terminal. In the data folder, I made two directories, IWGSCv1.0 and genomes. I placed the .gff3 file in the IWGSCv1.0 folder and IWGSCv1.0.fa in the genomes folder, and ran the following command:

$ java -jar snpEff.jar build -gff3 -v IWGSCv1.0

I followed the step-by-step instructions to build a new genome. snpEff reads the GFF file but then reports that the FASTA file cannot be found. The output is as follows:

Total: 1669182 markers added.
00:00:27 Create exons from CDS (if needed):
00:00:27 Exons created for 0 transcripts.
00:00:27 Deleting redundant exons (if needed):
00:00:29 Total transcripts with deleted exons: 0
00:00:29 Collapsing zero length introns (if needed):
00:00:33 Total collapsed transcripts: 0
00:00:33 Reading sequences :
00:00:33 FASTA file: 'C:\Users\HP\snpEff_wheat\snpEff_latest_core\snpEff/./data/genomes/IWGSCv1.fa' not found.
00:00:33 FASTA file: 'C:\Users\HP\snpEff_wheat\snpEff_latest_core\snpEff/./data/IWGSCv1/sequences.fa' not found.
java.lang.RuntimeException: Cannot find reference sequence.
at org.snpeff.snpEffect.factory.SnpEffPredictorFactory.readExonSequences(SnpEffPredictorFactory.java:702)
at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.readExonSequences(SnpEffPredictorFactoryGff.java:450)
at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.create(SnpEffPredictorFactoryGff.java:347)
at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:404)
at org.snpeff.SnpEff.run(SnpEff.java:1141)
at org.snpeff.SnpEff.main(SnpEff.java:160)
java.lang.RuntimeException: Error reading file 'C:\Users\HP\snpEff_wheat\snpEff_latest_core\snpEff/./data/IWGSCv1/genes.gff'
java.lang.RuntimeException: Cannot find reference sequence.
at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.create(SnpEffPredictorFactoryGff.java:359)
at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:404)
at org.snpeff.SnpEff.run(SnpEff.java:1141)
at org.snpeff.SnpEff.main(SnpEff.java:160)
00:00:33 Logging
00:00:34 Checking for updates...
00:00:36 Done.
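The log above names three files under the data folder: genes.gff, genomes/IWGSCv1.fa, and IWGSCv1/sequences.fa. A minimal sketch for checking which of them actually exist is shown below. The `check_file` helper and the `GENOME` and `DATA_DIR` variables are hypothetical names introduced for illustration, and the defaults assume the script is run from the snpEff directory with the genome name that appears in the log.

```shell
#!/bin/sh
# Hypothetical helper, not part of snpEff: report whether a file exists.
check_file() {
    if [ -f "$1" ]; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

# Assumptions: genome name and data directory as they appear in the log.
GENOME="${GENOME:-IWGSCv1}"
DATA_DIR="${DATA_DIR:-./data}"

# The annotation file the log reports reading.
check_file "$DATA_DIR/$GENOME/genes.gff"
# The two FASTA locations the log reports as not found.
check_file "$DATA_DIR/genomes/$GENOME.fa"
check_file "$DATA_DIR/$GENOME/sequences.fa"
```

If both FASTA lines print "missing", the file name or folder name on disk probably differs from the genome name snpEff is using (for example IWGSCv1.0.fa versus IWGSCv1.fa).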

Any help with finding the error will be greatly appreciated. Thank you.

snpEFF IWGSCv1.0 wheat • 666 views
2.2 years ago
Frogs • 0

Well, you are specifying IWGSCv1.0, but the error says that /data/genomes/IWGSCv1.fa cannot be found. It looks like snpEff isn't seeing the .0.

I'd try renaming your IWGSCv1.0.fa file to sequences.fa and placing it in a new folder called IWGSCv1, which is where the error message says it is looking:

/data/IWGSCv1/sequences.fa

I'd do the same for the GFF too: the log shows snpEff reading it as /data/IWGSCv1/genes.gff.
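The renaming suggested above could be sketched roughly as follows. This is only an illustration: `stage_genome` is a hypothetical helper (not a snpEff command), the source paths in the example call are placeholders, and the target names sequences.fa and genes.gff are taken from the paths in the error log.

```shell
#!/bin/sh
# Hypothetical helper: copy a FASTA and a GFF into the layout named in the log.
# Usage: stage_genome FASTA GFF DATA_DIR GENOME
stage_genome() {
    fasta="$1"; gff="$2"; data_dir="$3"; genome="$4"
    # Create data/<genome>/ if it does not exist yet.
    mkdir -p "$data_dir/$genome" || return 1
    # cp keeps the originals; for a very large FASTA, mv avoids a second copy.
    cp "$fasta" "$data_dir/$genome/sequences.fa" || return 1
    cp "$gff" "$data_dir/$genome/genes.gff" || return 1
}

# Example call (placeholder paths), followed by the build command from the question
# with the genome name matching the new folder:
# stage_genome /path/to/IWGSCv1.0.fa /path/to/annotation.gff3 ./data IWGSCv1
# java -jar snpEff.jar build -gff3 -v IWGSCv1
```

The genome name passed to the build command has to match the folder name exactly, which is why the sketch uses IWGSCv1 without the ".0" in both places.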


Thank you. It still returns the same error. It is unable to read the FASTA file or even concatenate it with the GFF file.
