snpEff problems adding regulatory regions to database
8 weeks ago
mcsimenc ▴ 20

Hi all,

I want to add the locations of transcription factor binding sites to a pre-built genome database (Arabidopsis_thaliana). I am following the instructions on the snpEff site (http://pcingola.github.io/SnpEff/se_build_reg/), but the build fails with an error about a missing .bin file. The regulation .bin files seem to be available only for some pre-built genomes (http://pcingola.github.io/SnpEff/se_additionalann/). Is there a way for me to create these files myself?

I placed a BED file in this folder:

snpeff-5.1-2/data/Arabidopsis_thaliana/regulation.bed/regulation.plant.dapseq_peaks.bed

and ran this command: snpEff build -v -onlyReg Arabidopsis_thaliana

but got this error:

00:00:00 SnpEff version SnpEff 5.1d (build 2022-04-19 15:49), by Pablo Cingolani
00:00:00 Command: 'build'
00:00:00 Building database for 'Arabidopsis_thaliana'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'Arabidopsis_thaliana'
00:00:00 Reading config file: /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/data/Arabidopsis_thaliana/snpEff.config
00:00:00 Reading config file: /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/snpEff.config
00:00:01 done
00:00:01 [Optional] Reading regulation elements: GFF
WARNING_FILE_NOT_FOUND: Cannot read optional regulation file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.gff', nothing done.
00:00:01 [Optional] Reading regulation elements: BED 
00:00:01 Directory has 2 bed files and 1 cell types
00:00:01 Creating consensus for cellType 'plant', files: [/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed.bkp, /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed]
00:00:01 Reading file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed.bkp'
00:00:01        Adding regulatory type: 'plant'
00:00:03 Done
        Total lines                 : 2816462
        Total annotation count      : 169645
        Percent                     : 6.0%
        Total annotated length      : 34278160
        Number of cell/annotations  : 1
00:00:03 Reading file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed'
00:00:05 Done
        Total lines                 : 2816462
        Total annotation count      : 339292
        Percent                     : 6.0%
        Total annotated length      : 68556724
        Number of cell/annotations  : 1
00:00:05 Creating consensus for cell type: plant
00:00:05 Sorting: plant , size: 339292
00:00:06 Adding to final consensus
00:00:06 Final consensus for cell type: plant   , size: 169640
java.lang.RuntimeException: java.io.FileNotFoundException: null/regulation_plant.bin (No such file or directory)
        at org.snpeff.serializer.MarkerSerializer.save(MarkerSerializer.java:311)
        at org.snpeff.interval.Markers.save(Markers.java:399)
        at org.snpeff.RegulationFileConsensus.save(RegulationFileConsensus.java:164)
        at org.snpeff.RegulationConsensusMultipleBed.consensusByRegType(RegulationConsensusMultipleBed.java:69)
        at org.snpeff.RegulationConsensusMultipleBed.run(RegulationConsensusMultipleBed.java:139)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.readRegulationBed(SnpEffCmdBuild.java:330)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:441)
        at org.snpeff.SnpEff.run(SnpEff.java:1141)
        at org.snpeff.SnpEff.main(SnpEff.java:160)
Caused by: java.io.FileNotFoundException: null/regulation_plant.bin (No such file or directory)
        at java.base/java.io.FileOutputStream.open0(Native Method)
        at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:126)
        at org.snpeff.serializer.MarkerSerializer.save(MarkerSerializer.java:300)
        ... 8 more
00:00:06 Logging
00:00:07 Checking for updates...
00:00:08 Done.
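
Two things can be read directly from the log above. First, the build reports "2 bed files" because the backup copy regulation.plant.dapseq_peaks.bed.bkp in the same folder was read as a second input, so every peak was loaded twice before the consensus step. That is probably unrelated to the exception, but it is easy to rule out. Second, the path in the exception starts with "null/", which suggests the output directory was never set when the .bin path was assembled, rather than a problem with the BED file itself; this is an inference from the message, not something confirmed against the snpEff source. A minimal sketch for the first point: a small shell helper (the function name is hypothetical) that lists anything in a regulation.bed folder not matching the regulation.<cellType>.<name>.bed naming seen in the log. It only reports files and deletes nothing.

```shell
# Report files in a regulation.bed directory whose names do not follow the
# regulation.<cellType>.<name>.bed pattern seen in the log above.
# The helper name is hypothetical; it only prints names and changes nothing.
stray_regulation_files() {
    srf_dir="$1"
    for srf_file in "$srf_dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$srf_file" ] || continue
        srf_base=$(basename "$srf_file")
        case "$srf_base" in
            regulation.*.*.bed) ;;               # expected naming, not reported
            *) printf '%s\n' "$srf_base" ;;      # e.g. a .bed.bkp backup copy
        esac
    done
}

# Example call, using the folder from this post (adjust to your install):
# stray_regulation_files snpeff-5.1-2/data/Arabidopsis_thaliana/regulation.bed
```

Moving any reported file out of the folder and re-running the build would show whether the duplicate input plays any role in the failure.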

Any help with adding custom annotations to the Arabidopsis_thaliana database would be much appreciated!

snp snpEff gwas TFBS