snpEff problems adding regulatory regions to database
8 weeks ago
mcsimenc ▴ 20

Hi all,

I want to add the locations of transcription factor binding sites to a pre-built genome database (Arabidopsis_thaliana). I am following the instructions on the snpEff site, but I am getting errors about missing .bin files. It seems these files might only be available for some pre-built genomes; is there a way for me to create them for my purpose?

I placed a BED file in a folder here:


and ran this command: snpEff build -v -onlyReg Arabidopsis_thaliana

But I get this error:

00:00:00 SnpEff version SnpEff 5.1d (build 2022-04-19 15:49), by Pablo Cingolani
00:00:00 Command: 'build'
00:00:00 Building database for 'Arabidopsis_thaliana'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'Arabidopsis_thaliana'
00:00:00 Reading config file: /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/data/Arabidopsis_thaliana/snpEff.config
00:00:00 Reading config file: /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/snpEff.config
00:00:01 done
00:00:01 [Optional] Reading regulation elements: GFF
WARNING_FILE_NOT_FOUND: Cannot read optional regulation file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.gff', nothing done.
00:00:01 [Optional] Reading regulation elements: BED 
00:00:01 Directory has 2 bed files and 1 cell types
00:00:01 Creating consensus for cellType 'plant', files: [/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed.bkp, /home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed]
00:00:01 Reading file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed.bkp'
00:00:01        Adding regulatory type: 'plant'
00:00:03 Done
        Total lines                 : 2816462
        Total annotation count      : 169645
        Percent                     : 6.0%
        Total annotated length      : 34278160
        Number of cell/annotations  : 1
00:00:03 Reading file '/home/msimenc/software/mambaforge/envs/gwas_test/share/snpeff-5.1-2/./data/Arabidopsis_thaliana/regulation.bed//regulation.plant.dapseq_peaks.bed'
00:00:05 Done
        Total lines                 : 2816462
        Total annotation count      : 339292
        Percent                     : 6.0%
        Total annotated length      : 68556724
        Number of cell/annotations  : 1
00:00:05 Creating consensus for cell type: plant
00:00:05 Sorting: plant , size: 339292
00:00:06 Adding to final consensus
00:00:06 Final consensus for cell type: plant   , size: 169640
java.lang.RuntimeException: null/regulation_plant.bin (No such file or directory)
        at org.snpeff.RegulationConsensusMultipleBed.consensusByRegType(
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.readRegulationBed(
        at org.snpeff.SnpEff.main(
Caused by: null/regulation_plant.bin (No such file or directory)
        at java.base/ Method)
        at java.base/
        at java.base/<init>(
        at java.base/<init>(
        ... 8 more
00:00:06 Logging
00:00:07 Checking for updates...
00:00:08 Done.
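
One thing the log above shows is that the .bkp backup copy in the regulation.bed directory is read as a second input for cell type 'plant', next to the real .bed file. Below is a minimal sketch for listing what that directory contains, assuming file names follow the regulation.<cellType>.<label>.bed pattern seen in the log. The helper name group_regulation_files is hypothetical and is not part of snpEff, and the demo uses a temporary directory rather than my real data folder.

```python
import tempfile
from pathlib import Path


def group_regulation_files(reg_dir):
    """Group files in a regulation.bed directory by cell type.

    Assumption: names look like 'regulation.<cellType>.<label>.bed',
    as in the log. Files that start with 'regulation.' but do not end
    in '.bed' (for example a '.bed.bkp' backup) are reported separately.
    """
    by_cell_type = {}
    not_plain_bed = []
    for path in sorted(Path(reg_dir).iterdir()):
        if not path.is_file():
            continue
        parts = path.name.split(".")
        # Skip anything that does not follow the assumed naming pattern.
        if len(parts) < 3 or parts[0] != "regulation":
            continue
        if path.suffix != ".bed":
            not_plain_bed.append(path.name)
            continue
        by_cell_type.setdefault(parts[1], []).append(path.name)
    return by_cell_type, not_plain_bed


# Demo on a throwaway directory that mimics the two file names in the log.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in (
        "regulation.plant.dapseq_peaks.bed",
        "regulation.plant.dapseq_peaks.bed.bkp",
    ):
        (Path(demo_dir) / name).touch()
    groups, extras = group_regulation_files(demo_dir)
    print("BED files by cell type:", groups)
    print("Other regulation.* files:", extras)
```

This only inspects file names. It does not address the null/regulation_plant.bin error itself, but it makes it easy to spot stray copies that the build step may also pick up.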

Any help with adding custom annotations to Arabidopsis_thaliana database would be much appreciated!

