Fatal error reading file snpEff.config with SnpEff
4.0 years ago

Hi, I have recently been introduced to bioinformatics, variant calling and the analysis that goes with it. I obtained several VCF files from FreeBayes and wanted to run them through snpEff to annotate the variants between two genomes of the genus Shewanella. Sadly, I have not been able to do this because I keep getting the same error. Here is the command and its output, including the error:

(base) binso@LAPTOP-P73O7IPS:~/Alignment/VariantCalling$ java -jar ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.jar Shewanella_putrefaciens_cn_32 ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config -v -s VC_SPt_ST2-3D_6.vcf > VC_snpEff.vcf

00:00:00        SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani 
00:00:00        Command: 'ann' 
00:00:00        Reading configuration file 'snpEff.config'. Genome: 'Shewanella_putrefaciens_cn_32'  
00:00:00        Reading config file: /home/binso/Alignment/VariantCalling/snpEff.config
00:00:00        Reading config file: /home/binso/miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config
00:00:00        done                                                                                                                                                   
00:00:00        Reading database for genome version 'Shewanella_putrefaciens_cn_32' from file '/home/binso/miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/./data/Shewanella_putrefaciens_cn_32/snpEffectPredictor.bin' (this might take a while)
00:00:01        done
00:00:01        Loading Motifs and PWMs
00:00:01        Building interval forest
00:00:02        done.
00:00:02        Genome stats :                                                                                                                                         
        #-----------------------------------------------                                                                                                                      
        # Genome name : 'Shewanella_putrefaciens_cn_32'                                                                                                   
        # Genome version : 'Shewanella_putrefaciens_cn_32'                                                                                                         
        # Genome ID: 'Shewanella_putrefaciens_cn_32[0]'                                                                                                        
        # Has protein coding info: true                                                                                                                                     
        # Has Tr. Support Level info : true                                                                                                                                    
        # Genes : 4331                                                                                                                                     
        # Protein coding gene: 4171                                                                                                                           
        #-----------------------------------------------                                                                                                                        
        # Transcripts: 4331                                                                                                                                     
        # Avg. transcripts per gene: 1.00                                                                                                                                    
        # TSL transcripts: 0                                                                                                                                        
        #-----------------------------------------------                                                                                                                        
        # Checked transcripts:                                                                                                                                          
        # AA sequences :   3972 ( 95.23% )                                                                                                                       
        # DNA sequences :      0 ( 0.00% )                                                                                                                         
        #-----------------------------------------------                                                                                                                        
        # Protein coding transcripts : 4171                                                                                                                             
        # Length errors :    182 ( 4.36% )
        # STOP codons in CDS errors :     29 ( 0.70% )
        # START codon errors :    535 ( 12.83% )
        # STOP codon warnings :     17 ( 0.41% )
        # UTR sequences :      0 ( 0.00% )
        # Total Errors :    540 ( 12.95% )
        # WARNING: No protein coding transcript has UTR                                                                                                     
        #-----------------------------------------------                                                                                                                     
        # Cds : 3972                                                                                                                             
        # Exons : 4331
        # Exons with sequence : 4331                                                                                                                                     
        # Exons without sequence     : 0                                                                                                                                        
        # Avg. exons per transcript  : 1.00                                                                                                                                    
        #-----------------------------------------------                                                                                                                      
        # Number of chromosomes      : 1                                                                                                                                        
        # Chromosomes                : Format 'chromo_name size codon_table'                                                                                                    
        # 'Chromosome'    4659220 Standard                                                                                                                       
        #-----------------------------------------------

00:00:02        Predicting variants       
    VcfFileIterator.parseVcfLine(132):  Fatal error reading file '../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config' (line: 17):
    data.dir = ./data/
    java.lang.RuntimeException: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
    data.dir = ./data/
    at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:133)                                                                                       
    at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:184)                                                                                           
    at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:57)                                                                                            
    at org.snpeff.fileIterator.FileIterator.hasNext(FileIterator.java:123)                                                                                                  
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotateVcf(SnpEffCmdEff.java:467)                                                                                     
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:142)                                                                                        
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:1029)                                                                                            
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:984)                                                                                             
    at org.snpeff.SnpEff.run(SnpEff.java:1183)                                                                                                                              
    at org.snpeff.SnpEff.main(SnpEff.java:162)   

Caused by: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
data.dir = ./data/                                                                                                                                                               
at org.snpeff.vcf.VcfEntry.parse(VcfEntry.java:1007)                                                                                                                    
at org.snpeff.vcf.VcfEntry.<init>(VcfEntry.java:219)                                                                                                                    
at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:130)
... 9 more                                                                                                                                                     
00:00:02       Logging                                                                                                                                                                    
00:00:03        Checking for updates...
00:00:05        Done.

I'm not sure if this is due to the snpEff.config file or my VCF file. Any possible tip or recommendation would be highly appreciated.

4.0 years ago

Caused by: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).

It's a problem with your VCF file. Check that it is a proper VCF file (e.g. try to read it with bcftools view).
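The "Not enough fields (missing tab separators?)" message suggests that SnpEff met a data line with too few tab-separated columns. As a rough complement to bcftools view, the following is a minimal sketch that lists such lines. The helper name `check_vcf_fields` is hypothetical (it is not part of SnpEff or bcftools), and the threshold of 8 assumes the eight fixed VCF columns, CHROM through INFO.

```shell
# check_vcf_fields FILE
# Hypothetical helper, not part of SnpEff or bcftools.
# Prints every non-header line with fewer than 8 tab-separated fields
# and returns non-zero if at least one such line exists.
check_vcf_fields() {
  awk -F'\t' '
    /^#/ { next }            # skip meta-information and the #CHROM header line
    NF < 8 {                 # assumption: 8 fixed VCF columns (CHROM..INFO)
      printf "line %d: %d field(s): %s\n", NR, NF, $0
      bad = 1
    }
    END { exit bad + 0 }     # exit status 1 if any short line was seen
  ' "$1"
}
```

It can be pointed at whichever file the SnpEff error names, for example `check_vcf_fields VC_SPt_ST2-3D_6.vcf`. Empty lines are reported too, since they contain no fields at all.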


Thank you for replying, I have tried that with the following command and it seems to work alright:

(base) binso@LAPTOP-P73O7IPS:~/Alignment/VariantCalling$ bcftools view --header-only VC_SPt_ST2-3D_2.vcf

##fileformat=VCFv4.2                                                                     
##FILTER=<ID=PASS,Description="All filters passed">                                                                 
##fileDate=20200414                                                                                                                          
##source=freeBayes v1.3.2-dirty                                                                                              
##reference=12SPt.fasta

It keeps going for a while so I just pasted some of the first lines


It keeps going for a while so I just pasted some of the first lines

Can you please try _without_ --header-only?

(base) binso@LAPTOP-P73O7IPS:~/Alignment/VariantCalling$ bcftools view VC_SPt_ST2-3D_2.vcf

It prints the whole input file like this:

Shewanella_putrefaciens_CN_32   3954896 .       A       G       2518.72 .       AB=0;ABP=0;AC=1;AF=1;AN=1;AO=84;CIGAR=1X;DP=84;DPB=84;DPRA=0;EPP=4.66476;EPPR=0;GTI=0;LEN=1;MEANALT=1;MQM=41.2738;MQMR=0;NS=1;NUMALT=1;ODDS=579.957;PAIRED=0.357143;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=2996;QR=0;RO=0;RPL=53;RPP=15.5221;RPPR=0;RPR=31;RUN=1;SAF=37;SAP=5.59539;SAR=47;SRF=0;SRP=0;SRR=0;TYPE=snp     GT:DP:AD:RO:QR:AO:QA:GL 1:84:0,84:0:0:84:2996:-256.727,0                                                     
Shewanella_putrefaciens_CN_32   3955123 .       C       T       1794.75 .       AB=0;ABP=0;AC=1;AF=1;AN=1;AO=61;CIGAR=1X;DP=61;DPB=61;DPRA=0;EPP=7.31765;EPPR=0;GTI=0;LEN=1;MEANALT=1;MQM=41.3607;MQMR=0;NS=1;NUMALT=1;ODDS=413.255;PAIRED=0.442623;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=2132;QR=0;RO=0;RPL=29;RPP=3.33068;RPPR=0;RPR=32;RUN=1;SAF=32;SAP=3.33068;SAR=29;SRF=0;SRP=0;SRR=0;TYPE=snp     GT:DP:AD:RO:QR:AO:QA:GL 1:61:0,61:0:0:61:2132:-182.941,0                                                     
Shewanella_putrefaciens_CN_32   3955181 .       G       T       2069.34 .       AB=0;ABP=0;AC=1;AF=1;AN=1;AO=74;CIGAR=1X;DP=75;DPB=75;DPRA=0;EPP=12.5178;EPPR=5.18177;GTI=0;LEN=1;MEANALT=1;MQM=41.4459;MQMR=42;NS=1;NUMALT=1;ODDS=476.482;PAIRED=0.445946;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=2487;QR=25;RO=1;RPL=28;RPP=12.5178;RPPR=5.18177;RPR=46;RUN=1;SAF=30;SAP=8.76177;SAR=44;SRF=1;SRP=5.18177;SRR=0;TYPE=snp GT:DP:AD:RO:QR:AO:QA:GL 1:75:1,74:1:25:74:2487:-211.598,0
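These records have the expected tab-separated columns, whereas the trace in the question reports the failure in snpEff.config at line 17 (data.dir = ./data/). One possible reading is that the config path, given as the second positional argument, was taken as the input file. The sketch below shows an invocation that avoids this, under two assumptions: that `-c` names the config file, and that `-s` expects the name of the summary file (so in the original command it may have consumed the VCF path). The wrapper name `run_snpeff`, its argument order and the `JAVA` override are hypothetical.

```shell
# run_snpeff JAR CONFIG GENOME INPUT_VCF OUTPUT_VCF
# Hypothetical wrapper; the name and argument order are illustrative only.
run_snpeff() {
  jar=$1 config=$2 genome=$3 in_vcf=$4 out_vcf=$5
  # Stop early if the input VCF cannot be read.
  [ -r "$in_vcf" ] || { echo "cannot read input VCF: $in_vcf" >&2; return 1; }
  # Assumption: -c names the config file, so only the genome and the VCF
  # remain as positional arguments.
  # Assumption: -s takes a file name, so it gets its own summary path here.
  "${JAVA:-java}" -jar "$jar" -c "$config" -v \
    -s "${out_vcf%.vcf}_summary.html" \
    "$genome" "$in_vcf" > "$out_vcf"
}
```

With the paths from the question, a call might look like `run_snpeff ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.jar ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config Shewanella_putrefaciens_cn_32 VC_SPt_ST2-3D_6.vcf VC_snpEff.vcf`. The log in the question shows SnpEff finding that config file by itself, so `-c` may not even be needed.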