7 months ago
balanannie
I've been stuck on getting CADD scores for a while now.
I'm running VEP on a server to annotate my VCF, using the CADD.pm plugin.
I also tried adding LoFtool, and the problem persists: the LoFtool column is present in the output, but it contains no actual values.
The command looks like this:
vep -i .../my_dir/chr4_bottom_25_rare.vcf \
--assembly GRCh38 \
--cache --offline \
--sift b --polyphen b --symbol --vcf \
--dir_cache .../my_dir/VEP \
--dir_plugins .../my_dir/VEP/Plugins \
--plugin CADD,snv=.../my_dir/VEP/Plugins/whole_genome_SNVs.tsv.gz,\
indels=.../my_dir/VEP/Plugins/gnomad.genomes.r3.0.indel.tsv.gz \
--plugin LoFtool \
--force_overwrite \
-o .../my_dir/chr4_bottom_25_rare_cadd.vcf \
--verbose \
--fork 4 \
--debug
There are warnings like this in the log:
WARNING: 1821923 : Use of uninitialized value $file in hash element at .../my_dir/VEP/Plugins/CADD.pm line 260, <__ANONIO__> line 5001.
Use of uninitialized value in split at .../my_dir/VEP/Plugins/CADD.pm line 260, <__ANONIO__> line 5001.
Use of uninitialized value $s in addition (+) at .../my_dir/VEP/Plugins/CADD.pm line 278, <__ANONIO__> line 5001.
Use of uninitialized value in numeric ne (!=) at .../my_dir/VEP/Plugins/CADD.pm line 279, <__ANONIO__> line 5001.
Use of uninitialized value $alt in substr at .../my_dir/VEP/Plugins/CADD.pm line 281, <__ANONIO__> line 5001.
Use of uninitialized value $file in hash element at .../my_dir/VEP/Plugins/CADD.pm line 260, <__ANONIO__> line 5001.
Use of uninitialized value in split at .../my_dir/VEP/Plugins/CADD.pm line 260, <__ANONIO__> line 5001.
...
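Since the log repeats the same few messages many times, it can help to collapse them by source location. The following is a minimal sketch, assuming the log keeps the Perl warning format shown above (`at <path> line <N>`); `warn_summary` is a hypothetical helper name, not part of VEP.

```shell
# Hypothetical helper (not part of VEP): count warnings per "<file> line <N>"
# location in a log read from stdin, most frequent first.
warn_summary() {
  grep -o 'at [^ ]* line [0-9]*' |
    # Keep only the file's basename and the line number.
    awk '{ n = split($2, parts, "/"); print parts[n] " line " $4 }' |
    sort | uniq -c | sort -rn
}

# Usage (vep.log is a placeholder for wherever the warnings were captured):
#   warn_summary < vep.log
```

If every warning points at the same few lines of CADD.pm, that suggests the plugin is failing in one place for each variant rather than on specific records.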
The directories and files are:
$ ls VEP/
homo_sapiens plugin_config.txt Plugins wget-log
$ ls -ahl VEP/Plugins/
total 82G
drwx------ 2 abalan __USERS__ 249 10 22:27 .
drwx------ 4 abalan __USERS__ 102 9 00:58 ..
-rwx------ 1 abalan __USERS__ 8,8K 10 22:27 CADD.pm
-rwxr--r-x 1 abalan __USERS__ 1,1G 2 2020 gnomad.genomes.r3.0.indel.tsv.gz
-rwxr-xr-x 1 abalan __USERS__ 1,8M 2 2020 gnomad.genomes.r3.0.indel.tsv.gz.tbi
-rwx------ 1 abalan __USERS__ 3,1K 10 22:10 LoFtool.pm
-rwx------ 1 abalan __USERS__ 233K 10 22:11 LoFtool_scores.txt
-rwxr--r-x 1 abalan __USERS__ 81G 24 2020 whole_genome_SNVs.tsv.gz
-rwx------ 1 abalan __USERS__ 2,7M 26 2020 whole_genome_SNVs.tsv.gz.tbi
The CSQ INFO header line from the output:
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|SYMBOL_SOURCE|HGNC_ID|SIFT|PolyPhen|CADD_PHRED|CADD_RAW|LoFtool">
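To check whether a given sub-field is empty in every annotation entry or only in some, one option is to look up its position in the `Format:` string of this header and print that position from each CSQ entry. This is a rough sketch using only awk; `csq_field` is a hypothetical helper name, and it assumes the standard VCF layout with INFO in column 8 and CSQ entries separated by commas.

```shell
# Hypothetical helper (not part of VEP): for a VEP-annotated VCF on stdin,
# print CHROM, POS and the value of one CSQ sub-field per annotation entry.
csq_field() {
  awk -F'\t' -v want="$1" '
    # Locate the requested field in the CSQ "Format:" header string.
    /^##INFO=<ID=CSQ/ {
      fmt = $0
      sub(/.*Format: /, "", fmt)
      sub(/".*/, "", fmt)
      n = split(fmt, names, "[|]")
      for (i = 1; i <= n; i++) if (names[i] == want) idx = i
      next
    }
    /^#/ { next }
    idx {
      # Find the CSQ key in the INFO column and walk its comma-separated entries.
      m = split($8, info, ";")
      for (j = 1; j <= m; j++) {
        if (info[j] ~ /^CSQ=/) {
          k = split(substr(info[j], 5), entries, ",")
          for (e = 1; e <= k; e++) {
            split(entries[e], vals, "[|]")
            v = vals[idx]
            print $1 "\t" $2 "\t" (v == "" ? "(empty)" : v)
          }
        }
      }
    }
  '
}

# Usage:
#   csq_field CADD_PHRED < .../my_dir/chr4_bottom_25_rare_cadd.vcf | cut -f3 | sort | uniq -c
```

If the field name is not found in the header, the function prints nothing, which distinguishes a missing column from a column that exists but is empty.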
The .tbi index files are present and tabix works. The reference files seem fine and are the correct GRCh38 version, for example:
zcat whole_genome_SNVs.tsv.gz | head
## CADD GRCh38-v1.6 (c) University of Washington, Hudson-Alpha Institute for Biotechnology and Berlin Institute of Health 2013-2020. All rights reserved.
#Chrom Pos Ref Alt RawScore PHRED
1 10001 T A 0.702541 8.478
1 10001 T C 0.750954 8.921
1 10001 T G 0.719549 8.634
Additionally, the versions are:
ensembl : 105.f357e33
ensembl-funcgen : 105.660df8f
ensembl-io : 105.2a0a40c
ensembl-variation : 105.ac8178e
ensembl-vep : 105.0
CADD.pm release/113
I've run out of ideas about what I could possibly be doing wrong. Please help.