I'm currently analyzing whole-exome and RNA sequencing data from a cancer cell line and trying to determine how many genes carry deleterious mutations.
I have performed quality control, alignment/mapping (BWA for WES and STAR for RNA-Seq), and variant calling (VarScan).
The resulting VCF file was given as input to Ensembl's Variant Effect Predictor (VEP), and I plan to filter the output so that it contains only SNPs annotated as deleterious.
I quickly examined the HTML summary file containing statistics (produced by VEP by default) and noticed that the tool reported a large number of overlapped genes/transcripts.
Should I be concerned with such large numbers? Is there something I am missing or should be looking out for? Any input would be greatly appreciated.
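For reference, here is a minimal sketch of the filtering step I have in mind. It assumes VEP's default tab-delimited output, run with SIFT predictions enabled so that the `Extra` column carries entries such as `SIFT=deleterious(0.01)`. The function names `parse_extra` and `deleterious_genes` are my own, not part of VEP. Since VEP writes one line per variant/transcript pair, the sketch collapses the hits to a set of gene IDs.

```python
import csv


def parse_extra(extra):
    """Turn VEP's 'Extra' column (key=value pairs joined by ';') into a dict."""
    fields = {}
    for item in extra.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def deleterious_genes(handle):
    """Return the set of gene IDs with at least one SIFT 'deleterious' call.

    Assumption: `handle` is an open text file in VEP's default
    tab-delimited format, annotated with SIFT predictions.
    """
    genes = set()
    header = None
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("##"):
            continue  # skip blank lines and '##' metadata lines
        if row[0].startswith("#"):
            # Column header line, e.g. '#Uploaded_variation<TAB>Location...'
            header = [name.lstrip("#") for name in row]
            continue
        if header is None:
            continue  # no column header seen yet; cannot interpret the row
        record = dict(zip(header, row))
        sift = parse_extra(record.get("Extra", "")).get("SIFT", "")
        # 'deleterious(0.01)' -> 'deleterious'; this strict comparison
        # leaves out 'deleterious_low_confidence' calls.
        if sift.split("(")[0] == "deleterious":
            genes.add(record.get("Gene", ""))
    genes.discard("")  # drop rows without a gene ID
    return genes
```

Opening the VEP output file and passing the handle to `deleterious_genes` should then give the distinct genes, and `len()` of that set is the count I am after. Whether SIFT alone is the right definition of "deleterious" is an open question on my side.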