Validation of somatic variant calling pipeline
2.7 years ago
Elisa • 0

Hi everyone,

I've implemented a somatic variant calling pipeline to detect somatic variants in ovarian tumour samples (targeted sequencing on an Illumina MiSeq). I now need to validate this pipeline, so I'd like to know which datasets are best for benchmarking. Also, to evaluate the performance of my pipeline against other pipelines, which metrics are best to calculate and compare? And which tool should I use to compare the performance of two different variant calling pipelines (that is, to compare two VCF files)?

Thanks in advance

calling somatic validation variant • 1.3k views
2.7 years ago

Hi Elisa,

At the simplest level, you could run different somatic variant callers on the same dataset and derive test statistics from that. For example, use 3 somatic variant callers, create a 'gold standard' set of the variants called by all 3 callers, and then derive test statistics for each caller (sensitivity, specificity, precision, accuracy) by comparing its calls to the gold standard.
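
As a rough illustration of the consensus idea, here is a minimal Python sketch. It assumes each caller's output has already been reduced to a set of `(chrom, pos, ref, alt)` tuples; the function names `consensus` and `metrics` and the toy variants are made up for this example and do not come from any particular tool. Specificity and accuracy are left out because they need a count of true negatives, which depends on how you define the set of candidate positions.

```python
from collections import Counter

def consensus(call_sets, min_callers=None):
    """Return variants reported by at least `min_callers` call sets.

    By default a variant must be present in every call set (strict intersection).
    """
    if min_callers is None:
        min_callers = len(call_sets)
    counts = Counter(v for calls in call_sets for v in set(calls))
    return {v for v, n in counts.items() if n >= min_callers}

def metrics(calls, truth):
    """Precision, recall (sensitivity) and F1 of `calls` against `truth`."""
    tp = len(calls & truth)   # called and in the gold standard
    fp = len(calls - truth)   # called but not in the gold standard
    fn = len(truth - calls)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Toy, hypothetical variants keyed by (chrom, pos, ref, alt).
v1 = ("chr17", 1000, "G", "A")
v2 = ("chr17", 2000, "C", "T")
v3 = ("chr13", 3000, "A", "G")
v4 = ("chr13", 4000, "T", "C")
v5 = ("chr1", 5000, "G", "T")

caller_a = {v1, v2, v3}
caller_b = {v1, v2, v4}
caller_c = {v1, v2, v3, v5}

truth = consensus([caller_a, caller_b, caller_c])
result_a = metrics(caller_a, truth)
print(sorted(truth))
print(result_a)
```

Note that with a strict intersection, every caller that contributed to the gold standard has a recall of 1.0 by construction, so precision is the more informative number for those callers. Passing `min_callers=2`, or scoring a pipeline that was not used to build the consensus, avoids that.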

Kevin

2.7 years ago

I would suggest using germline sequencing data from two individuals and mixing a low percentage of reads from individual B into the reads of individual A. Variants private to individual B should then show up as somatic variants when analyzing the combined data.
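
To get a feel for what such a mixture should look like, here is a small Python sketch of the expected allele fraction. It is only back-of-the-envelope arithmetic under stated assumptions: a diploid site, individual A homozygous for the reference allele, and both individuals sequenced with comparable coverage at the site. The function names are invented for this example.

```python
def expected_vaf(mix_fraction, genotype_b="het"):
    """Expected variant allele fraction in the mixture for a variant private to B.

    mix_fraction: fraction of reads in the mixed sample that come from B (0 to 1).
    genotype_b:   "het" if B is heterozygous at the site, "hom" if homozygous alt.
    Assumes A carries only the reference allele at this site.
    """
    if not 0.0 <= mix_fraction <= 1.0:
        raise ValueError("mix_fraction must be between 0 and 1")
    alt_fraction_in_b = {"het": 0.5, "hom": 1.0}[genotype_b]
    return mix_fraction * alt_fraction_in_b

def expected_alt_reads(depth, mix_fraction, genotype_b="het"):
    """Expected number of alt-supporting reads at a given total depth."""
    return depth * expected_vaf(mix_fraction, genotype_b)

# Example: 10% of reads from B, heterozygous variant, 500x total depth.
vaf = expected_vaf(0.10, "het")
alt_reads = expected_alt_reads(500, 0.10, "het")
print(f"expected VAF: {vaf:.3f}, expected alt reads: {alt_reads:.1f}")
```

So a 10% spike-in gives heterozygous variants private to B at roughly 5% allele fraction and homozygous ones at roughly 10%, which lets you pick a mix fraction that probes the allele fractions you care about.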

2.7 years ago

You can use hap.py to benchmark somatic calls as well as germline calls. We are developing a benchmarking framework at Truwl.com to automate this process. There have been a number of somatic benchmarking papers, some of which include gold standard calls. If you can get your pipeline into WDL, we'd be happy to work with you to evaluate it. Just drop us a line.
