Reducing False Positive Indel Calls From Pindel

12.4 years ago

Rlong ▴ 340

I am analyzing the specificity of Pindel's indel calls. I have matched tumor/normal WGS data aligned with bwa. To remove germline calls, the original calls were filtered to drop any that had supporting reads in the normal sample. The remaining calls were then tiered by whether or not they fall in coding regions, and the calls in coding regions were validated using an orthogonal sequencing method.

Is there a short list of easily determined metrics that I should check for correlation with false-positive calls? I am considering gathering a set of metrics on these calls and feeding them all into a machine learning tool such as Weka to see whether it finds anything, but I would like to include as many meaningful data points as possible.
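
For illustration, a quick first pass before any machine learning could be to rank each metric by its correlation with the validation outcome. The following is only a minimal sketch: the metric names (`support_reads`, `mean_mapq`, `homopolymer_len`) and the values are made up for the example and are not taken from Pindel output, and the `false_positive` label is assumed to come from the orthogonal validation.

```python
import math


def point_biserial(values, labels):
    """Pearson correlation between a numeric metric and a 0/1 label."""
    n = len(values)
    mean_v = sum(values) / n
    mean_l = sum(labels) / n
    cov = sum((v - mean_v) * (l - mean_l) for v, l in zip(values, labels))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_l = sum((l - mean_l) ** 2 for l in labels)
    if var_v == 0 or var_l == 0:
        # A constant metric (or a single-class label) carries no signal.
        return 0.0
    return cov / math.sqrt(var_v * var_l)


def rank_metrics(calls, label_key="false_positive"):
    """Return (metric, correlation) pairs sorted by absolute correlation."""
    labels = [call[label_key] for call in calls]
    metric_names = [key for key in calls[0] if key != label_key]
    scores = {
        name: point_biserial([call[name] for call in calls], labels)
        for name in metric_names
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy, invented values purely to show the expected input shape.
# false_positive = 1 means the call failed orthogonal validation.
calls = [
    {"support_reads": 3, "mean_mapq": 20, "homopolymer_len": 8, "false_positive": 1},
    {"support_reads": 12, "mean_mapq": 55, "homopolymer_len": 2, "false_positive": 0},
    {"support_reads": 4, "mean_mapq": 25, "homopolymer_len": 7, "false_positive": 1},
    {"support_reads": 15, "mean_mapq": 60, "homopolymer_len": 1, "false_positive": 0},
]

for name, score in rank_metrics(calls):
    print(f"{name}\t{score:+.3f}")
```

A ranking like this only captures linear, single-metric effects, so it would be a screening step rather than a replacement for a classifier.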

indel variant pindel somatic • 3.5k views

updated 5.0 years ago by isaacpei • written 12.4 years ago by Rlong ▴ 340

Perhaps try bam-readcount, which can report a number of read-level metrics: http://hpc.mskcc.org/compute-accounts/account-request/

On the machine learning side, perhaps try DeepVariant: https://github.com/google/deepvariant

5.0 years ago by isaacpei

I moved this to a comment as neither of the two suggestions is directly related to the question.

5.0 years ago by ATpoint 82k
