Question: Low transition/transversion ratio: alignment or caller problem?
DoubleD (United States) wrote, 5.2 years ago:

Hello,

After running VarScan and MuTect on a set of 10 patients (tumor/normal comparison), I ran the calls through a false-positive filtering pipeline. The resulting Ts/Tv ratio, whether computed manually, from the snpEff summary file, with SnpSift tstv, or with GATK VariantEval, is quite low for human whole-genome sequence data (1.3-1.6). I have read all I can find here and in papers about the expected ratio, and about how a low ratio can indicate a large number of false positives.

I ran VarScan with relatively lax parameters for calling somatic mutations (5 reads in normal, 8 in tumor, with strand-bias filtering), but I thought MuTect would produce a confident call set. Both callers end up with a low Ts/Tv. My question is: can I chalk this result up to false positives (which is okay with me; I wanted a sensitive rather than a specific call set), or could it be a problem with the BAM alignment? I suppose a poorly aligned BAM would lead to false positives too, but any insight or information would be greatly appreciated.
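
For readers who want to reproduce the "manual calculation" mentioned above, here is a minimal sketch of counting transitions and transversions directly from VCF records. It is not the method used by snpEff, SnpSift, or GATK; the function name `tstv_from_vcf_lines` and the `pass_only` switch are made up for illustration, and the example records are toy data. It assumes standard tab-separated VCF columns (REF in column 4, ALT in column 5, FILTER in column 7) and counts only biallelic single-base substitutions.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def tstv_from_vcf_lines(lines, pass_only=True):
    """Return (ts, tv, ratio) for biallelic SNVs in an iterable of VCF lines.

    Hypothetical helper: indels, multi-allelic sites, and non-ACGT alleles
    are skipped. With pass_only=True, records whose FILTER is neither
    "PASS" nor "." are skipped as well.
    """
    ts = tv = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        ref, alt, flt = fields[3].upper(), fields[4].upper(), fields[6]
        if pass_only and flt not in ("PASS", "."):
            continue
        if len(ref) != 1 or len(alt) != 1:
            continue  # indel, MNV, or multi-allelic ALT such as "C,G"
        if ref not in "ACGT" or alt not in "ACGT" or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    ratio = ts / tv if tv else float("nan")
    return ts, tv, ratio


# Toy records (not real calls), columns: CHROM POS ID REF ALT QUAL FILTER INFO
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",         # transition
    "chr1\t200\t.\tC\tA\t.\tPASS\t.",         # transversion
    "chr1\t300\t.\tC\tT\t.\tPASS\t.",         # transition
    "chr1\t400\t.\tG\tT\t.\tstrand_bias\t.",  # filtered out when pass_only=True
    "chr1\t500\t.\tAT\tA\t.\tPASS\t.",        # deletion, skipped
    "chr1\t600\t.\tT\tC,G\t.\tPASS\t.",       # multi-allelic, skipped
]

if __name__ == "__main__":
    print(tstv_from_vcf_lines(example))
    print(tstv_from_vcf_lines(example, pass_only=False))
```

Running the same function on the germline and somatic VCFs separately, with and without the FILTER restriction, is one way to see how much of a low ratio comes from filtered-out calls.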

whole genome qc somatic vcf • 3.7k views

Some follow-up information after talking with a more experienced user: running the Ts/Tv calculation on the germline calls gives 2.06. Hopefully this indicates a properly aligned BAM, and the low ratio in the somatic calls comes from somatic calling parameters that are too lax (too sensitive). The ratio on the LOH calls was 2.1, although there were far fewer calls than in the germline file.

(written 5.2 years ago by DoubleD)
Cyriac Kandoth (Memorial Sloan Kettering, New York, USA) wrote, 5.2 years ago:

We should expect the Ts/Tv ratio of somatic point mutations to vary wildly across tumor types, depending on the mutagens involved and on the DNA repair mechanisms affected. I can't seem to find a publication that confirms this assumption, but this figure comes close. Here are my quick-and-dirty Ts/Tv ratios for the mutation calls from that paper, but please double-check my work.

Note: A caveat with the data below is that some cohorts are exomes while others are whole genomes. Since exomes have higher GC content, these Ts/Tv ratios are not perfectly comparable, but they are good enough for the point to hold.

Cancer Type              Ts/Tv
ALL                      0.949906
AML                      2.128909
Bladder                  1.325778
Breast                   0.859808
Cervix                   1.265049
CLL                      1.006487
Colorectum               2.163191
Esophageal               1.38155
Glioblastoma             3.53876
Glioma Low Grade         2.244252
Head and Neck            1.172555
Kidney Chromophobe       2.545455
Kidney Clear Cell        1.165541
Kidney Papillary         1.116037
Liver                    1.222369
Lung Adeno               0.439277
Lung Small Cell          0.569885
Lung Squamous            0.635106
Lymphoma B-cell          0.971431
Medulloblastoma          1.381825
Melanoma                 8.54497
Myeloma                  1.303654
Neuroblastoma            0.566366
Ovary                    0.876746
Pancreas                 1.021448
Pilocytic Astrocytoma    1.837178
Prostate                 1.220668
Stomach                  3.006267
Thyroid                  2.161623
Uterus                   1.632635
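
To illustrate why a mutagen can push Ts/Tv so far in either direction, here is a small sketch that collapses substitutions into the usual six pyrimidine-referenced classes and derives Ts/Tv from the class counts. Only C>T and T>C are transitions, so a process dominated by C>T (as UV exposure is generally described) raises the ratio, while one dominated by C>A (as is commonly reported for tobacco smoke) lowers it. The function names and the counts below are invented for illustration and are not taken from the paper.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Collapse a substitution onto a pyrimidine reference base.

    G>A becomes C>T, A>G becomes T>C, and so on, giving six classes:
    C>A, C>G, C>T, T>A, T>C, T>G.
    """
    if ref in "AG":  # purine reference: report the complementary strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def tstv_from_spectrum(counts):
    """Ts/Tv from a dict of six-class counts (missing classes count as 0)."""
    ts = counts.get("C>T", 0) + counts.get("T>C", 0)
    tv = sum(counts.get(k, 0) for k in ("C>A", "C>G", "T>A", "T>G"))
    return ts / tv if tv else float("nan")


# Made-up spectra, for illustration only.
ct_dominated = {"C>T": 80, "T>C": 5, "C>A": 5, "C>G": 3, "T>A": 4, "T>G": 3}
ca_dominated = {"C>A": 50, "C>G": 10, "C>T": 20, "T>A": 8, "T>C": 7, "T>G": 5}
uniform = {k: 10 for k in ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")}

if __name__ == "__main__":
    for name, spec in [("C>T-dominated", ct_dominated),
                       ("C>A-dominated", ca_dominated),
                       ("uniform", uniform)]:
        print(name, round(tstv_from_spectrum(spec), 3))
```

The uniform case is a useful reference point: if every substitution were equally likely, each base would have one transition and two transversions available, so Ts/Tv would be 0.5.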

This is very helpful, thank you Cyriac. Using the data at ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl/somatic_mutation_data/Liver/ I got a Ts/Tv of 1.222 for the 850734 SNPs in that project.

For my dataset, the Ts/Tv of the germline variants is 2.1, which I take to indicate mutation without selection, whereas the somatic Ts/Tv of 1.3 to 1.5 would indicate a selective mutation pressure.

(written 5.2 years ago by DoubleD)