Question: Somatic Mutation Identification for Tumor without Normal
1
2.3 years ago by
haiying.kong290
Germany
haiying.kong290 wrote:

I have WES data for 6 tumors that do not have matched normals. I called all variants, both germline and somatic, with GATK. I then excluded every variant that ExAC lists as germline, along with the germline variants identified in our roughly 90 normal samples. The number of putative somatic mutations left for the 6 tumors is 251495. I expected this number to be high, but not this high.

For tumors with matched normals, the number of somatic mutations identified with GATK is 3 or 4 digits per tumor.

Even with the most conservative estimate, the mutation count in the tumors without matched normals is 10 times higher.

Is there any way I can still use the 6 tumors?
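
The filtering described in this question (drop every tumor call that also appears in a population database or in the internal normals) can be sketched roughly as below. This is only an illustration under stated assumptions: the `Variant` record, the function name `remove_known_germline`, and the toy coordinates are all made up, and matching on exact chromosome, position, ref and alt is one possible choice, not necessarily what the poster's pipeline does.

```python
from typing import List, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: exact site and allele."""
    chrom: str
    pos: int
    ref: str
    alt: str


def remove_known_germline(
    tumor_calls: List[Variant],
    population_db: Set[Variant],
    normal_pool: Set[Variant],
) -> List[Variant]:
    """Keep only tumor calls absent from both germline sources."""
    known_germline = population_db | normal_pool
    return [v for v in tumor_calls if v not in known_germline]


# Toy data (made-up coordinates, for illustration only).
tumor_calls = [
    Variant("1", 1000, "A", "G"),
    Variant("7", 2000, "C", "T"),
    Variant("9", 3000, "G", "A"),
]
population_db = {Variant("1", 1000, "A", "G")}  # e.g. sites seen in ExAC
normal_pool = {Variant("9", 3000, "G", "A")}    # e.g. sites seen in in-house normals

candidates = remove_known_germline(tumor_calls, population_db, normal_pool)
print(candidates)
```

Anything that survives this subtraction is only a candidate: rare private germline variants and recurrent artifacts absent from both sources stay in the list, which is one way the count can end up far larger than with matched normals.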

somatic mutation • 1.7k views
modified 16 months ago by Biostar ♦♦ 20 • written 2.3 years ago by haiying.kong290

What histology are your tumors? Melanoma, RCC, etc.? The magnitude of mutational load, whether wrong or not at this juncture, will be somewhat more predictable knowing this.

modified 2.3 years ago • written 2.3 years ago by CMosychuk20
1

Primary melanoma, whole-exome sequencing.

modified 2.3 years ago • written 2.3 years ago by haiying.kong290
0
2.3 years ago by
Göttingen, Germany
Manuel Landesfeind1.2k wrote:

To me, the number of mutations in your tumors seems suspiciously high. We take a very similar approach to removing potential germline mutations from tumors lacking matched normal samples (using different annotation databases and no internal normal pool). In our models we usually retain between 500 and 2000 mutations (with the exception of highly mutated tumors, of course).

modified 2.3 years ago • written 2.3 years ago by Manuel Landesfeind1.2k

Could you please tell me which database you use as your germline mutation list?

written 2.3 years ago by haiying.kong290
1

You can have a look at gnomAD, which is bigger and includes ExAC: http://gnomad.broadinstitute.org/

written 2.3 years ago by Titus900
2

What can be discussed is the threshold you use: when do you consider a variant as putative germline? 1%, 10%, ... allelic frequency? In any population or overall?

PS: If you figure out a good threshold or some literature, please share here because it will save me some time to evaluate this by myself in the near future ;-)
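
To make the threshold question concrete, here is a minimal sketch of the two readings ("in any population" versus "overall"). Everything in it is an assumption for illustration: the function name `is_putative_germline`, the population labels, and the toy frequencies are invented, and only the 1% and 10% cutoffs come from the question above. It does not suggest which cutoff is right.

```python
from typing import Dict


def is_putative_germline(
    pop_af: Dict[str, float],
    overall_af: float,
    threshold: float = 0.01,
    mode: str = "any_population",
) -> bool:
    """Flag a variant as putative germline based on allele frequency.

    mode="overall":        compare the pooled allele frequency to the threshold.
    mode="any_population": compare the highest per-population frequency instead.
    """
    if mode == "overall":
        return overall_af >= threshold
    if mode == "any_population":
        return max(pop_af.values(), default=0.0) >= threshold
    raise ValueError(f"unknown mode: {mode!r}")


# Toy variant: common in one (hypothetical) population, rare overall.
pop_af = {"AFR": 0.03, "EUR": 0.0005, "EAS": 0.0}
overall_af = 0.004

for threshold in (0.01, 0.10):
    for mode in ("any_population", "overall"):
        flagged = is_putative_germline(pop_af, overall_af, threshold, mode)
        print(f"threshold={threshold:.2f} mode={mode}: germline={flagged}")
```

With these toy numbers the variant is flagged only at the 1% cutoff in the "any population" reading, which shows how much the answer to both questions changes what gets filtered.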

modified 2.3 years ago • written 2.3 years ago by Manuel Landesfeind1.2k
1

I absolutely agree with Titus that gnomAD and ExAC are key databases. We use 1000G, HapMap and others because our established pipeline uses the hg19 genome assembly.

Moving to hg38 and the corresponding databases is in progress, and I plan to use gnomAD and ExAC too.

modified 2.3 years ago • written 2.3 years ago by Manuel Landesfeind1.2k
1

Dear Manuel,

The website says: "What genome build is the gnomAD data based on? All data are based on GRCh37/hg19"

http://gnomad.broadinstitute.org/faq

written 2.3 years ago by haiying.kong290

That's interesting... for some reason I thought it would be hg38... don't know why.

written 2.3 years ago by Manuel Landesfeind1.2k

Could you please give me the complete list of databases you use as a reference for germline mutations for hg19/GRCh37? It would be wonderful if you could give me download links as well :)

written 2.3 years ago by haiying.kong290