Question: Somatic vs Germline Variant Calling
0
8 months ago by
checkyodna0
checkyodna0 wrote:

Hi,

I was wondering what exactly is the difference between germline and somatic variant calling?

Why is a tumor-normal study considered to be somatic?

Can indels be picked up on germline or somatic samples?

Thanks

ADD COMMENTlink modified 8 months ago by RamRS19k • written 8 months ago by checkyodna0
2
8 months ago by
RamRS19k
Houston, TX
RamRS19k wrote:

I'm going to take a shot at answering this:

Somatic variants = variants that arise in a somatic cell (and its descendants) and are not seen in other somatic cells. Somatic variants are not inherited by offspring. Germline variants = variants present in germline cells, which can be passed on to offspring.

Somatic and germline, as you can see, are based on which cell the genetic material is extracted from. Indels are a type of mutation that can occur anywhere.

A tumor-normal study is considered somatic because cancer is usually an abnormality in the somatic DNA, corrupting one particular cluster of cells in your body and giving them a different genotype than the rest of your body.

ADD COMMENTlink written 8 months ago by RamRS19k
1
8 months ago by
Belgium
WouterDeCoster35k wrote:

I was wondering what exactly is the difference between germline and somatic variant calling?

Germline variants are diploid/biallelic, so the expected alternative allele frequency is 50% for a heterozygous position (and 100% for a homozygous one). Somatic variants depend on the tumor purity and are not present in all cells tested. As such, variant allele frequencies can be much lower.
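To make the purity effect concrete, here is a minimal sketch of the expected variant allele frequency (VAF) in a bulk sample. The function name and its parameters are made up for this illustration, and the model assumes a simple mixture of tumor and normal cells with known copy numbers; real data are noisier than this.

```python
def expected_vaf(purity, mutated_copies=1, tumor_copy_number=2,
                 normal_copy_number=2, cancer_cell_fraction=1.0):
    """Expected fraction of reads carrying the variant allele.

    purity: fraction of cells in the sample that are tumor cells (0..1).
    cancer_cell_fraction: fraction of tumor cells carrying the variant (0..1).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0.0 <= cancer_cell_fraction <= 1.0:
        raise ValueError("cancer_cell_fraction must be between 0 and 1")
    # Alleles carrying the variant come only from mutated tumor cells.
    variant_alleles = purity * cancer_cell_fraction * mutated_copies
    # All alleles at the locus come from tumor and normal cells together.
    total_alleles = (purity * tumor_copy_number
                     + (1.0 - purity) * normal_copy_number)
    return variant_alleles / total_alleles


# A heterozygous germline variant behaves like purity = 1 (every cell has it).
print(expected_vaf(1.0))                            # germline heterozygous
print(expected_vaf(0.4))                            # clonal somatic, 40% purity
print(expected_vaf(0.4, cancer_cell_fraction=0.5))  # subclonal somatic
```

With 40% purity, a clonal heterozygous somatic variant is expected at about 20% VAF instead of 50%, and a subclonal one at even less.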

Why is a tumor-normal study considered to be somatic?

Because you are looking for differences between the tumor and the normal sample, and therefore variants which are not part of the germline but appeared somatically.

Can indels be picked up on germline or somatic samples?

Yes.

ADD COMMENTlink modified 8 months ago • written 8 months ago by WouterDeCoster35k
2

In simpler terms, germline variants are variants that are inherited from the parents via the germ cells (sperm and oocytes), which means the variant was already present in the genome of at least one of the parents. Somatic variants arise de novo in the genome of the respective individual. Example: a variant that occurs in a stem cell will be found in all cells that derive from that stem cell, but not in the other cells of the organism. In order to distinguish germline from somatic, one sequences the tumor sample and a matched normal. E.g. in the case of lung cancer, one takes the tumor biopsy from the lung and a matched normal from the blood. Even though germline variants (risk factor variants) can contribute to pathogenesis, somatic variants are typically more involved in the disease, which is why they are of special interest.

Without a matched-normal control, one could not distinguish between somatic and germline, because every genome contains tens of thousands of differences relative to the reference genome, so a matched normal from the same donor is necessary.

Indels are simply additional or missing nucleotides in comparison to a reference, therefore they can be found in both germline and somatic.
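As a small illustration of "additional or missing nucleotides", this sketch labels a variant from its reference and alternative alleles written VCF-style (with the shared leading base). The function name `variant_type` is invented for this example, and complex substitutions are simply labelled by their net length change.

```python
def variant_type(ref, alt):
    """Classify a variant by comparing REF and ALT allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"        # single base substituted
    if len(alt) > len(ref):
        return "insertion"  # nucleotides added relative to the reference
    if len(alt) < len(ref):
        return "deletion"   # nucleotides missing relative to the reference
    return "MNV"            # same length, several bases substituted


for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("AT", "GC")]:
    print(ref, alt, variant_type(ref, alt))
```

The classification does not depend on whether the variant is germline or somatic; that distinction comes from which samples carry it.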

ADD REPLYlink modified 8 months ago • written 8 months ago by ATpoint11k
2

You should change "inherited by the parents" to "inherited from the parents".

ADD REPLYlink written 8 months ago by genomax59k

As a non-native speaker, where is the difference?

ADD REPLYlink written 8 months ago by ATpoint11k

"Inherited by the parents" would mean that the parents inherited the variant, i.e. from the grandparents (which is probably true if these mutations are being transmitted through multiple generations).

ADD REPLYlink modified 8 months ago • written 8 months ago by genomax59k

Aside from the conceptual answers on what is somatic and germline, I would very much like to know how software can tell germline and somatic variants apart. I know that tools are specific for either somatic or germline variant calling, but other than that, what are the basic assumptions that these tools rely on?

ADD REPLYlink written 9 weeks ago by leaodel10

Germline: compare child sequence to parents' sequences, infer which allele the child inherited from each parent. Possible alternatives: de novo mutation, mendelian abnormality

Somatic: compare tumor sequence (or sequence from any one cell) from an individual to sequence from normal tissue (or any other cell type) from the same individual. Exclude all variants seen in the germline. Those that remain are somatic variants specific to the first (tumor) cell.
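The "exclude all variants seen in the germline" step can be sketched as a set subtraction. This is only an illustration on invented toy records; the function name is hypothetical, and dedicated somatic callers such as Mutect2 analyze the tumor and normal reads jointly rather than subtracting two finished call sets.

```python
def somatic_candidates(tumor_variants, normal_variants):
    """Return tumor variants that are absent from the matched normal."""
    germline = set(normal_variants)
    return [v for v in tumor_variants if v not in germline]


# Toy, invented records: (chromosome, position, ref, alt)
normal_calls = [("chr1", 1000, "A", "G"), ("chr2", 5000, "CT", "C")]
tumor_calls = [
    ("chr1", 1000, "A", "G"),   # also in the normal: germline
    ("chr2", 5000, "CT", "C"),  # also in the normal: germline (a deletion)
    ("chr7", 2500, "G", "T"),   # tumor only: somatic candidate
]

print(somatic_candidates(tumor_calls, normal_calls))
# [('chr7', 2500, 'G', 'T')]
```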

Tools can't tell things apart, tools are just software and software is dumb. Experiments need to be designed so the results put in context make sense.

ADD REPLYlink modified 9 weeks ago • written 9 weeks ago by RamRS19k

Thanks, RamRS. I do understand that 'software is dumb', but there must be some underlying assumption to call either somatic or germline variants other than experiment design. Otherwise, we wouldn't need distinct software (e.g., HaplotypeCaller for germline and Mutect2 for somatic, both from GATK). Plus, it is not always the case that variant calling will be your end goal, and so your study design might not be 'gold-standard designed' to perform variant calling. I would appreciate other inputs with reasons other than the experimental design to tell apart somatic and germline variant calling.

ADD REPLYlink modified 9 weeks ago • written 9 weeks ago by leaodel10

Wouter's answer pretty much addresses your question with its take on ploidy and tumor purity. See Mutect2's page for a similar description.
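The ploidy assumption can be turned into a rough check: if a site were a heterozygous germline variant in a diploid genome, each read would carry the alternative allele with probability of about 0.5, so a very low alternative read count is hard to explain. The sketch below computes that binomial tail probability with the standard library only. The function names are invented for this illustration, and actual callers like HaplotypeCaller and Mutect2 use considerably more elaborate models (base qualities, mapping artifacts, contamination and so on).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_at_most_alt_reads_if_het(alt_reads, depth):
    """Chance of seeing this few (or fewer) alt reads at a germline het site."""
    return binom_cdf(alt_reads, depth, 0.5)


# 48 of 100 reads support the alt allele: compatible with a germline het.
print(prob_at_most_alt_reads_if_het(48, 100))
# 5 of 100 reads support the alt allele: very unlikely under the het model,
# so a low-frequency (possibly somatic) variant or an artifact is more plausible.
print(prob_at_most_alt_reads_if_het(5, 100))
```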

Honestly, most cancer genomic studies involve matched normal samples, so I don't see why a study would need to be "gold standard designed for variant calling" to contain matched normals.

ADD REPLYlink written 9 weeks ago by RamRS19k

I agree with you. I was referring to germline variant calling; I should've made that clear. My understanding is that you'd need a child-parent comparison if you want to find new variants or in a clinical setup. I just have some RNA-seq samples and I want to compare their SNPs. Bottom line, the basic assumption would be the frequency of the variant, right?

ADD REPLYlink written 9 weeks ago by leaodel10
1

You'd need parents (and ideally a normal non-phenotypic sibling) for germline experiments, yes. That would enable discovery of transmitted and de novo variants, along with refined attribution of phenotypes (although I don't know tools that take normal siblings into account).

I'm not conversant with working on variants found in RNA-seq, so someone else will have to give you feedback on your statement on variant frequency.

ADD REPLYlink written 8 weeks ago by RamRS19k