An individual gene falls in different GISTIC peaks
2.2 years ago
I am tracking genes in the GISTIC output from my own analysis and from a paper on the same cancer that used basically the same raw data, but for some genes, such as KRAS and ERBB2, the peak they fall in differs between my GISTIC output and the paper.

For instance in my data

KRAS chr12:25340002-26090000 12p12.1


In the paper

12p11.23 chr12:22440282-27280000


My -ta and -td thresholds in GISTIC were both 0.5.

Actually I don't know how to explain this

Can you help me?

2.2 years ago

Hey A, all good?

There are many genes in that interval, including KRAS:

I do not understand what this is: 12p11.23 2.9593e-26 4.2718e-26 chr12:22440282-27280000

Kevin

Thank you, I hope you are safe and well. Sorry, that was my typo. My confusion is: what in GISTIC defines the peak boundaries? I mean, KRAS is located in different peaks in my data and in that paper.

It is difficult to say... there are many factors to consider, but mainly the marker density. Are you sure that you have processed the data in the exact same way as the authors?
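To make the comparison concrete, here is a minimal sketch that checks how the two regions quoted in this thread relate to each other. It uses only the coordinates posted above; the helper names (`Region`, `parse_region`, `overlap_bp`, `contains`) are made up for illustration, and it assumes both regions are on the same genome build and use 1-based inclusive coordinates.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


def parse_region(text: str) -> Region:
    """Parse a 'chr12:100-200' style string into a Region."""
    chrom, span = text.split(":")
    start, end = span.split("-")
    return Region(chrom, int(start), int(end))


def overlap_bp(a: Region, b: Region) -> int:
    """Number of shared bases, assuming 1-based inclusive coordinates."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def contains(outer: Region, inner: Region) -> bool:
    """True if 'inner' lies entirely within 'outer'."""
    return (
        outer.chrom == inner.chrom
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


# Coordinates exactly as quoted in the thread.
mine = parse_region("chr12:25340002-26090000")
paper = parse_region("chr12:22440282-27280000")

print("shared bases:", overlap_bp(mine, paper))
print("my region inside the paper's region:", contains(paper, mine))
```

With these coordinates the region from my own run sits entirely inside the wider region reported in the paper, so the two outputs may be describing the same event with different boundaries rather than two unrelated peaks.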

Actually, I prepared the GISTIC inputs (the markers file and the segmentation file) with a script they sent to me.

The only thing that may differ is the -ta and -td thresholds for GISTIC.

I have been trying different thresholds for more than a month, but I still do not see agreement.
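For a feel of what a threshold of 0.5 means, here is a rough sketch that converts a log2-ratio threshold into an approximate copy number. It assumes the segment values given to GISTIC are log2 ratios relative to a diploid baseline of 2 copies, and that -td is given as a positive number and applied to the negative side; the function name is made up for illustration, and the 0.1 used for comparison is what I understand to be the documented default, so please check it against your GISTIC version.

```python
def log2_ratio_to_copy_number(log2_ratio: float, baseline: float = 2.0) -> float:
    """Convert a log2 ratio to an approximate absolute copy number.

    Assumes the ratio is relative to a diploid baseline (2 copies).
    """
    return baseline * 2.0 ** log2_ratio


# Thresholds used in this thread (-ta 0.5, -td 0.5) next to 0.1 for comparison.
for threshold in (0.5, 0.1):
    amp_cn = log2_ratio_to_copy_number(threshold)
    del_cn = log2_ratio_to_copy_number(-threshold)
    print(
        f"threshold {threshold}: amplified above ~{amp_cn:.2f} copies, "
        f"deleted below ~{del_cn:.2f} copies"
    )
```

Under these assumptions, 0.5 only counts segments above roughly 2.83 copies or below roughly 1.41 copies, whereas 0.1 counts anything beyond about 2.14 or 1.87 copies. A difference like that in which segments are counted could plausibly shift peak boundaries between two runs.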

For instance, in responders to chemotherapy I am seeing amplification of an oncogene that promotes the cell cycle.

This goes against the biology.

Although in RNA-seq I am also seeing up-regulation of this cell-cycle-promoting oncogene in responders to chemotherapy.

It is really hard to work out where I am doing something wrong.

So, this relates to here: Explaining relation of WGS and transcriptome ?

It is possible to observe deletion and increased expression at the same time when you consider that these are bulk tumour biopsies; so, different cells have different profiles. The deletion does not have complete penetrance (it is not present in all tumour cells), and neither is the increased expression present in all cells.

I think that it will be virtually impossible to perfectly agree with the data of the authors when you consider the fact that program/package versions have likely changed, and even the source of the data may have changed or been updated. The TCGA data is still constantly being updated.

Thank you. Yes, to some extent this post is related to that link.

Here my concern was a gene falling in different peaks; there it was the disagreement of WGS and RNA-seq with the biological concept.

Sorry @Kevin Blighe if I was too rude in my questioning.

Following your tutorial, I mapped gene symbols to the genomic aberration regions, and I have something like the below:

SampleID    Chromosome  Start   End Total_CN    Minor_CN    Plodidy GeneSymbol  Chr Start   End
LP6005500-DNA_D03   1   10583   5726928 2   0   2.663336    DDX11L1 1   11869   14412
LP6008334-DNA_B02   1   10583   3457311 2   0   1.863911    DDX11L1 1   11869   14412
LP6008334-DNA_C02   1   10583   62271   2   0   3.979128    DDX11L1 1   11869   14412


My understanding is that minor copy number = 0 means loss of heterozygosity (LOH).

In the attached file I separated the samples with minor copy number = 0. So, if I am not wrong, all of these genes have LOH, but this seems weird.

Can you help me with tuition?

https://www.dropbox.com/scl/fi/aqttu6fhteaomnon3msfv/Minor_Copy_number.xlsx?dl=0&rlkey=8xtx9o2lrix25x1m23b64b728

I see... that tutorial is my modified version of a tutorial produced by the TCGAbiolinks people.

In the attached file I separated the samples with minor copy number = 0. So, if I am not wrong, all of these genes have LOH, but this seems weird.

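A total copy number of 0 would be a full deletion, so the DNA would not even exist in the sample (?). In the rows that you posted, Total_CN is 2 and only Minor_CN is 0, which fits your reading of LOH rather than a deletion.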
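The difference between a total copy number of 0 and a minor copy number of 0 can be sketched as below. This is only an illustration of the usual allele-specific reading (major = total - minor), not part of the tutorial; the function name and the labels are made up, and "copy-neutral" here means relative to a diploid baseline of 2 copies, which may not be the right reference for samples whose ploidy is far from 2.

```python
def classify_segment(total_cn: int, minor_cn: int) -> str:
    """Label a segment from its total and minor allele copy numbers."""
    major_cn = total_cn - minor_cn
    if minor_cn < 0 or minor_cn > major_cn:
        raise ValueError("minor_cn must be between 0 and total_cn / 2")
    if total_cn == 0:
        # Neither allele is present: the DNA is absent in the sample.
        return "homozygous deletion"
    if minor_cn == 0:
        # One allele is lost, the other is retained (and possibly duplicated).
        return "copy-neutral LOH" if total_cn == 2 else "LOH"
    return "heterozygous"


# (Total_CN, Minor_CN) pairs; the first one matches the rows posted above.
for total_cn, minor_cn in [(2, 0), (0, 0), (1, 0), (3, 1)]:
    print(total_cn, minor_cn, classify_segment(total_cn, minor_cn))
```

Read this way, the posted rows with Total_CN = 2 and Minor_CN = 0 would be labelled as copy-neutral LOH, while only a total of 0 would mean that the DNA is missing.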

Please go back to the start of your project and go 'step by step', analysing the inputs and outputs of each part.

I feel sorry that the TCGA copy number data, specifically, causes so much confusion in the community.

Can you help me with tuition?

This is difficult while remote, and while I have to do my other work for which I receive payment.

I see, thank you so much