Question: How to detect cis and trans protein-coding genes of lncRNAs of interest?
newbie90 wrote (7 months ago):

Hello Everyone,

I have 10 tumor samples and I'm interested in identifying lncRNA function. I see that for WGCNA the minimum sample size should be 20.

So, to identify the function of these lncRNAs, I initially wanted to detect the cis and trans protein-coding genes of my lncRNAs of interest, compute Pearson or Spearman correlations with those genes, select the genes with p-value < 0.05 and correlation coefficient ≥ 0.5, and then do a pathway analysis.

Are there any tools that can give me the cis and trans genes of lncRNAs? If yes, could you please show me an example of how to get them?

Any help is appreciated.

Tags: lncrna rna-seq chip-seq wgcna
i.sudbery10k (Sheffield, UK) wrote (7 months ago):

Unfortunately, for the majority of lincRNA-dependent regulatory mechanisms there is no way that I am aware of to do this bioinformatically, unless you are very lucky and someone has already done a ChRIP-seq, CHART-seq or deletion experiment.


I have just seen that the paper "Long non-coding RNAs defining major subtypes of B cell precursor acute lymphoblastic leukemia" detected them as follows:

Functional predictions using guilt-by-association approach

In our study, we used the “guilt-by-association” approach by establishing the pairwise expression correlations between DE lncRNAs (from all BCP-ALL subtypes) and its cis and trans protein-coding (PC) genes in order to predict the functions of subtype-specific lncRNAs. We determined the cis and trans PC genes of DE lncRNAs using the GREAT tool (version v3.0.0). All PC genes from GENCODE v19 annotation (n = 20,698) were used in the analysis. The individual cis and trans genes for each DE lncRNAs were located within a genomic window of 100 kb and greater than 100 kb, respectively. From each dataset, we then computed the pairwise expression correlation using Pearson correlation method between each lncRNAs and its cis and trans coding gene. The significantly co-expressed PC genes (Pearson correlation coefficient ≥ 0.55 and two-tailed P value ≤ 0.05) were further used for functional enrichment analysis using GeneSCF v1.0. The functional enrichment analysis was performed using the KEGG database with a background of all protein-coding genes from GENCODE v19 [34] (20,345). The functional terms were considered significant only if it is enriched with P value ≤ 0.05.
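The 100 kb window rule quoted above can be sketched in a few lines of Python. This is only a rough illustration of the distance rule, not GREAT's own algorithm (which, as far as I know, assigns regulatory domains around transcription start sites). The names `Feature`, `gap_bp` and `split_cis_trans`, the coordinates, and the choice to count genes on other chromosomes as trans are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Feature(NamedTuple):
    """A genomic interval (hypothetical structure for this sketch)."""
    name: str
    chrom: str
    start: int
    end: int


def gap_bp(a: Feature, b: Feature) -> Optional[int]:
    """Gap in bp between two features; 0 if they overlap, None if on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def split_cis_trans(lnc: Feature, genes: List[Feature],
                    window: int = 100_000) -> Tuple[List[str], List[str]]:
    """Call a gene cis if it lies within `window` bp of the lncRNA, otherwise trans."""
    cis, trans = [], []
    for gene in genes:
        d = gap_bp(lnc, gene)
        if d is not None and d <= window:
            cis.append(gene.name)
        else:
            # Assumption: genes on other chromosomes are counted as trans.
            trans.append(gene.name)
    return cis, trans


# Made-up coordinates, for illustration only.
lnc = Feature("lncX", "chr1", 1_000_000, 1_005_000)
genes = [
    Feature("geneA", "chr1", 1_050_000, 1_060_000),  # 45 kb away
    Feature("geneB", "chr1", 2_000_000, 2_010_000),  # 995 kb away
    Feature("geneC", "chr2", 1_000_000, 1_010_000),  # different chromosome
]
cis_genes, trans_genes = split_cis_trans(lnc, genes)
print("cis:", cis_genes, "trans:", trans_genes)
```

Note that under this rule every protein-coding gene outside the window is a trans candidate, so the trans set is essentially the rest of the genome and the correlation filter does all of the work.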

Do you think this is the right way to detect them?

written 7 months ago by newbie90 (modified 7 months ago)

This is just a poor man's WGCNA.

WGCNA is just Pearson correlation, but with a non-linear transformation applied to the correlation coefficient.
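To make the "non-linear transformation" concrete: for an unsigned network, WGCNA turns each correlation r into an adjacency |r|^β (soft thresholding), which pushes weak correlations towards zero while keeping strong ones. A minimal sketch in plain Python follows; the function names are hypothetical, and β = 6 is only a commonly used value, not one chosen for any particular dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold(r, beta=6):
    """Unsigned soft-threshold adjacency: |r| raised to the power beta."""
    return abs(r) ** beta


# A correlation of 0.5 shrinks to 0.5 ** 6 = 0.015625,
# while a correlation of 0.9 stays at roughly 0.53.
for r in (0.5, 0.9):
    print(r, soft_threshold(r))
```

The real package goes on to compute topological overlap and cluster genes into modules, which this sketch does not attempt.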

If I were a reviewer I would want to see:

* A negative control: what happens if you shuffle the sample labels for the lncRNA measurements, while keeping the sample labels for the protein-coding genes the same?
* Make sure you calculate the FDR. You could either use shuffling as above (ideally) or you could use BH.
* A recognition that correlation does not imply causation: how do you know you are finding protein-coding genes that are regulated BY the lincRNA, and not protein-coding genes that regulate the lincRNA? You are generating hypotheses here, rather than testing them.
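The first two points can be sketched in plain Python as follows. This is a minimal illustration, not a validated pipeline: the function names and toy expression values are made up, and combining a per-gene shuffling p-value with a BH adjustment is just one way of putting the two suggestions together.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def shuffle_pvalue(lnc, gene, n_perm=999, seed=0):
    """Empirical two-sided p-value from shuffling the lncRNA sample labels.

    The gene vector keeps its sample order; only the lncRNA values move.
    """
    rng = random.Random(seed)
    observed = abs(pearson(lnc, gene))
    shuffled = list(lnc)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(shuffled, gene)) >= observed:
            hits += 1
    # Add one to both counts so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """BH-adjusted p-values, returned in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: 10 samples, one lncRNA, three protein-coding genes (made-up values).
lnc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
genes = {
    "geneA": [2 * v + 1 for v in lnc],                              # tracks the lncRNA
    "geneB": [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0, 10.0, 6.0],  # jumbled
    "geneC": [3.0, 3.5, 2.0, 4.0, 1.0, 5.0, 2.5, 4.5, 3.0, 2.0],   # jumbled
}
pvals = [shuffle_pvalue(lnc, g) for g in genes.values()]
for name, p, q in zip(genes, pvals, benjamini_hochberg(pvals)):
    print(name, round(pearson(lnc, genes[name]), 3), p, q)
```

With only 10 samples the correlation estimates are noisy, so the adjusted values are best read as a ranking of hypotheses rather than as firm calls, which ties back to the third point.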

written 7 months ago by i.sudbery10k

I actually have 10 tumor and 10 normal samples. Can I use all 20 samples for WGCNA? I'm interested in finding the function of genes related to the tumor samples.

If I use both tumor and normal samples for WGCNA, how do I know that the genes in a module of interest, and their pathways, are related to the tumor?

written 7 months ago by newbie90

You don't. As I pointed out above, co-expression analysis, and by extension WGCNA, is only ever a hypothesis-generating technique. This applies whether you use only tumour samples, only normal samples, or both.

If you find modules containing well-known pathways and also your lincs of interest, you've generated a hypothesis that these lincs are connected to that pathway, but you are a long way from proving it.

written 7 months ago by i.sudbery10k