Question

How to detect cis and trans protein-coding genes of lncRNAs of interest?

0

3.9 years ago

newbie ▴ 120

Hello Everyone,

I have 10 tumor samples and I'm interested in identifying lncRNA function. I see that for WGCNA, the minimum sample size should be 20.

So, to identify the function of the lncRNAs, I initially wanted to detect the cis and trans protein-coding genes of my lncRNAs of interest, compute Pearson / Spearman correlations with those genes, select genes with p-value < 0.05 and correlation coefficient ≥ 0.5, and then do a pathway analysis.

Are there any tools through which I can get the cis and trans genes of lncRNAs? If yes, could you please show me an example of how to do that?

Any help is appreciated.

RNA-Seq ChIP-Seq lncrna wgcna • 1.0k views

ADD COMMENT • link updated 3.9 years ago by i.sudbery 19k • written 3.9 years ago by newbie ▴ 120

score 0 · Answer 1 · 2020-06-04

0

3.9 years ago

i.sudbery 19k

Unfortunately, for the majority of lincRNA-dependent regulatory mechanisms, there is no way that I am aware of to do this bioinformatically, unless you are super lucky and someone has already done a ChRIP-seq, CHART-seq or deletion experiment.

ADD COMMENT • link 3.9 years ago by i.sudbery 19k

0

I have just seen that in the paper "Long non-coding RNAs defining major subtypes of B cell precursor acute lymphoblastic leukemia" they detected them as follows:

Functional predictions using guilt-by-association approach

In our study, we used the “guilt-by-association” approach by establishing the pairwise expression correlations between DE lncRNAs (from all BCP-ALL subtypes) and its cis and trans protein-coding (PC) genes in order to predict the functions of subtype-specific lncRNAs. We determined the cis and trans PC genes of DE lncRNAs using the GREAT tool (version v3.0.0). All PC genes from GENCODE v19 annotation (n = 20,698) were used in the analysis. The individual cis and trans genes for each DE lncRNAs were located within a genomic window of 100 kb and greater than 100 kb, respectively. From each dataset, we then computed the pairwise expression correlation using Pearson correlation method between each lncRNAs and its cis and trans coding gene. The significantly co-expressed PC genes (Pearson correlation coefficient ≥ 0.55 and two-tailed P value ≤ 0.05) were further used for functional enrichment analysis using GeneSCF v1.0. The functional enrichment analysis was performed using the KEGG database with a background of all protein-coding genes from GENCODE v19 [34] (20,345). The functional terms were considered significant only if it is enriched with P value ≤ 0.05.

Do you think this is the right way to detect them?
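For reference, the 100 kb cis/trans split described in the quoted methods could be sketched roughly as below. This is not the GREAT tool and does not reproduce its regulatory-domain rules. The `Feature` class, the `gap` and `classify` helpers, the coordinates, and the choice to label genes on other chromosomes as trans are all assumptions made for illustration.

```python
from dataclasses import dataclass

CIS_WINDOW = 100_000  # 100 kb, the window size given in the quoted methods


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical container, coordinates in bp)."""
    name: str
    chrom: str
    start: int
    end: int


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two intervals on one chromosome; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def classify(lnc: Feature, gene: Feature, window: int = CIS_WINDOW) -> str:
    """Label a gene 'cis' if it lies within `window` bp of the lncRNA, else 'trans'.

    Assumption: genes on a different chromosome are also labelled 'trans'.
    """
    if lnc.chrom == gene.chrom and gap(lnc, gene) <= window:
        return "cis"
    return "trans"


# Made-up coordinates, for illustration only.
lnc = Feature("lncA", "chr1", 1_000_000, 1_005_000)
genes = [
    Feature("geneNear", "chr1", 1_050_000, 1_060_000),
    Feature("geneFar", "chr1", 1_200_000, 1_210_000),
    Feature("geneOtherChrom", "chr2", 1_000_000, 1_010_000),
]
labels = {g.name: classify(lnc, g) for g in genes}
```

In practice the intervals would come from an annotation such as the GENCODE GTF, and the labels would only define which lncRNA-gene pairs go into the correlation step.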

ADD REPLY • link 3.9 years ago by newbie ▴ 120

2

This is just a poor man's WGCNA.

WGCNA is just Pearson correlation, but with a non-linear transformation applied to the correlation coefficient.
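A minimal sketch of that transformation, assuming the unsigned soft-thresholding form in which the adjacency is the absolute correlation raised to a power `beta`. The function names and the value `beta=6` are illustrative choices, and the later WGCNA steps (topological overlap, clustering into modules) are left out.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold(r, beta=6):
    """Unsigned soft-thresholded adjacency: |r| ** beta."""
    return abs(r) ** beta


# Weak correlations are pushed towards 0 far more strongly than strong ones.
weak = soft_threshold(0.5)
strong = soft_threshold(0.9)
```

With `beta=6`, a correlation of 0.5 becomes about 0.016 while 0.9 stays above 0.5, which is the sense in which the transformation is non-linear.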

If I were a reviewer I would want to see:

* A negative control - what happens if you shuffle the sample labels for the lncRNA measurements, while keeping the sample labels for the protein-coding genes the same?
* An FDR calculation. You could either use shuffling as above (ideally) or you could use BH (Benjamini-Hochberg).
* A recognition that correlation does not imply causation - how do you know you are finding protein-coding genes that are regulated BY the lincRNA, and not protein-coding genes that regulate the lincRNA? You are generating hypotheses here, rather than testing them.
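The first two points could be sketched as follows, using only the Python standard library. The helper names, the number of permutations, and the toy expression values are assumptions for illustration. The empirical p-value here comes from shuffling the lncRNA values across samples, and the adjustment is the standard Benjamini-Hochberg step-up procedure.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(lnc, gene, n_perm=1000, seed=0):
    """Empirical p-value for |r|: shuffle the lncRNA values, keep the gene fixed."""
    rng = random.Random(seed)
    observed = abs(pearson(lnc, gene))
    shuffled = list(lnc)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(shuffled, gene)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: 10 samples, one lncRNA and one perfectly co-expressed gene.
lnc_expr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
gene_expr = list(lnc_expr)
p_empirical = permutation_pvalue(lnc_expr, gene_expr)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With only 10 samples the set of possible shuffles is limited, so the empirical p-values are coarse. Neither step addresses the third point about the direction of regulation.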

ADD REPLY • link 3.9 years ago by i.sudbery 19k

0

I actually have 10 tumor and 10 normal samples. Can I use all these 20 samples for WGCNA? I'm interested in finding function of genes related to tumor samples.

1) If I use both tumor and normal samples for WGCNA, how do I know that the module genes of interest and their pathways are related to tumor?

ADD REPLY • link 3.9 years ago by newbie ▴ 120

0

You don't. As I pointed out above, co-expression analysis, and by extension WGCNA, is only ever a hypothesis-generating technique. This applies whether you use only tumour samples, only normal samples, or both.

If you find modules containing well-known pathways and also your lincs of interest, you've generated a hypothesis that these lincs are connected to that pathway, but you are a long way from proving it.

ADD REPLY • link 3.9 years ago by i.sudbery 19k