I am very new to bioinformatics; my background is purely in computer science. I know only secondary-school-level biology, but I am trying to learn. Please bear with me if my question seems foolish or ridiculously simple to other folks.
I would like to estimate the level of coexpression between two lncRNAs (for example, HOTAIR and MALAT1) using the Pearson correlation coefficient (PCC). The reference dataset is from NONCODE, which gives the expression profile of hundreds of lncRNAs across 24 different cells/tissues as a vector. However, for each lncRNA name, NONCODE lists several isoforms (each with a distinct NONCODE ID), and each isoform has a different expression value in each of the 24 cells/tissues. I have no idea which isoform is relevant to my study. Picking one at random seems illogical, because the expression values vary hugely from isoform to isoform. How should I resolve this?
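To make the isoform problem concrete, here is a minimal sketch of what I mean: computing the PCC for every pairing of one lncRNA's isoforms with the other's. The function name `pairwise_pcc` and the arrays are my own placeholders, and the numbers are random values, not real NONCODE data. I assume each isoform is one row of 24 expression values.

```python
import numpy as np

def pairwise_pcc(iso_a, iso_b):
    """PCC between every isoform (row) of iso_a and every isoform (row) of iso_b.

    Returns an array of shape (n_isoforms_a, n_isoforms_b).
    An isoform with constant expression across all tissues yields NaN.
    """
    a = np.asarray(iso_a, dtype=float)
    b = np.asarray(iso_b, dtype=float)
    # Centre each isoform profile on its own mean across tissues.
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    # Divide the dot product of centred profiles by the product of their norms.
    a_n = np.linalg.norm(a_c, axis=1, keepdims=True)
    b_n = np.linalg.norm(b_c, axis=1, keepdims=True)
    return (a_c @ b_c.T) / (a_n * b_n.T)

# Placeholder data: 3 isoforms of one lncRNA and 2 of another, 24 tissues each.
rng = np.random.default_rng(0)
hotair_isoforms = rng.gamma(shape=2.0, scale=5.0, size=(3, 24))
malat1_isoforms = rng.gamma(shape=2.0, scale=50.0, size=(2, 24))

pcc = pairwise_pcc(hotair_isoforms, malat1_isoforms)
print(pcc)  # one PCC per (HOTAIR isoform, MALAT1 isoform) pair
```

With real data, the spread of values in this matrix would show how much the answer depends on which isoform is picked.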
Alternatively, if I look up the gene (namely HOTAIR) by the NONCODE gene ID corresponding to the query lncRNA (HOTAIR), the entry is unique. Would it be fair to use the gene-level expression profiles to estimate the coexpression of the lncRNAs that belong to those genes and bear the same names?
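For reference, the gene-level calculation I have in mind is roughly the following sketch. It assumes one 24-element expression vector per gene; `gene_level_pcc` is a hypothetical helper name, and the vectors are random placeholders rather than values from NONCODE.

```python
import numpy as np

def gene_level_pcc(profile_x, profile_y):
    """Pearson correlation between two expression profiles of equal length."""
    x = np.asarray(profile_x, dtype=float)
    y = np.asarray(profile_y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("profiles must be 1-D vectors of the same length")
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal is the PCC.
    return float(np.corrcoef(x, y)[0, 1])

# Placeholder gene-level profiles across 24 cells/tissues (not real data).
rng = np.random.default_rng(1)
hotair_gene = rng.gamma(shape=2.0, scale=5.0, size=24)
malat1_gene = rng.gamma(shape=2.0, scale=50.0, size=24)

pcc_value = gene_level_pcc(hotair_gene, malat1_gene)
print(f"PCC(HOTAIR, MALAT1) = {pcc_value:.3f}")
```

Here the two gene IDs each map to a single vector, so there is exactly one PCC value and no isoform choice to make.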