Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse cellular processes, but determining the function of individual lncRNAs remains a challenge. Recent advances in RNA sequencing (RNA-Seq) and computational methods allow for an unprecedented analysis of such transcripts. Our catalogue unifies previously existing annotation sources with transcripts we assembled from RNA-Seq data across 24 human tissues and cell types.
We want to test whether lncRNA expression is strikingly tissue specific compared to coding genes. I'm using the Jensen-Shannon (JS) divergence to evaluate tissue specificity, following a paper I read recently, "Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses". However, I don't know how to calculate the tissue specificity score with the following Python code:
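    from scipy.stats import entropy
    from numpy.linalg import norm
    import numpy as np

    def JSD(P, Q):
        # Normalize each input vector so that it sums to 1 (L1 norm)
        _P = P / norm(P, ord=1)
        _Q = Q / norm(Q, ord=1)
        # Mixture distribution halfway between P and Q
        _M = 0.5 * (_P + _Q)
        # Mean of the two KL divergences to the mixture.
        # scipy's entropy() uses the natural log unless base= is given.
        return 0.5 * (entropy(_P, _M) + entropy(_Q, _M))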
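One possible way to turn the JS divergence into a per-gene tissue specificity score is sketched below. It assumes the definition as I understand it from the cited paper: a gene's expression vector across tissues is compared against an idealized profile for each tissue (1 in that tissue, 0 elsewhere), the specificity for that tissue is 1 minus the square root of the JS divergence (log base 2), and the gene's score is the maximum over tissues. The function names `jsd` and `tissue_specificity` and the example vectors are hypothetical, and any transformation of the expression values (for example a log transform of FPKM) is left out, so please check the details against the paper's methods before relying on it.

```python
import numpy as np
from scipy.stats import entropy


def jsd(p, q, base=2):
    """Jensen-Shannon divergence between two non-negative vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize to probability distributions
    q = q / q.sum()
    m = 0.5 * (p + q)
    # With base=2 the divergence is bounded between 0 and 1
    return 0.5 * (entropy(p, m, base=base) + entropy(q, m, base=base))


def tissue_specificity(expr):
    """Return (score, tissue_index) for one gene's expression vector.

    Assumed definition: score = max over tissues t of
    1 - sqrt(JSD(expr, e_t)), where e_t is 1 in tissue t and 0 elsewhere.
    """
    expr = np.asarray(expr, dtype=float)
    if expr.sum() <= 0:
        raise ValueError("gene must be expressed in at least one tissue")
    n = expr.size
    scores = np.empty(n)
    for t in range(n):
        e_t = np.zeros(n)
        e_t[t] = 1.0  # idealized profile: expressed only in tissue t
        # Clip at 0 to guard against tiny negative rounding errors
        scores[t] = 1.0 - np.sqrt(max(jsd(expr, e_t), 0.0))
    best = int(np.argmax(scores))
    return float(scores[best]), best


# Hypothetical expression values across 4 tissues
specific_gene = [0.0, 0.0, 5.0, 0.0]  # expressed in a single tissue
uniform_gene = [1.0, 1.0, 1.0, 1.0]   # expressed equally everywhere

print(tissue_specificity(specific_gene))
print(tissue_specificity(uniform_gene))
```

Under this definition a gene expressed in only one tissue scores 1, and a gene expressed evenly across tissues scores much lower. To compare lncRNAs with coding genes, one could apply `tissue_specificity` to every row of an expression matrix (genes by tissues) and compare the two score distributions.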