Question: Rank Normalization of genes in Expression Data
0
4.5 years ago by
Ron990
United States
Ron990 wrote:

Hi all,

I want to do rank-based normalization of RNA-seq expression data: basically, rank the genes in each sample by FPKM and divide each rank by the total number of genes, giving a normalized expression value between 0 and 1.

This has to be done across multiple samples. Is there any method to do this?

I have used the method below, which seems to work, but I want to look at a rank-based method.

```
# Row-wise min-max scaling: each row (gene) is scaled to [0, 1] across samples
gene_min = apply(df, 1, min)
gene_max = apply(df, 1, max)
df_norm = (df - gene_min) / (gene_max - gene_min)
```

Thanks,

Ron

rna-seq R • 3.9k views
modified 4.5 years ago by Kamil2.0k • written 4.5 years ago by Ron990

Hi Ron, what have you tried? A very simple method would be to apply a small filter (threshold) to the FPKM values, scale the genes between 0 and 1, and then just sort; this is very easy to do in R. Let us know if you get stuck and I can write a function for you.

I have added the method I am using, which is a bit different, although it scales values between 0 and 1.

4
4.5 years ago by
Kamil2.0k
Boston
Kamil2.0k wrote:

Perhaps you're trying to do something like this?

```
> mat
          1        2        3        4        5
2  6.890809 6.744169 6.642575 6.649212 6.756785
9  4.303356 4.250599 4.245089 4.193621 4.471561
10 3.739968 3.823797 3.942015 3.850949 3.699985
12 8.237043 8.233598 8.315632 8.354951 8.472915
13 3.051626 2.983962 2.997821 3.017578 2.966767

> apply(mat, 2, function(y) rank(y) / length(y))
     1   2   3   4   5
2  0.8 0.8 0.8 0.8 0.8
9  0.6 0.6 0.6 0.6 0.6
10 0.4 0.4 0.4 0.4 0.4
12 1.0 1.0 1.0 1.0 1.0
13 0.2 0.2 0.2 0.2 0.2
```
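
The same idea could be wrapped in a small function for an FPKM table, as a rough sketch. The matrix `fpkm`, its gene and sample names, and the function name `rank_normalize` are made up for illustration; the sketch assumes genes are in rows and samples in columns, and it uses `ties.method = "average"` so that tied genes (for example, several genes with FPKM 0) get the same value.

```r
# Hypothetical FPKM matrix: rows are genes, columns are samples.
# matrix() fills column by column, so the first four values are sampleA.
fpkm <- matrix(
  c(10.0, 0.0, 5.0, 5.0,
     2.0, 8.0, 0.0, 4.0),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("sampleA", "sampleB"))
)

# Rank genes within each sample (column) and divide by the number of genes.
# Tied genes share the average of their ranks.
rank_normalize <- function(x, ties.method = "average") {
  out <- apply(x, 2, function(y) rank(y, ties.method = ties.method) / length(y))
  dimnames(out) <- dimnames(x)  # keep gene and sample names
  out
}

fpkm_rank <- rank_normalize(fpkm)
print(fpkm_rank)
```

Each column is ranked independently, so values are comparable across samples as within-sample ranks, and they fall in the interval (0, 1]. The most highly expressed gene in a sample gets 1 unless it is tied with another gene.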