Question: How to save the correlation coefficient and p-value of different gene expression profiles to a DataFrame in Python?
12 months ago by
e.mohammadi.as10 wrote:

Hi everyone, I am fairly new to programming, so I have run into a simple problem that is hard for me to resolve. I am trying to get the Pearson/Spearman/Kendall correlation coefficient for 12328 genes across 900 conditions. I can do that in R, but I would like to do it in Python. The first column holds the conditions, which are different treatments on the same cancer cell line, and the other columns are gene expression profiles. So, conditions are in rows and genes are in columns. I used this code to calculate both the correlation and the p-value for each pair of genes:

import pandas as pd
import numpy as np
import scipy
from scipy.stats import pearsonr
from scipy.stats import spearmanr
from scipy.stats import kendalltau

LFC_t=pd.read_csv("book1_t.csv")
column_list= LFC_t.columns
df_out=pd.DataFrame()
c=1
d=1
while c< 12328:
    while d<12328:
        g1=LFC_t[column_list[c]]
        g2=LFC_t[column_list[d]]
        p_r, p_p = pearsonr (g1, g2)
        d=d+1
        df_out=pd.merge[p_r, p_p]
        #df_out=p_r.append(p_p)
    c=c+1

As you can see, I can compute both the correlation (p_r) and the p-value (p_p) for each pair of genes, but I do not know how to save them in a DataFrame, because the data for each new pair overwrites the previous data.

Also, I need to have a file like this:

The first and second columns are the gene pairs, and the third and fourth columns are the corresponding correlations and p-values, respectively.

Thank you very much in advance.


I tried to properly embed your image but imgshare does not seem to work well. Please try a hoster such as imgbb and then use the full link to the image including the suffix, e.g. .jpg.

written 12 months ago by ATpoint44k
12 months ago by
Mensur Dlakic8.2k wrote:
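# Assumes LFC_t, column_list and pearsonr are defined as in the question.
n_cols = len(column_list)  # column 0 holds the condition labels
gene1_list, gene2_list = [], []  # gene names for each pair (column names are my choice)
p_r_list = []
p_p_list = []
c = 1
while c < n_cols:
    d = c + 1  # reset d for every c; Pearson's r is symmetric, so each pair is computed once
    while d < n_cols:
        g1 = LFC_t[column_list[c]]
        g2 = LFC_t[column_list[d]]
        p_r, p_p = pearsonr(g1, g2)
        gene1_list.append(column_list[c])
        gene2_list.append(column_list[d])
        p_r_list.append(p_r)
        p_p_list.append(p_p)
        d = d + 1
    c = c + 1

# Build the DataFrame once, after the loops, instead of overwriting it on every pass
df_out = pd.DataFrame({'gene1': gene1_list, 'gene2': gene2_list,
                       'p_r_values': p_r_list, 'p_p_values': p_p_list})
df_out.to_csv('choose_file_name.csv', index=False)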
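
With 12328 genes the nested loop calls pearsonr tens of millions of times, which is likely to be slow. As a rough sketch of an alternative (not something from the original code), the whole Pearson matrix can be computed in one step with numpy.corrcoef, and the two-sided p-values derived from the t distribution with n - 2 degrees of freedom. The function name pairwise_pearson, the output column names and the synthetic demo data are all made up for illustration, and the sketch assumes the input has no missing values.

```python
import numpy as np
import pandas as pd
from scipy import stats

def pairwise_pearson(df):
    """Long-format table of Pearson r and two-sided p-values for every
    unordered pair of columns in df (rows = conditions, columns = genes)."""
    genes = df.columns.to_numpy()
    n = df.shape[0]  # number of conditions
    r = np.corrcoef(df.to_numpy(dtype=float), rowvar=False)
    i, j = np.triu_indices(len(genes), k=1)  # upper triangle: each pair once
    r_ij = np.clip(r[i, j], -1.0, 1.0)       # guard against rounding past +/-1
    # t statistic for H0: r = 0; |r| = 1 gives an infinite t and p = 0
    with np.errstate(divide="ignore"):
        t = r_ij * np.sqrt((n - 2) / (1.0 - r_ij ** 2))
    p_ij = 2.0 * stats.t.sf(np.abs(t), df=n - 2)
    return pd.DataFrame({"gene1": genes[i], "gene2": genes[j],
                         "correlation": r_ij, "p_value": p_ij})

# Hypothetical demo data: 20 conditions x 4 genes
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(20, 4)),
                    columns=["geneA", "geneB", "geneC", "geneD"])
result = pairwise_pearson(demo)
print(result)
```

For the data in the question, the condition column would have to be dropped first, for example pairwise_pearson(LFC_t.iloc[:, 1:]). Keep in mind that the full table for 12328 genes has roughly 76 million rows, so it may be worth filtering by p-value or correlation before writing it to CSV.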