I am doing an RNA-seq analysis. I have two CSV files, and I want to build a data set of my genes, filtered by p-value, to see which genes in each CSV file are upregulated. I generated the differential expression tables with edgeR in RStudio; the script is below. I run the same code on both of my gene count tables, which were obtained with HTSeq. First, I want to find the genes that are common to the two tables. Then I want to use a Python script to build a set or table that tells me, for those shared genes, which are upregulated in each table and which are not. The p-value cutoff I use is < 0.05. How can I do this in Python? One more point I should mention: the gene order (first column) is not the same in the two tables.
library(edgeR)
group <- factor(c(rep("pro-vaccine", 30), rep("pre_vaccine", 20)))
x <- read.csv("table_50_ID.csv")
y <- DGEList(counts = x, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~group)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)
deg <- topTags(lrt, n = Inf, p.value = 0.05)$table
up <- deg[deg$logFC > 0, ]
down <- deg[deg$logFC < 0, ]
write.csv(up, file = "up.csv")
write.csv(down, file = "down.csv")
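To make the goal concrete, here is a minimal Python sketch of the comparison I have in mind, not a definitive implementation. It assumes each edgeR result table was written with write.csv, so the gene ID is in the first (unnamed) column and the columns logFC and PValue are present. The function names and the toy rows are made up for illustration. Genes are matched by ID, so the row order in the two tables does not matter. Note that the p.value argument of topTags filters on adjusted p-values, so passing p_column="FDR" may be the closer match to the R script.

```python
import csv
import io

P_CUTOFF = 0.05  # significance threshold from the question


def read_edger_table(handle, p_column="PValue"):
    """Return {gene_id: 'up' | 'down' | 'ns'} for one edgeR result table."""
    reader = csv.reader(handle)
    header = next(reader)
    # Locate the value columns by name; the gene ID is taken from the
    # first column, whose header write.csv leaves empty.
    logfc_idx = header.index("logFC")
    p_idx = header.index(p_column)
    calls = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        log_fc = float(row[logfc_idx])
        p_value = float(row[p_idx])
        if p_value < P_CUTOFF and log_fc > 0:
            calls[row[0]] = "up"
        elif p_value < P_CUTOFF and log_fc < 0:
            calls[row[0]] = "down"
        else:
            calls[row[0]] = "ns"  # not significant at the cutoff
    return calls


def compare_tables(calls_a, calls_b):
    """List (gene, call_in_a, call_in_b) for genes present in both tables."""
    common = sorted(set(calls_a) & set(calls_b))
    return [(gene, calls_a[gene], calls_b[gene]) for gene in common]


# Toy data (invented) standing in for the two real CSV files.
HEADER = '"","logFC","logCPM","LR","PValue","FDR"\n'
table_a = io.StringIO(
    HEADER
    + '"geneA",2.1,5.0,20.3,0.0001,0.001\n'
    + '"geneB",-1.4,4.2,9.8,0.002,0.01\n'
    + '"geneC",0.3,6.1,0.5,0.48,0.6\n'
)
table_b = io.StringIO(
    HEADER
    + '"geneC",1.8,6.0,6.6,0.01,0.04\n'
    + '"geneA",-0.9,5.1,4.7,0.03,0.06\n'
    + '"geneD",0.2,3.3,0.1,0.75,0.8\n'
)

result = compare_tables(read_edger_table(table_a), read_edger_table(table_b))
for gene, call_a, call_b in result:
    print(gene, call_a, call_b)
```

With real files, the io.StringIO objects would be replaced by something like open("table1_results.csv", newline=""), where the file name is hypothetical.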
Here are the shapes of my tables in RStudio:
Thanks in advance