How to handle multiple mapping of Gene-Symbols and Probe-ID in Micro-array Data
3.9 years ago
sp29 ▴ 50

After performing the differential expression analysis with limma and mapping the results to the feature data, I got the following data frame:

FDR    Probe_ID    Gene.Symbol              Gene.ID
0.009  1555272_at  RSPH10B2///RSPH10B       728194///222967
0.007  1557203_at  PABPC1L2B///PABPC1L2A    645974///340529
0.007  1557384_at  LOC100506639///ZNF131    100506639///7690

The R code to create this data frame is:

df <- data.frame(
  FDR         = c(0.009, 0.007, 0.007),
  Probe_ID    = c("1555272_at", "1557203_at", "1557384_at"),
  Gene.Symbol = c("RSPH10B2///RSPH10B", "PABPC1L2B///PABPC1L2A", "LOC100506639///ZNF131"),
  Gene.ID     = c("728194///222967", "645974///340529", "100506639///7690")
)

I want to run GSEA using the df$Gene.Symbol column. However, some probe IDs map to more than one gene symbol, so I split those rows with:

library(dplyr)  # %>%
library(tidyr)  # separate_rows()
df_split <- as.data.frame(df %>% separate_rows(Gene.Symbol, Gene.ID, sep = "///"))

But this gives me repeated gene symbols. What is the correct way to resolve this so that df$Gene.Symbol contains only unique gene symbols? I don't want to use an online tool, because I am coding the microarray pipeline myself as part of my project.

3.8 years ago
sp29 ▴ 50

# Merge rows that share a gene symbol
library(dplyr)
library(stringr)

annotated <- as.data.frame(annotated %>%
  group_by(Gene.symbol) %>%
  # keep only genes whose probes all agree on the direction of change
  filter(n_distinct(sign(logFC)) == 1) %>%
  # average the statistics and concatenate the annotation columns
  summarise(across(c("logFC", "P.Value", "adj.P.Val", "B", "AveExpr", "t"), mean),
            X = str_c(X, collapse = " | "),
            Gene.title = str_c(Gene.title, collapse = " | "),
            Gene.ID = str_c(Gene.ID, collapse = " | "),
            GenBank.Accession = str_c(GenBank.Accession, collapse = " | ")))
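
To make the effect of the merging step easier to see, here is a minimal, self-contained sketch on a toy table. The probe IDs, gene names and values are made up for illustration, and only two of the statistic columns are included; the "///" split from the question is applied first so that multi-gene probes contribute one row per symbol. This follows the same idea as the code above, not necessarily the only sensible way to collapse probes.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy limma-style table (hypothetical values):
#   GENE_A has two probes that agree in direction,
#   GENE_B has two probes that disagree,
#   the last probe maps to two genes at once.
annotated <- data.frame(
  X           = c("p1_at", "p2_at", "p3_at", "p4_at", "p5_at"),
  Gene.symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C///GENE_D"),
  logFC       = c(1.0, 2.0, 1.5, -0.5, -1.0),
  adj.P.Val   = c(0.01, 0.03, 0.02, 0.04, 0.05)
)

merged <- annotated %>%
  # one row per gene symbol for probes that map to several genes
  separate_rows(Gene.symbol, sep = "///") %>%
  group_by(Gene.symbol) %>%
  # drop genes whose probes point in opposite directions
  filter(n_distinct(sign(logFC)) == 1) %>%
  # average the statistics and record which probes were merged
  summarise(
    across(c(logFC, adj.P.Val), mean),
    X = str_c(X, collapse = " | "),
    .groups = "drop"
  ) %>%
  as.data.frame()

print(merged)
```

In this toy case GENE_B is removed because its two probes disagree in sign, GENE_A is collapsed to a single row built from both of its probes, and GENE_C and GENE_D each get their own row, so every gene symbol in the result is unique.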
