R object that allows duplicated column names and stores different types of values
6 weeks ago
cwwong13 • 0

The question has two parts but one goal:

To export a gene expression matrix from R that may contain duplicated column names, with the top-left corner cell holding a "column name" for the row names.

I would like to export a matrix/data frame/tibble (whichever of these objects is compatible with this goal) with the following structure:

Symbol tissue1 tissue2 tissue3 tissue2 tissue3
Abc 1 2 3 4 5
Bcd 2 3 4 5 6
Cfd 3 3 3 3 3
...


The row names are unique; only the column names might be duplicated. I would also like the exported txt/csv file to have a name for the row-name column ("Symbol" in the example above). I am also open to other programming languages (e.g. Python), provided I can easily import the data and do some simple data cleaning.
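Since Python is on the table, here is a minimal sketch of the round trip using only the standard-library `csv` module. It assumes a tab-delimited layout like the example above. The helper names `write_table` and `read_table` are made up for illustration, and the values are simply the ones from the example. Because `csv.reader` and `csv.writer` treat the header as an ordinary row, duplicated column names pass through untouched.

```python
import csv
import io

# Header copied from the example layout; "Symbol" labels the row-name column.
header = ["Symbol", "tissue1", "tissue2", "tissue3", "tissue2", "tissue3"]
rows = [
    ["Abc", 1, 2, 3, 4, 5],
    ["Bcd", 2, 3, 4, 5, 6],
    ["Cfd", 3, 3, 3, 3, 3],
]


def write_table(handle, header, rows):
    """Write the header and rows as tab-delimited text (hypothetical helper)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def read_table(handle):
    """Read the table back; the header is returned as a plain list, duplicates included."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # First field is the gene symbol, the remaining fields are integer values.
    body = [[row[0]] + [int(v) for v in row[1:]] for row in reader]
    return header, body


# An in-memory buffer stands in for a real file opened with open(..., newline="").
buffer = io.StringIO()
write_table(buffer, header, rows)
buffer.seek(0)
header_back, rows_back = read_table(buffer)
```

Keeping the header as a list rather than as dictionary keys is what lets the two `tissue2` and two `tissue3` columns coexist.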

I found that a matrix can store duplicated column names, but I cannot add a column name for the row names, whereas a tibble can have a column name for the row names but not duplicated column names.

Thanks!

R python • 147 views

See if this is what you want:

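data(iris)
names(iris)
# Give the first four columns the same name
names(iris)[1:4] <- rep("Petal.Width", 4)
library(magrittr)  # provides %>%
library(tibble)

# .name_repair = "minimal" keeps the duplicated names as they are
iris <- iris %>%
  as_tibble(.name_repair = "minimal") %>%
  rownames_to_column(var = "test")

write.csv(iris, "test.txt", row.names = FALSE)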


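The first four columns have the same name. Both writing and loading work (when reading back with read.csv(), pass check.names = FALSE, otherwise the duplicated names are made unique). However, downstream processing may be difficult with identically named columns.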
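To illustrate the downstream difficulty, here is a small sketch in Python (which the question also allows), using only the standard-library `csv` module. The sample text and the helper `values_for` are assumptions made for illustration. Anything that keys columns by name, such as `csv.DictReader`, keeps only one value per duplicated name, so duplicates have to be looked up by position instead.

```python
import csv
import io

# One header line and one data row, mirroring the layout in the question.
text = "Symbol\ttissue1\ttissue2\ttissue3\ttissue2\ttissue3\nAbc\t1\t2\t3\t4\t5\n"

# DictReader keys each row by column name, so duplicated names collapse
# to a single key and the later column overwrites the earlier one.
record = next(csv.DictReader(io.StringIO(text), delimiter="\t"))

# A plain reader keeps every column in order.
reader = csv.reader(io.StringIO(text), delimiter="\t")
header = next(reader)
first_row = next(reader)


def values_for(name, header, row):
    """Return every value in `row` whose column is called `name` (hypothetical helper)."""
    return [value for col, value in zip(header, row) if col == name]


tissue2_values = values_for("tissue2", header, first_row)
```

Here `record` ends up with four keys instead of six columns, while `tissue2_values` holds both `tissue2` entries.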


Thanks for pointing out the .name_repair option. That works perfectly!


You may have to customize the R matrix or tibble class to allow the feature you need. This will break many downstream functions though, so watch out for that.