I have some samples in a data frame with different metrics. They were repeatedly sequenced in the lab to get better QC metrics.
When I read in the data frame using
df <- read.csv("samples.csv")
R makes the repeated column names unique by appending a numeric suffix, like this:
SMP000113706 SMP000113706.1 SMP000113707 SMP000113707.1
I want to keep only the duplicated columns and drop all of the unique ones. Almost 190 of the 500 columns in the data frame are duplicated, with different numbers and identifiers, and there is no pattern to them. How do I retain only the duplicated columns?
SMP000114738  SMP000114739  SMP000114740  SMP000114741  SMP000114982  SMP000114982.1
1.217835036   1.2085439     2.81750655    1.5034578     0.000214017   0.000224536
1.217835036   1.2085439     2.81750655    1.5034578     0.000214017   0.000224536
0.007330334   0.1168343     0.02292839    0.3406125     0.348659681   0.425420762
In this example, I want to retain only SMP000114982 and SMP000114982.1.
Thank you for the help.
I'd probably just do
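As a minimal sketch of one possible approach in base R (an illustration, not necessarily the exact code meant here): strip the numeric suffix that `read.csv` appended, then keep every column whose base name occurs more than once. It assumes the original sample IDs never end in a dot followed by digits themselves, and it uses a small toy data frame built from the values in the question in place of `samples.csv`.

```r
# Toy data standing in for read.csv("samples.csv"); values come from the question.
df <- data.frame(
  SMP000114741   = c(1.5034578, 1.5034578, 0.3406125),
  SMP000114982   = c(0.000214017, 0.000214017, 0.348659681),
  SMP000114982.1 = c(0.000224536, 0.000224536, 0.425420762)
)

# Recover the original sample ID by dropping a trailing ".1", ".2", ...
# (assumption: real sample IDs do not end in ".<digits>").
base <- sub("\\.[0-9]+$", "", names(df))

# TRUE for every column whose sample ID appears more than once,
# including the first (unsuffixed) occurrence.
is_dup <- base %in% base[duplicated(base)]

# drop = FALSE keeps a data frame even if only one column is selected.
dups <- df[, is_dup, drop = FALSE]

names(dups)
# [1] "SMP000114982"   "SMP000114982.1"
```

If the suffix assumption is a concern, reading the file with `read.csv("samples.csv", check.names = FALSE)` leaves the repeated names untouched, so the duplicates can be found with `duplicated(names(df)) | duplicated(names(df), fromLast = TRUE)` instead of a regular expression.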