Trying to read multiple CSV files in a folder and add filename to each column in R
7 weeks ago
mxz707 • 0

Hello Everyone,

I have multiple CSV files (27 files of extracted features from MRI patients). Each CSV file has a different number of rows but the same set of columns. I want to read each file and append the file name to each of its column names. For instance, for file 27, named "AX_FLAIR-parcellationROI_27", I want to add "AX_FLAIR-parcellationROI_27" to each column name. To process all the files at once, I put them in a list and use a loop to read each one. I am using the "paste" function to add the new string to each column name, but I get an error.

files <- list.files(pattern = "/Users/mostafa/Documents/Mostafa/UM/Papers/Data/TCIA/TCGA-GBM/RadiomicsData_EC/TCGA-02-0009/RadiomicsFeatures-csv/AX_FLAIR")

for(f in files){
}


Is there a way to fix this last piece of code? Thanks.

R Merging • 768 views

There is an easier way (works in zsh and bash):

for i in *.csv; do awk 'NR==1{gsub(/,|$/,"_"FILENAME","); sub(/,$/,"")}1' "$i" > "${i%.csv}_new.csv"; done

The new files will have a "_new.csv" suffix. It would help if you could post some data. Hopefully the text within quotes does not contain commas.

Thanks for your guidance. Is your command in R? If not, how can I do it in R?

Try this in R:

files = list.files(pattern = ".csv", path = "~/Desktop/")
for (i in files){
    iname=sub("\\.csv","",i)
    f=read.csv(i, header = T)
    colnames(f)=paste(colnames(f),iname,sep = "_")
    assign(paste0(iname,"_new"),f)
}
ls()

This adds the file name to each column name, and each modified table is stored in R as an object for further manipulation. ls() lists the objects; those ending in "_new" have the file name appended to every column name. Change the path in the code to match the data location on your local machine.

Great!! I tried it, but I got the following error:

Error in file(file, "rt") : cannot open the connection

How can I deal with that? PS: I am using a Mac to run the code. Maybe there is a problem with this command:

iname=sub("\\.csv","",i)

Please post the code you used and the tree of the directory from which the code is run. Call getwd() in R to print the active directory.

Thanks, the issue is solved!!

It would be helpful for others if you could post how the issue was resolved.
Here is how the issue was resolved:

files<- list.files(path = "/Users/mostafa/Documents/Mostafa/UM/Papers/Data/TCIA/TCGA-GBM/RadiomicsData_EC/TCGA-02-0009/RadiomicsFeatures-csv/AX_FLAIR_Copy", pattern=".csv", full.names=T)
for (i in files){
    iname=sub("\\.csv","",i)
    f=read.csv(i, header = T)
    colnames(f)=paste(colnames(f),iname,sep = "_")
    assign(paste0(iname,"_new"),f)
}
ls()

7 weeks ago
Sam ★ 3.8k

You want to do

colnames(ReadInFile)=paste0(colnames(ReadInFile),"_AX_FLAIR-parcellationROI_",f)

though the resulting file format will be horrible and will not be easy to analyze. An easier way might be to add an additional column to your data frame indicating which file the data came from:

files <- list.files(pattern = "/Users/mostafa/Documents/Mostafa/UM/Papers/Data/TCIA/TCGA-GBM/RadiomicsData_EC/TCGA-02-0009/RadiomicsFeatures-csv/AX_FLAIR")
res <- NULL
for(f in files){
    ReadInFile <- read.csv(file=f, header=T, na.strings="NULL")
    ReadInFile$File <- f
    res <- rbind(res, ReadInFile)
}
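
For anyone who wants to try the "extra column" idea without the original data, here is a minimal sketch. The folder `demo_dir`, the two sample files, and the columns `volume` and `intensity` are made up for illustration; only the file naming follows the thread. It also passes `full.names = TRUE`, which is an assumption about what the real script needs, and stores the bare file name rather than the full path in the `File` column.

```r
# Hypothetical demo folder with two small CSVs (stand-ins for the real feature files).
demo_dir <- file.path(tempdir(), "rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(volume = c(1.5, 2.5), intensity = c(10, 20)),
          file.path(demo_dir, "AX_FLAIR-parcellationROI_1.csv"), row.names = FALSE)
write.csv(data.frame(volume = 3.5, intensity = 30),
          file.path(demo_dir, "AX_FLAIR-parcellationROI_2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv() can open from any working directory.
files <- list.files(path = demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read one file and tag its rows with the file name (folder and extension removed).
read_one <- function(f) {
  d <- read.csv(f, header = TRUE, na.strings = "NULL")
  d$File <- tools::file_path_sans_ext(basename(f))
  d
}

# Stack all files into one long data frame; the columns must match across files.
res <- do.call(rbind, lapply(files, read_one))
print(res)
```

Because every file has the same columns, the stacked table keeps one row per original row, and the `File` column can later be used to split or group the data.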


Thanks, Sam. I see your concerns. I know it might be horrible, but this is the one approach I have in mind for building a data matrix of extracted features. For now, I am trying to see whether it is doable. If not, I will drop it and think about another approach. Regarding the correction you sent, I tried it, but I got the following error:

Error in file(file, "rt") : cannot open the connection


How should I fix that? Thanks for your help.


Try doing

files <- list.files(path = "/Users/mostafa/Documents/Mostafa/UM/Papers/Data/TCIA/TCGA-GBM/RadiomicsData_EC/TCGA-02-0009/RadiomicsFeatures-csv", pattern="AX_FLAIR", full.names=T)
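
A small sketch of why this form is likely to help: `pattern` is matched against file names only, while `path` says where to look, and `full.names = TRUE` makes `list.files()` return paths that can be opened regardless of the working directory. The folder `demo_dir` and the file `AX_FLAIR_1.csv` below are hypothetical stand-ins, not the poster's data.

```r
# Hypothetical demo folder containing a single CSV.
demo_dir <- file.path(tempdir(), "path_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(demo_dir, "AX_FLAIR_1.csv"), row.names = FALSE)

# Without full.names, only the bare file names come back.
bare_names <- list.files(path = demo_dir, pattern = "AX_FLAIR")
# With full.names = TRUE, the folder is prepended to each name.
full_paths <- list.files(path = demo_dir, pattern = "AX_FLAIR", full.names = TRUE)

print(bare_names)
# Bare names are resolved against getwd(), so this depends on the working directory.
print(file.exists(bare_names))
# Full paths point at the files themselves.
print(file.exists(full_paths))

d <- read.csv(full_paths[1], header = TRUE)
```

If `read.csv()` is given a bare name and the working directory is not the data folder, the file cannot be found, which is one common cause of the "cannot open the connection" error.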


Thanks, Sam!! I changed your command a little bit and it worked perfectly.

files2 <- list.files(path = "/Users/mostafa/Documents/Mostafa/UM/Papers/Data/TCIA/TCGA-GBM/RadiomicsData_EC/TCGA-02-0009/RadiomicsFeatures-csv/AX_FLAIR_Copy", pattern=".csv", full.names=T)
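
One thing to watch with `full.names=T`: the loop variable then holds the whole path, so a suffix built from it would include the folder names too. A possible variation, sketched here on made-up data (the folder `demo_dir` and the columns `volume` and `intensity` are assumptions), strips the folder with `basename()` and keeps the tables in a named list instead of calling `assign()`.

```r
# Hypothetical demo folder with one CSV named like the files in the question.
demo_dir <- file.path(tempdir(), "colname_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(volume = 1:2, intensity = 3:4),
          file.path(demo_dir, "AX_FLAIR-parcellationROI_27.csv"), row.names = FALSE)

files <- list.files(path = demo_dir, pattern = "\\.csv$", full.names = TRUE)

tables <- list()
for (f in files) {
  # basename() drops the folder; sub() drops the ".csv" extension.
  iname <- sub("\\.csv$", "", basename(f))
  d <- read.csv(f, header = TRUE)
  # Append the file name to every column name.
  colnames(d) <- paste(colnames(d), iname, sep = "_")
  # Store the table under the file name instead of creating a global object.
  tables[[iname]] <- d
}

print(colnames(tables[["AX_FLAIR-parcellationROI_27"]]))
```

A named list keeps all 27 tables in one place, so they can be processed together with `lapply()` rather than looked up one by one through `ls()`.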