                    3.0 years ago
        Rob
        
    
Hi friends,
I am using the ...workflow to download and merge the TCGA-KIRC RNA-seq data:
https://brh.data-commons.org/dashboard/Public/notebooks/GDC_TCGA-CHOL_RNA_analysis_BRH_040722.html
However, the merging part of the code throws an error.
Here is the chunk of code:
# Define the function to merge all RNAseq quantification files into one data frame
merge_rna <-function(metadata, fdir){
    filelist <- list.files(fdir, pattern="*.tsv$", 
                        recursive = TRUE, full.names=TRUE)
    for (i in 1:length(filelist)){
        iname <- basename(filelist[i])
        isamplename <- metadata[metadata$file_name==iname, "sample"]
        idf <- read.csv(filelist[i], sep="\t", skip=1, header=TRUE)
        # remove first 4 rows
        remove <- 1:4
        idf_subset <- idf[-remove, c("gene_id","unstranded")]
        rm(idf)
        names(idf_subset)[2] <- isamplename
        #print(dim(idf_subset))
        if (i==1){
            combined_df <- idf_subset
            rm(idf_subset)
        } else {
            combined_df <- merge(combined_df, idf_subset, by.x='gene_id', by.y="gene_id", all=TRUE)
            rm(idf_subset)
        }
    }
    # remove certain gene ids
    combined_df <- combined_df[!(grepl("PAR_Y", combined_df$gene_id, fixed=TRUE)),]
    # modify gene_id
    combined_df$gene_id <- sapply(strsplit(combined_df$gene_id,"\\."), `[`, 1)
    # use gene_id as row names and remove gene_id column
    rownames(combined_df) <- combined_df$gene_id
    combined_df <- combined_df[,-which(names(combined_df) %in% c("gene_id"))]
    return(combined_df)
}
rnaCounts <- merge_rna(metaMatrix.RNA, "TCGA-KIRC/RNAseq") # I get the error here
rnaCounts[1:5,]
The error is:
Error in names(idf_subset)[2] <- isamplename : 
  replacement has length zero
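For reference, the same message can be reproduced with a small toy example. It suggests, though I have not confirmed it on the real data, that the metadata lookup returns nothing for at least one file. The data frame, file names, and sample IDs below are made up for illustration and are not from the actual TCGA-KIRC download:

```r
# Toy metadata table; the file names and sample IDs are hypothetical.
metadata <- data.frame(
  file_name = c("a.tsv", "b.tsv"),
  sample    = c("S1", "S2"),
  stringsAsFactors = FALSE
)

# A file that exists on disk but has no matching row in the metadata table.
iname <- "c.tsv"
isamplename <- metadata[metadata$file_name == iname, "sample"]
print(length(isamplename))  # the lookup found no row, so this vector is empty

# Stand-in for one quantification table (hypothetical values).
idf_subset <- data.frame(gene_id = "ENSG00000000003.15", unstranded = 10)

# Assigning an empty value to a single element of names() raises the error;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(
  {
    names(idf_subset)[2] <- isamplename
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happens in my case, then something like `setdiff(basename(filelist), metadata$file_name)` should list the files that have no entry in the metadata.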
It would be very helpful if anyone could help me fix this error and merge the data.
Thanks