Hello all,
I have cancer genomic data (tumor/normal whole-exome sequencing) from 50 patients who received the same type of treatment, half of whom responded. The data come as 50 .maf files plus a supplemental file that contains, among other fields, the Response field (Responder vs Non-Responder). My question is how to aggregate all of this so that I can perform a statistical test on the data. I have a clever way of reading in the 50 .maf files and combining them, but I wonder whether it is an appropriate approach:
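library(magrittr)  # provides the %>% pipe
# Per-sample metadata, including the Response field
sample_info <- readr::read_tsv(file = "path/to/sample_info/sample-information.tsv")
# Read every .maf file and row-bind them into a single data frame
maf_files <- fs::dir_ls("path/to/mafs/")
patients_data <- maf_files %>%
  purrr::map_dfr(readr::read_tsv, col_types = list(Chromosome = readr::col_character()))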
My thought then was to just dplyr::left_join() the sample_info with patients_data, like:
patients_data_final <- patients_data %>% dplyr::left_join(sample_info, by = c("Tumor_Sample_Barcode", "Matched_Norm_Sample_Barcode"))
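Before trusting a join like this, a quick sanity check might help. Below is a minimal sketch on made-up toy data (the barcodes "T1", "N1", etc. and the toy_* names are hypothetical, not from my files). It checks that every tumor/normal barcode pair in the MAF table has a match in the sample table, and that the join does not change the number of rows:

```r
library(dplyr)

# Toy stand-ins for the real tables (all values are made up for illustration)
toy_maf <- data.frame(
  Hugo_Symbol = c("TP53", "KRAS", "TP53"),
  Tumor_Sample_Barcode = c("T1", "T1", "T2"),
  Matched_Norm_Sample_Barcode = c("N1", "N1", "N2"),
  stringsAsFactors = FALSE
)
toy_info <- data.frame(
  Tumor_Sample_Barcode = c("T1", "T2"),
  Matched_Norm_Sample_Barcode = c("N1", "N2"),
  Response = c("Responder", "Non-Responder"),
  stringsAsFactors = FALSE
)

join_keys <- c("Tumor_Sample_Barcode", "Matched_Norm_Sample_Barcode")

# Rows of the MAF table with no matching row in the sample table (ideally zero)
unmatched <- dplyr::anti_join(toy_maf, toy_info, by = join_keys)

# If the keys are unique in the sample table, a left join keeps the row count unchanged
toy_joined <- dplyr::left_join(toy_maf, toy_info, by = join_keys)

n_unmatched <- nrow(unmatched)
n_missing_response <- sum(is.na(toy_joined$Response))
same_row_count <- nrow(toy_joined) == nrow(toy_maf)
```

If n_unmatched is greater than zero or same_row_count is FALSE on the real data, that would suggest mismatched barcodes or duplicated keys in the sample table.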
For clarity, here are the column names of both data frames:
> colnames(patients_data)
[1] "Hugo_Symbol" "Chromosome" "Start_position"
[4] "End_position" "Variant_Classification" "Variant_Type"
[7] "Reference_Allele" "Tumor_Seq_Allele1" "Tumor_Seq_Allele2"
[10] "Tumor_Sample_Barcode" "Matched_Norm_Sample_Barcode" "Protein_Change"
[13] "t_alt_count" "t_ref_count"
> colnames(sample_info)
[1] "Patient_ID" "Tumor_Sample_Barcode" "Matched_Norm_Sample_Barcode"
[4] "Response" "Silent_mutations_per_Mb" "Nonsynonymous_mutations_per_Mb"
[7] "Mutations_per_Mb"
My task is to find out "whether there are any specific mutations that are observed more in responders vs non-responders." So as a supplemental question, if anyone has suggestions on which statistical test to use (or how to go about deciding), I'd appreciate that as well.
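To make the question concrete, here is a rough sketch of one approach I have been considering: collapse the calls to "is this gene mutated in this patient, yes or no", run Fisher's exact test per gene on the 2x2 table of mutated status vs Response, and adjust the p-values for multiple testing (Benjamini-Hochberg). Whether this is the right test is exactly what I am unsure about. The data below are made up, and test_gene, GENE_A, and GENE_B are hypothetical names used only for illustration:

```r
# Toy data: one row per mutation call (values are made up for illustration)
toy_calls <- data.frame(
  Tumor_Sample_Barcode = c("T1", "T2", "T3", "T1", "T4", "T5"),
  Hugo_Symbol = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_A"),
  stringsAsFactors = FALSE
)
# Toy sample table: T1-T4 are responders, T5-T8 are non-responders
toy_samples <- data.frame(
  Tumor_Sample_Barcode = paste0("T", 1:8),
  Response = rep(c("Responder", "Non-Responder"), each = 4),
  stringsAsFactors = FALSE
)

# Fisher's exact test for one gene: mutated (yes/no) vs Response
test_gene <- function(gene, calls, samples) {
  mutated_samples <- unique(calls$Tumor_Sample_Barcode[calls$Hugo_Symbol == gene])
  # Fix the factor levels so the 2x2 table always has both rows and both columns
  mutated <- factor(samples$Tumor_Sample_Barcode %in% mutated_samples,
                    levels = c(TRUE, FALSE))
  response <- factor(samples$Response, levels = c("Responder", "Non-Responder"))
  tab <- table(mutated, response)
  data.frame(
    Hugo_Symbol = gene,
    n_mut_responders = tab["TRUE", "Responder"],
    n_mut_nonresponders = tab["TRUE", "Non-Responder"],
    p_value = fisher.test(tab)$p.value,
    stringsAsFactors = FALSE
  )
}

genes <- sort(unique(toy_calls$Hugo_Symbol))
results <- do.call(rbind, lapply(genes, test_gene,
                                 calls = toy_calls, samples = toy_samples))

# Adjust for testing many genes at once (Benjamini-Hochberg)
results$p_adjusted <- p.adjust(results$p_value, method = "BH")
```

The same idea could be applied per exact variant (for example, grouping by Hugo_Symbol plus Protein_Change) instead of per gene, though with 50 patients most individual variants would probably be too rare to test.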
PS: I am aware of the maftools package, which probably has an easy solution to this, but unfortunately my machine is old (Late 2011 MacBook Pro) and unable to run it. (Old Mac --> Can't update OS --> Can't update version of R --> Can't install necessary packages to run maftools)