Hi All,
I have RNA-seq counts that I want to use for a differential expression analysis. However, approximately 1,500 gene symbols are duplicated, with a separate row of counts for each transcript (geneID). Can I simply collapse those rows by summing the counts, so that each geneSymbol appears only once?
For example:
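library(dplyr)
counts <- counts %>%
  # drop the annotation columns that should not be summed
  select(-c("geneID", "bioType", "annotationLevel")) %>%
  group_by(geneSymbol) %>%
  # sum the counts over all rows sharing a gene symbol
  # (lambda form; passing na.rm through across() is deprecated since dplyr 1.1.0)
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))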
Thanks for your advice.