Hoping that some of you are familiar with Kallisto/Sleuth and might be able to help me out.
I am analyzing transcripts from a non-model species with triplicates of a control condition and two different treatments.
For annotation, I have used Diamond (none of the Ensembl genomes are close to my species) and now want to lift my analysis from transcript - to gene level.
The analysis of the transcripts alone resulted in 500 (differentially expressed transcripts (DETs) for the one treatment and 832 DETs for the other in comparison with the control.
But when I try to "lift" it to gene level, I all of a sudden get 3078 differentially expressed genes (DEGs) for the one treatment and 2153 DEGs for the second. This does not seem right since I expected repeated transcripts to "collapse" and result in lower numbers.
Do any of you know what I am doing wrong?
Best wishes, Birgitte
The code for DEG determination:
# The dataframe containing Trinity transcript IDs (target_id), acession_numbers (annotation_id), and gene names (gene) was being loaded into R #set base directory base_dir <- "~/Documents/RNASeq/Tools/kallisto/data/" # loading the meta-data s2c <- read.table(file.path(base_dir, "metadata.txt"), header = TRUE, stringsAsFactors=FALSE) # renaming the annotation df t2g <- dplyr::rename(AnnotationID, target_id = target_id, ens_gene = annotation_id, ext_gene = gene) # running the sleuth analysis so <- sleuth_prep(s2c, ~treat, target_mapping = t2g, aggregation_column = 'ens_gene') so <- sleuth_fit(so) so <- sleuth_fit(so, ~1, "reduced") so <- sleuth_lrt(so, 'reduced', 'full') results_table_lrt <- sleuth_results(so, 'reduced:full', test_type = 'lrt') so <- sleuth_wt(so, 'treatTreat1') so <- sleuth_wt(so, 'treatTreat2') results_table_wt_treat1 <- sleuth_results(so, 'treatTreat1') results_table_wt_treat2 <- sleuth_results(so, 'treatTreat2') Treat1 <- subset(results_table_wt_treat1, qval < 0.05) Treat2 <- subset(results_table_wt_treat2, qval < 0.05)