Question: Kallisto/Sleuth - issue when using own annotations in the sleuth_prep() function
gravatar for BioBing
3.4 years ago by
BioBing130 wrote:

Hi all,

Hoping that some of you are familiar with Kallisto/Sleuth and might be able to help me out.

I am analyzing transcripts from a non-model species with triplicates of a control condition and two different treatments.

For annotation, I have used Diamond (none of the Ensembl genomes are close to my species) and now want to lift my analysis from transcript - to gene level.

The analysis of the transcripts alone resulted in 500 (differentially expressed transcripts (DETs) for the one treatment and 832 DETs for the other in comparison with the control.

But when I try to "lift" it to gene level, I all of a sudden get 3078 differentially expressed genes (DEGs) for the one treatment and 2153 DEGs for the second. This does not seem right since I expected repeated transcripts to "collapse" and result in lower numbers.

Do any of you know what I am doing wrong?

Thank you!

Best wishes, Birgitte

The code for DEG determination:

# The dataframe containing Trinity transcript IDs (target_id), acession_numbers (annotation_id), and gene names (gene) was being loaded into R

#set base directory
base_dir <- "~/Documents/RNASeq/Tools/kallisto/data/"

# loading the meta-data
s2c <- read.table(file.path(base_dir, "metadata.txt"), header = TRUE, stringsAsFactors=FALSE)

# renaming the annotation df
t2g <- dplyr::rename(AnnotationID,
                     target_id = target_id,
                     ens_gene = annotation_id,
                     ext_gene = gene)

# running the sleuth analysis
so <- sleuth_prep(s2c, ~treat, target_mapping = t2g, aggregation_column = 'ens_gene')
so <- sleuth_fit(so)
so <- sleuth_fit(so, ~1, "reduced")
so <- sleuth_lrt(so, 'reduced', 'full')

results_table_lrt <- sleuth_results(so, 'reduced:full', test_type = 'lrt')

so <- sleuth_wt(so, 'treatTreat1')
so <- sleuth_wt(so, 'treatTreat2')
results_table_wt_treat1 <- sleuth_results(so, 'treatTreat1')
results_table_wt_treat2 <- sleuth_results(so, 'treatTreat2')

Treat1 <- subset(results_table_wt_treat1, qval < 0.05)
Treat2 <- subset(results_table_wt_treat2, qval < 0.05)
rna-seq kallisto sleuth R • 1.7k views
ADD COMMENTlink written 3.4 years ago by BioBing130

This does not seem right since I expected repeated transcripts to "collapse" and result in lower numbers.

I don't think this is necessarily the case. What also happens by collapsing is boosting the read count for the genes (the sum of the transcripts) and therefore increasing also the power to detect differential expression. It's perfectly possible that for lowly expressed transcripts the counts are insufficient, but when collapsing to the gene level you reach significance.

See also Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences

ADD REPLYlink written 3.3 years ago by WouterDeCoster44k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1017 users visited in the last hour