Question

When comparing gene expression in different species, should I assemble separate transcriptomes for each species?

1

2.9 years ago

olp123 ▴ 20

Dear all,

I want to compare gene expression across 5 different bird species. For each species, I have RNA-Seq reads from 5 to 10 individuals (single-end Illumina).

My question is: what would be the best way to conduct the DGE analysis?

Should I assemble one transcriptome with all the fastq-files from all species or one transcriptome for each of the species separately?

I am afraid that if I create one big transcriptome, its quality might suffer because not all of the species are very closely related.

Thank you very much,
Olaf

trinity differential-expression rna-seq • 904 views

updated 15 months ago by Ram 43k • written 2.9 years ago by olp123 ▴ 20

0

Hi Olaf

I am also working on cross-species transcriptomics for my PhD. How is it going? Did you figure out the pipeline? I don't have genomes, so first I made de novo assemblies using Trinity and determined the expression level of each transcript. Currently I am working on defining single-copy orthologues between different species pairs using OrthoFinder. I guess that should be the starting point.
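For anyone following the same route, the single-copy filtering step can be sketched in a few lines of Python. This is only a rough illustration: it assumes a tab-separated orthogroup table with an `Orthogroup` column followed by one column per species, with genes in a cell separated by commas. The function name `single_copy_orthogroups` and the example IDs are made up for the sketch.

```python
import csv
import io


def single_copy_orthogroups(tsv_text):
    """Return {orthogroup: {species: gene}} for groups with exactly one gene per species.

    Assumes a tab-separated table: first column is the orthogroup ID, each
    remaining column is one species, and genes within a cell are comma-separated.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    species = header[1:]
    result = {}
    for row in reader:
        if not row:
            continue
        group, cells = row[0], row[1:]
        # Pad rows that are missing trailing (empty) species cells.
        cells += [""] * (len(species) - len(cells))
        genes = [[g.strip() for g in cell.split(",") if g.strip()] for cell in cells]
        # Keep the group only if every species contributes exactly one gene.
        if all(len(g) == 1 for g in genes):
            result[group] = {sp: g[0] for sp, g in zip(species, genes)}
    return result


# Hypothetical example table with two species.
example = (
    "Orthogroup\tsp1\tsp2\n"
    "OG0\tg1\tg2\n"        # one gene each: kept
    "OG1\tg3, g4\tg5\n"    # duplicated in sp1: dropped
    "OG2\tg6\t\n"          # missing in sp2: dropped
)
print(single_copy_orthogroups(example))
```

Only the groups that survive this filter give a one-to-one gene mapping across species, which is what a cross-species comparison needs as its starting table.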

Also, there is a new normalisation method called SCBN that is useful for this kind of analysis; I plan to test it on my data. https://pubmed.ncbi.nlm.nih.gov/30925894/

Best,
Lada

updated 15 months ago by Ram 43k • written 15 months ago by Lada ▴ 30

score 1 · Answer 1 · 2021-06-15

You've got something of a dilemma here.

Almost all common DGE approaches rely on the assumption that any gene-specific factors influencing counts are the same between samples. This is not some minor assumption that is nice to have fulfilled but that the tests are actually robust to violating; it is at the heart of the DGE method.
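A toy calculation shows why this matters. The model below is a deliberate simplification and purely an assumption for illustration: the expected count is taken to be proportional to expression times transcript length times sequencing depth, with transcript length standing in for any gene-specific factor. The function names are hypothetical.

```python
def expected_count(expression, length_bp, depth_scale=1.0):
    """Toy model: expected count is proportional to expression x length x depth."""
    return expression * length_bp * depth_scale


def fold_change(count_a, count_b):
    """Naive fold change between two expected counts."""
    return count_a / count_b


# Within one species: same gene, same assembled length (2000 bp) in both samples.
# The length factor cancels, so the fold change reflects expression only.
within = fold_change(expected_count(20, 2000), expected_count(10, 2000))

# Across species: identical expression (10), but the orthologue assembled to
# 2000 bp in species 1 and only 1000 bp in species 2.
# The length factor no longer cancels and shows up as an apparent difference.
across = fold_change(expected_count(10, 2000), expected_count(10, 1000))

print(within)  # 2.0, a real two-fold expression difference
print(across)  # 2.0, although true expression is identical
```

Both comparisons return the same number, but only the first reflects biology. The second is entirely due to the gene-specific factor differing between the two assemblies.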

If you assemble 5 different transcriptomes from the 5 species, then you will not be comparing like with like when you compare gene A from species 1 with gene A from species 2.

If you assemble a grand transcriptome from all the species together, the best case is that highly divergent genes from different species assemble into separate contigs. But then you wouldn't be able to compare them.

My suggestion is that you assemble separate transcriptomes, then align them against each other, identify the regions that are common to all species, and quantify against these.
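As a rough sketch of the "regions common to all" step, assume you already have a multiple alignment of one orthologous transcript across species, with gaps written as `-`. The hypothetical helper below returns, for each ungapped block shared by every species, the coordinates in each species' own transcript; reads could then be counted over just those intervals. The sequences and the name `shared_blocks` are invented for the example.

```python
def shared_blocks(aligned, min_len=1):
    """Return per-species coordinates of alignment blocks with no gap in any species.

    `aligned` maps species name -> gapped sequence (all of equal aligned length).
    Coordinates are 0-based, half-open, in each species' own ungapped sequence.
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("all aligned sequences must have the same length")

    pos = {n: 0 for n in names}  # next ungapped position in each species
    start = None                 # per-species start of the currently open block
    blocks = []

    for col in range(length):
        all_present = all(aligned[n][col] != "-" for n in names)
        if all_present and start is None:
            start = dict(pos)    # open a new shared block
        elif not all_present and start is not None:
            blocks.append({n: (start[n], pos[n]) for n in names})
            start = None         # a gap in any species closes the block
        for n in names:
            if aligned[n][col] != "-":
                pos[n] += 1
    if start is not None:
        blocks.append({n: (start[n], pos[n]) for n in names})

    # Every species has the same block length, so check it on the first one.
    first = names[0]
    return [b for b in blocks if b[first][1] - b[first][0] >= min_len]


# Hypothetical three-species alignment of one orthologue.
alignment = {
    "sp1": "ATG--CCGTA",
    "sp2": "ATGAACC-TA",
    "sp3": "-TGAACCGTA",
}
for block in shared_blocks(alignment):
    print(block)
```

In practice you would want a sensible `min_len` so that very short shared stretches are not used for quantification. The point of restricting to these intervals is that the sequence being counted against then has the same length in every species.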