I have several questions about differential gene expression (DGE) analysis. I have searched several forums, so I hope I am not repeating a question that has already been answered many times (if I am, my apologies :P).
I have assembled a de novo transcriptome (invertebrate) made up of 120,000 transcripts/contigs/unigenes (I'm a little confused about whether these terms are equivalent). I started with more than 1,000,000 and used EvidentialGene to reduce the assembly. When planning the DGE analysis, I am concerned about transcript redundancy: several transcripts in my transcriptome actually come from the same gene. So if I run the DESeq2 pipeline directly with the Trinity scripts, I could get a lot of false positives/negatives, because reads that correspond to the same gene are split among different transcripts.
To address this, I used tximport to import my counts into DESeq2 (sorry, I forgot to mention that I used kallisto to obtain my "read counts"). With tximport I wanted to 1) make my counts fit the model assumptions and 2) summarize my transcripts at the gene level. To get to the gene level, I annotated my transcripts in order to generate the tx2gene file. Then I just followed the DESeq2 pipeline.
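To make the tx2gene step concrete, here is a minimal sketch of what I mean. It assumes the blastx results are in the default tabular format (-outfmt 6: query in column 1, subject in column 2, bitscore in column 12) and that each transcript is assigned to its highest-bitscore hit. The function names and the TXNAME/GENEID header are just my own placeholders, not part of any tool.

```python
import csv
import sys


def build_tx2gene(blast_rows):
    """Map each transcript to the subject of its best hit (highest bitscore).

    blast_rows: iterable of lists of strings, one per line of BLAST tabular
    output (-outfmt 6, default 12 columns). This is an assumed input format.
    """
    best = {}  # transcript -> (gene, bitscore)
    for row in blast_rows:
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        tx, gene, bitscore = row[0], row[1], float(row[11])
        if tx not in best or bitscore > best[tx][1]:
            best[tx] = (gene, bitscore)
    return {tx: gene for tx, (gene, _) in best.items()}


def write_tx2gene(tx2gene, handle):
    """Write a two-column, tab-separated table: transcript ID, then gene ID."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["TXNAME", "GENEID"])  # placeholder header names
    for tx in sorted(tx2gene):
        writer.writerow([tx, tx2gene[tx]])


if __name__ == "__main__":
    # Usage: python make_tx2gene.py blastx_hits.tsv > tx2gene.tsv
    with open(sys.argv[1], newline="") as fh:
        mapping = build_tx2gene(csv.reader(fh, delimiter="\t"))
    write_tx2gene(mapping, sys.stdout)
```

Transcripts without any blast hit never appear in this table, which is exactly the situation I describe below.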
In summary: Trinity (EvidentialGene) > kallisto > blastx vs custom database (9 proteomes) > generate tx2gene file > tximport > DESeq2
In my blast search I identified 40,000 of the 120,000 transcripts, which correspond to 19,500 different genes. So I have "only" identified about a third of my transcripts, and I'm not sure which is better: use all transcripts in the DGE analysis, with reads split among several transcripts (which could belong to the same gene), or restrict the DGE analysis to the subset of transcripts whose identity I already know?
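To illustrate the two options, here is a toy sketch of gene-level summing. It is only my simplified picture of the idea: as far as I understand, tximport also accounts for transcript length, which this ignores. The function name and the keep_unannotated switch are hypothetical. With keep_unannotated=True, every transcript without a gene assignment is kept as its own "gene"; with False, only the annotated subset is analysed.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx2gene, keep_unannotated=True):
    """Sum transcript-level counts per gene.

    tx_counts: dict of transcript ID -> estimated count for one sample.
    tx2gene:   dict of transcript ID -> gene ID (annotated transcripts only).
    keep_unannotated: if True, a transcript missing from tx2gene is kept
        under its own ID; if False, it is dropped from the analysis.
    """
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        gene = tx2gene.get(tx)
        if gene is None:
            if not keep_unannotated:
                continue  # option 2: restrict to annotated transcripts
            gene = tx  # option 1: unannotated transcript acts as its own gene
        gene_counts[gene] += count
    return dict(gene_counts)


if __name__ == "__main__":
    # Toy data: two transcripts annotated to the same gene, one with no hit.
    counts = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.0}
    mapping = {"tx1": "geneA", "tx2": "geneA"}
    print(summarize_to_gene(counts, mapping, keep_unannotated=True))
    print(summarize_to_gene(counts, mapping, keep_unannotated=False))
```

In the first case the reads of unannotated transcripts are still split whenever several of them come from one gene; in the second case those reads are not used at all.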
Please let me know whether this approach is viable. I have seen a lot of examples of tximport & DESeq2, but they are always based on a reference genome.
Thank you for your time and attention.
Edit: if you need more details, I can provide them.