I'm using Trinity to assemble RNA-seq data (2 x 76 bp) from a non-model species with no available genome. I'm studying tumor samples: I have three control samples (healthy cells) and nine tumor samples. To perform differential expression analysis after the assembly (with DESeq or edgeR, for example), I have to co-assemble all the samples at once, i.e. concatenate all the FASTQ files into one big FASTQ file and run Trinity on it.
Is that right, even if the tumor cells can have a totally different transcriptome (genetic modifications, etc.)?
Thanks a lot for your advice.
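For reference, the concatenation step described above could look roughly like the sketch below. It is only an illustration: the function name `concatenate_fastq` and the file names in the usage note are hypothetical, not taken from the post. For paired-end data, the left and right files have to be concatenated in the same sample order so that mates stay in sync.

```python
import shutil
from pathlib import Path


def concatenate_fastq(inputs, output):
    """Append each input FASTQ file to `output`, in the order given.

    Files are copied as raw bytes, so no FASTQ parsing or validation is done.
    (Hypothetical helper for illustration only.)
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(output)
```

With hypothetical file names, one call would handle the left reads, e.g. `concatenate_fastq(["ctrl1_R1.fastq", "tumor1_R1.fastq"], "all_R1.fastq")`, and a second call would handle the matching `_R2` files in the same order.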
When you combine samples, do you suggest limiting the number of reads for each sample? I read one of Matthew MacManes' papers a while back suggesting that 40 million reads is a good number for Trinity assemblies of most metazoans.
However, I have 28 samples across 3 treatments that I want to use for DESeq, so if I follow the advice above, that's 28 x 40 million reads = about 1.1 billion reads going into Trinity. How would you suggest combining the samples in a situation like this?
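To make the arithmetic concrete, here is a small sketch of the numbers involved. It assumes, purely for illustration, that the pooled input were capped at a total budget of 40 million reads split evenly across samples; that cap and the function name `reads_per_sample` are assumptions, not a recommendation from the thread.

```python
def reads_per_sample(total_budget, n_samples):
    """Even share of a total read budget per sample (integer division)."""
    return total_budget // n_samples


n_samples = 28
reads_each = 40_000_000

# Pooled input if every sample contributes 40 million reads.
pooled = n_samples * reads_each  # 1_120_000_000

# Per-sample share if the pooled input were capped at 40 million reads.
share = reads_per_sample(40_000_000, n_samples)  # 1_428_571

# Fraction of each sample that would be kept under that cap.
keep_fraction = share / reads_each

print(pooled, share, round(keep_fraction, 4))
```

Under that assumed cap, each sample would contribute roughly 1.4 million reads, or about 3.6% of its original depth.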