Question: Merging de-novo transcriptome assembly from multiple datasets
I have a new dataset of Illumina RNA-sequencing from an organism without a known genome. I have two more older datasets from the same organism (different protocol and different machine). I want to build a de-novo transcriptome assembly based on all the data (I plan to use Trinity). The question is whether to pool all the data together or run the assembly independently for each dataset and then combine the results in some way.

Does anyone have experience with this?



What type of data do you have? Do you plan to do differential gene expression (DGE) analysis? Trinity recommends building a single assembly from the combined data. You can then specify replicates/samples downstream for the DGE analysis.
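
As a rough sketch of what the pooled run could look like: Trinity accepts a tab-delimited samples file (condition, replicate name, left reads, right reads) through `--samples_file`, so all datasets can go into one assembly while staying labelled for later DGE. The little Python helper below only builds that file's text and the command line; it does not run Trinity. The dataset labels, FASTQ file names, memory and CPU values are all made up for illustration, so adjust them to your own data.

```python
# Sketch: prepare a pooled Trinity run from several datasets.
# All dataset labels and FASTQ paths below are hypothetical placeholders.

def samples_file_text(samples):
    """Return Trinity samples-file text: one tab-separated line per library.

    Each entry is (condition, replicate_name, left_fastq, right_fastq).
    """
    lines = []
    for row in samples:
        if len(row) != 4:
            raise ValueError(f"expected 4 fields per sample, got {len(row)}: {row}")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


def trinity_command(samples_file, max_memory="50G", cpu=8, output="trinity_out"):
    """Build the argument list for a single assembly over all listed libraries."""
    return [
        "Trinity",
        "--seqType", "fq",
        "--samples_file", str(samples_file),
        "--max_memory", max_memory,   # assumed value, depends on your machine
        "--CPU", str(cpu),            # assumed value
        "--output", output,           # Trinity expects "trinity" in this name
    ]


# Hypothetical layout: one new dataset and two older ones, pooled together.
datasets = [
    ("new",  "new_rep1",  "new_R1.fastq.gz",  "new_R2.fastq.gz"),
    ("old1", "old1_rep1", "old1_R1.fastq.gz", "old1_R2.fastq.gz"),
    ("old2", "old2_rep1", "old2_R1.fastq.gz", "old2_R2.fastq.gz"),
]

if __name__ == "__main__":
    print(samples_file_text(datasets), end="")
    print(" ".join(trinity_command("samples.txt")))
```

Write the printed table to `samples.txt` and run the printed command from the directory holding the reads. Keeping each dataset as its own "condition" is just one way to label them; it lets you check later whether the older libraries behave differently from the new one.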

I have paired-end Illumina sequencing. The two datasets are from the same organism, although not necessarily from the same isolate. They differ in read length (125 or 100) and in the Illumina machine model.

