Hi Good people,

For a good de novo transcriptome assembly, how many paired-end reads would I need? FYI, the genome size of our organism is 800 Mb.

Thank you so much for your input.

Best of luck.

@Mensur has given you an answer about the total amount of sequencing you can expect to do. What matters just as much is making libraries from different life-cycle stages, organs, etc., so that you get comprehensive coverage of the expressed complement of your organism.

I don't think anyone can give a reliable answer based on the information you provided. That's because the answer depends on whether you already have a genome/proteome available, on your final research goals, and on the coding density.

But let's say that the coding density is 10% (80 Mb) and you expect half of the total coding sequence to be transcribed at any given time (40 Mb). If you want 100x coverage and your reads are 150 bp long, you would need:

40,000,000 * (100 / 150) = ~ 26.7 million reads
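
If you want to play with the assumptions, here is a minimal Python sketch of the same back-of-the-envelope estimate. The function name and parameter names are my own, and the 10% coding density, 50% expressed fraction and 100x coverage are just the guesses from above, not measured values for your organism.

```python
import math


def reads_needed(genome_size_bp, coding_density, expressed_fraction,
                 coverage, read_length_bp):
    """Rough number of single reads needed to cover the expressed sequence.

    All inputs are assumptions supplied by the caller; this is only the
    simple (target size * coverage / read length) estimate.
    """
    # Bases we expect to be transcribed and therefore want to cover
    target_bp = genome_size_bp * coding_density * expressed_fraction
    # Total bases to sequence, divided by the bases delivered per read
    return math.ceil(target_bp * coverage / read_length_bp)


n_reads = reads_needed(
    genome_size_bp=800_000_000,  # 800 Mb genome
    coding_density=0.10,         # assumed: 10% of the genome is coding
    expressed_fraction=0.5,      # assumed: half of that is transcribed
    coverage=100,                # desired fold coverage
    read_length_bp=150,          # read length in bp
)
print(f"{n_reads / 1e6:.1f} million reads")
```

Note that this counts individual reads and ignores how unevenly transcripts are expressed, so treat the result as a ballpark figure only.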

Oops, just realized that you addressed this to Good people. Maybe I shouldn't have answered :D