My favorite plant species has an 800 Mb genome and a transcriptome of around 100-150 Mb. There are only 15 genome assemblies, covering 15 cultivars, and several de novo RNA assemblies. My cultivar has no existing genome or transcriptome assembly. Published papers usually generate 50 to 100 million reads per sample for de novo RNA assembly. I have a limited budget.
So should I go shallow, generate 20-30 million reads per sample, and map them to the existing genome or transcriptome assemblies of other cultivars? This way, I can sequence more samples.
Or should I go deep, generate more than 50 million reads per sample, and assemble my own transcriptome? This way, I can sequence only a few samples.
We are doing pilot experiments on a limited budget. My collaborator wants to run no replicates so that we can include more treatments. Is that a big no-no? I personally prefer at least 3 biological replicates per treatment.
Thank you so much for your kind help!
What is the question you want to answer?
Just differential gene expression analysis.
You need at least two replicates per treatment for DEG analysis. For a pilot, it is probably fine if they aren't strictly biological replicates.
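To illustrate why a single sample per treatment is a problem, here is a very simplified sketch: within-group variability cannot be estimated from one value. The counts and the function name are made up for illustration, and real DEG tools use more elaborate count models than a plain sample variance.

```python
import statistics

def within_group_variance(counts):
    """Sample variance of one gene's counts within a treatment group.

    Returns None when there are fewer than two replicates, because the
    sample variance divides by (n - 1) and is undefined for n = 1.
    """
    if len(counts) < 2:
        return None
    return statistics.variance(counts)

# Hypothetical counts for one gene, not real data.
no_replicate = [100]
two_replicates = [100, 140]

print("1 sample :", within_group_variance(no_replicate))
print("2 samples:", within_group_variance(two_replicates))
```

Without a variance estimate there is no way to tell a treatment effect from ordinary sample-to-sample noise.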
Greater depth also helps with DEG analysis of lowly expressed genes, but it really depends on the specific questions you are trying to answer. For example, if you want reliable DEG analysis of genes with both high and low expression, replicates with more depth are important. But if you want to do a quick experiment and see which genes are most highly expressed within each treatment (without actual differential expression comparisons), then lower depth and one replicate may be okay for the pilot.
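To make the depth point concrete, here is a rough sketch of how many reads a gene would be expected to collect at different depths. It assumes a gene's abundance is given in counts per million (CPM), so the expected count is simply depth × CPM / 1,000,000. The depths are taken from the question (25 million as the midpoint of the shallow option, 50 million as the lower bound of the deep option); the CPM values and the function name are made up for illustration.

```python
def expected_reads(depth_reads, cpm):
    """Expected number of reads for a gene with abundance `cpm`
    (counts per million) at a sequencing depth of `depth_reads`."""
    return depth_reads * cpm / 1_000_000

# Depths from the question; CPM levels are hypothetical examples.
depths = {"shallow (25M)": 25_000_000, "deep (50M)": 50_000_000}
cpm_levels = {"low": 0.2, "moderate": 1.0, "high": 50.0}

for depth_name, depth in depths.items():
    for level, cpm in cpm_levels.items():
        reads = expected_reads(depth, cpm)
        print(f"{depth_name:14s} {level:9s} CPM={cpm:5.1f} -> ~{reads:.0f} reads")
```

Highly expressed genes get plenty of reads at either depth, while the low-abundance ones end up with only a handful of reads in the shallow design, which is where extra depth matters.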
As for the reference transcriptome: if you go shallower and sequence more treatments, could you merge the reads from the different treatments to assemble a transcriptome, and then align the individual samples to it? That might be a middle ground.
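A quick back-of-the-envelope check of the pooling idea, as a sketch only. It reads the question's "50 to 100" figure as 50 to 100 million reads for a de novo assembly, and uses the 20-30 million reads per sample from the shallow option; the sample count of 12 and the function names are hypothetical. It also assumes all pooled samples come from the same cultivar.

```python
def pooled_reads(n_samples, reads_per_sample):
    """Total reads available for assembly when all samples are merged."""
    return n_samples * reads_per_sample

def samples_needed(target_reads, reads_per_sample):
    """Smallest number of samples whose pooled reads reach the target
    (integer ceiling division)."""
    return (target_reads + reads_per_sample - 1) // reads_per_sample

# Hypothetical design: 12 shallow samples at 25 million reads each.
total = pooled_reads(12, 25_000_000)
print(f"Pooled reads: {total / 1e6:.0f} million")

# How many shallow samples reach the depth range quoted in the question?
for target in (50_000_000, 100_000_000):
    for per_sample in (20_000_000, 30_000_000):
        n = samples_needed(target, per_sample)
        print(f"target {target / 1e6:.0f}M at {per_sample / 1e6:.0f}M/sample -> {n} samples")
```

Under these assumptions, a handful of shallow samples already matches the read totals used for de novo assembly, so the merged assembly does not require any extra sequencing beyond the DEG samples themselves.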