I am a beginner in transcriptomics and would really appreciate help with using RSEM for differential gene expression analysis. I sequenced my samples on the SOLiD 5500 platform, then used the SATRAP de novo assembler to assemble the color-space FASTA reads into contigs in FASTA format.

My question about RSEM is this: the contigs come from a single-end (fragment) library and have variable lengths, so I do not know what values to assign to the fragment length distribution parameters (--fragment-length-mean and --fragment-length-sd).

The CD-HIT-EST stats file for the redundant contigs reports the following sequence length distribution:

Sequence type: DNA
No. sequences: 12876
Longest sequence: 932
Shortest sequence: 100
Average length: 334
Total letters: 4305560
Total N letters: 6844
Total non N: 4298716
Sequences with N: 583
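As a sanity check on the numbers above, here is a minimal Python sketch that recomputes the average contig length and the non-N letter count from the reported totals. The variable names are my own, and the values are simply copied from the stats file. It only checks the contig-length summary; whether these contig lengths relate to the RSEM fragment length parameters at all is exactly what I am unsure about.

```python
# Values copied by hand from the CD-HIT-EST stats listed above.
num_sequences = 12876
total_letters = 4305560
total_n_letters = 6844
reported_non_n = 4298716
reported_average = 334

# Mean contig length = total bases / number of contigs.
mean_length = total_letters / num_sequences

# Non-N letters = all letters minus ambiguous (N) letters.
non_n = total_letters - total_n_letters

print(f"mean contig length: {mean_length:.2f}")
print(f"non-N letters: {non_n}")
print(f"matches reported average: {round(mean_length) == reported_average}")
print(f"matches reported non-N total: {non_n == reported_non_n}")
```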
I would be very grateful for your advice and support. Thank you.