Hello everyone, I'm a Master's student from Spain and this is my very first project with bioinformatics and RNA-Seq, so please excuse me if my questions are too basic or not clearly explained.
I have two files containing 100 bp paired-end reads from Illumina RNA-seq and I want to assemble them into a de novo transcriptome using Trinity. Up to here everything is OK, but I have some doubts about how the two files (containing the F and the R reads) are combined to obtain a “consensus” sequence for the k-mer dictionary construction and the downstream processes (actually, I don’t really know whether such a “consensus” sequence is formed at all when the Inchworm step of the assembly runs).
My main doubt is whether the 100 bp F and R reads need to be the same length. I ask because the first 9-10 bases of each read have poor per-base sequence quality, and if I trim them I don’t know whether it will be a disaster (because I don’t know whether Trinity aligns the F and R reads, or whether it just converts the R reads to their reverse complement and obtains the k-mers from the F reads and the converted R reads independently).
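To make the second possibility concrete, here is a minimal Python sketch of what I imagine. It is only my assumption of the idea, not Trinity's actual code: the function names and the tiny k-mer size are made up for illustration. In this picture each read contributes its k-mers on its own, so an F read and an R read of different lengths would not be a problem.

```python
from collections import Counter

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return every overlapping k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_kmers(forward_reads, reverse_reads, k):
    """Hypothetical k-mer dictionary: count k-mers from the F reads and
    from the reverse-complemented R reads, each read handled independently."""
    counts = Counter()
    for read in forward_reads:
        counts.update(kmers(read, k))
    for read in reverse_reads:
        counts.update(kmers(reverse_complement(read), k))
    return counts


if __name__ == "__main__":
    # Toy pair: the F read is 6 bases, the (trimmed) R read only 4.
    counts = count_kmers(["ACGTAC"], ["GTAC"], k=3)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
```

If this picture is right, trimming would only change how many k-mers each read contributes; if Trinity instead aligns or merges each F/R pair, the answer could be different, and that is exactly what I would like to confirm.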
I know it’s a bit messy but I will be very grateful if anyone can help me.