Hello,
I am trying to pre-process paired-end reads for immunoglobulin repertoire analysis.
I have R1 and R2 FASTQ files from a 2x250bp MiSeq run. I am not sure whether I am supposed to trim indexed adapters before merging, which tool is best to use for merging, or what fraction of sequences I should expect to merge successfully. I have already attempted trimming adapters and then joining, as well as joining directly, but in either case fewer than 50% of my sequences form pairs.
Does anyone have a general outline of how to process paired-end reads? What steps should be done before using sequences for alignment to a reference database? How many sequences are typically lost at the merge step?
Thank you in advance.
This is the key sentence of your question for getting the right answer. What kind of alignment? If you want to use something like BWA or Bowtie, quality trimming followed by alignment will be sufficient. These tools can handle paired-end data.
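To make "quality trim" concrete, here is a minimal Python sketch of 3' end trimming. It assumes Phred+33 quality encoding and an arbitrary cutoff of Q20. The function name `trim_3prime` and its parameters are made up for illustration, and a dedicated trimming tool would normally be used on real data.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below min_q.

    Assumes Phred+33 encoded quality strings (an assumption, check your data).
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(trim_3prime("ACGTAC", "IIII##"))
```

This only trims from the 3' end, where read quality usually degrades first. It does not remove adapters or filter reads by length.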
Yes, that is my question. I need to merge the reads to assess the entire target gene region.
Again, it depends on the type of alignment: if you use BWA, you don't need to merge. If you only want to BLAST, for now I would suggest using only the forward reads, or a combination of merged reads and forward reads.
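One possible way to build the "merged reads plus forward reads" set is sketched below in Python: keep every merged read, and add the forward (R1) read for each pair that did not merge. This assumes plain 4-line FASTQ records and that the merging tool keeps the original read ID as the first token of the header, which depends on the tool. The function names `read_fastq` and `combine` are hypothetical.

```python
def read_fastq(lines):
    """Yield (read_id, sequence, quality) from an iterable of FASTQ lines.

    Assumes strict 4-line records (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of a file
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line
        qual = next(it).rstrip("\n")
        # Use the first whitespace-delimited token, without the leading '@'.
        yield header.strip().split()[0][1:], seq, qual


def combine(merged_lines, forward_lines):
    """Return all merged reads plus R1 reads whose pair did not merge."""
    merged = list(read_fastq(merged_lines))
    merged_ids = {read_id for read_id, _, _ in merged}
    unmerged_r1 = [rec for rec in read_fastq(forward_lines)
                   if rec[0] not in merged_ids]
    return merged + unmerged_r1
```

With real files, the line iterables would be open file handles, for example `combine(open("merged.fastq"), open("R1.fastq"))`, where both file names are placeholders. If the read IDs carry a `/1` suffix or were rewritten by the merger, the ID matching would need to be adjusted.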