Hi Everyone, I've been digging around the web trying to find a tool that would allow me to clean-up my paired-end Illumina data before mapping. My pipeline thus far has been to:
1) FASTQC - my R1 file had a bit of adaptor contamination, the R2 file was fine. 2) fastx_collapser - I had a lot of data and am just mapping to determine coverage of the genome (of closely related species) to see how broad our coverage is before other analysis begins - ran on R1 and R2 seperately (files were left with a different number of sequences although it was <1% of the total number of sequences) 3) fastx_clipper - only on the file with the adaptor contamination - removed sequences containing the adaptor 4) fix pairing data - ? tool
I saw there was some tool referred to as rePair, but I have not been able to track it down. I thought for sure that fastx or picard would have something to filter out unpaired reads, but I'm just not seeing it. I'm hoping there is any easy answer here. I am planning to use bowtie2 for the alignment. Thanks in advance!