7.9 years ago by
Washington University, St Louis, USA
It depends. I would guess that many people take RNA-seq FASTQ files directly from CASAVA and feed them into downstream analysis (e.g., TopHat/Cufflinks). Some "grooming" happens automatically, in the sense that garbage reads are less likely to align and the aligner may use base qualities in deciding what counts as an acceptable alignment. Some people add a duplicate-removal step. You can also choose your own arbitrary cutoff for the average Phred score of the reads you want to keep for downstream analysis.

You should probably search this forum before asking this question. It has been addressed here (almost a duplicate question) and here. The issue of duplicate removal has also been addressed somewhat in the forum (search for duplicates), and a useful discussion on that topic can be found here and here (especially follow the link to SEQanswers).
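For the average-Phred cutoff mentioned above, here is a minimal sketch of what such a filter could look like in plain Python. It is an illustration, not a recommended pipeline. The function names (`mean_phred`, `read_fastq`, `filter_by_mean_quality`) and the default cutoff of 20 are my own assumptions. It also assumes Phred+33 quality encoding (CASAVA 1.8 and later; older Illumina pipelines used Phred+64) and strictly four lines per record with no blank lines.

```python
import io


def mean_phred(quality_line, offset=33):
    """Average Phred score of one read from its ASCII quality string.

    offset=33 is an assumption (Phred+33); use 64 for older Illumina files.
    """
    if not quality_line:
        return 0.0
    return sum(ord(c) - offset for c in quality_line) / len(quality_line)


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ.

    Stops at end of file or at the first empty header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def filter_by_mean_quality(in_handle, out_handle, min_mean=20.0, offset=33):
    """Write reads whose mean Phred score is >= min_mean; return (kept, total).

    min_mean=20.0 is an arbitrary illustrative cutoff, not a recommendation.
    """
    kept = total = 0
    for header, seq, plus, qual in read_fastq(in_handle):
        total += 1
        if mean_phred(qual, offset) >= min_mean:
            out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
            kept += 1
    return kept, total


if __name__ == "__main__":
    # Two toy reads: 'I' encodes Phred 40 and '!' encodes Phred 0 under Phred+33.
    demo = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n"
    out = io.StringIO()
    print(filter_by_mean_quality(io.StringIO(demo), out))
```

For real data you would pass open file handles instead of the `io.StringIO` objects used in the toy example. For paired-end data, both mates would need to be dropped together so the two files stay in sync, which this sketch does not handle.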