I've just received a set of paired-end metagenomic sequencing data generated on the Illumina HiSeq 2000 platform (read length = 100 bp, insert size = 350 bp), and I have two questions about the QC steps (trimming and decontamination). I have no experience with this kind of data, so any help would be highly appreciated :)
Question #1: People in my lab suggested Trim Galore! for trimming, with default values except for a Phred quality cutoff of 30 and a required overlap of at least 3 bp with the adapter sequence before a read is trimmed. Do these values make sense in a metagenomic context?
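To make the settings concrete, this is roughly the call I have in mind, written as a small Python sketch that only builds and prints the command. As far as I understand, `--quality` sets the Phred cutoff and `--stringency` sets the minimum adapter overlap; the file names are placeholders, not my real files.

```python
import shlex


def trim_galore_command(r1, r2, quality=30, stringency=3):
    """Build the Trim Galore! argument list for one paired-end sample.

    quality    -> Phred cutoff for 3' quality trimming (--quality)
    stringency -> minimum overlap with the adapter, in bp (--stringency)
    """
    return [
        "trim_galore",
        "--paired",
        "--quality", str(quality),
        "--stringency", str(stringency),
        r1,
        r2,
    ]


# Placeholder file names, used for illustration only.
cmd = trim_galore_command("sample_R1.fastq.gz", "sample_R2.fastq.gz")
print(shlex.join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` once Trim Galore! is on the PATH.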
Question #2: I was thinking of using DeconSeq for sequence decontamination but, to the best of my understanding, the results depend heavily on the chosen parameters: the alignment identity threshold (in percent), the alignment coverage threshold (in percent), and three BWA-SW parameters (chunk size of reads, Z-best value, and alignment score threshold). How should I choose these to get the best results?
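To check that I understand the first two thresholds, here is a minimal Python sketch of how I assume they interact: a read counts as contamination only if both the coverage and the identity of its best alignment reach the cutoffs. The function, its definitions of coverage and identity, and the cutoff values (94 and 90) are my own illustration, not DeconSeq's actual code or necessarily its defaults.

```python
def is_contaminant(read_length, aligned_length, matches,
                   min_identity=94.0, min_coverage=90.0):
    """Return True if an alignment passes both thresholds (in percent).

    Assumed definitions (my reading, not taken from the DeconSeq source):
      coverage = aligned part of the read / full read length
      identity = matching bases / aligned part of the read
    """
    if read_length <= 0 or aligned_length <= 0:
        return False  # nothing aligned, so nothing to flag
    coverage = 100.0 * aligned_length / read_length
    identity = 100.0 * matches / aligned_length
    return identity >= min_identity and coverage >= min_coverage


# A 100 bp read with 95 bp aligned and 93 matches: both thresholds are met.
print(is_contaminant(100, 95, 93))
# Same read with only 60 bp aligned: coverage is too low.
print(is_contaminant(100, 60, 60))
```

If this reading is right, raising either cutoff makes the filter more conservative (fewer reads removed), which is why I am unsure how to balance them.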
Thank you very much in advance!