Hi, I'm currently investigating FASTQ files from WGS. Some reads contain low-quality segments that need to be trimmed to pass FastQC. However, I'm afraid that shorter reads lead to a higher chance of mismapping, and I don't know how to quantify that. Could you please advise me on how to calculate the decrease in accuracy caused by read trimming?
For example: my sample has 600M reads with an average length of 150 bp. About 10% of the reads contain a low-quality segment, and the average length of the bad segment is 50 bp. If I trim the bad segments, how much information will I lose (compared with ideal sequencing with no errors)?
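For what it's worth, here is my own naive back-of-the-envelope estimate in Python. It only counts raw bases removed, assuming every affected read loses exactly the average 50 bp and every read is exactly 150 bp (both are simplifications of the averages above). The function name is just something I made up for illustration, and it says nothing about mapping accuracy, which is the part I don't know how to quantify.

```python
def trimmed_base_fraction(n_reads, read_len, frac_affected, bad_len):
    """Fraction of all sequenced bases removed by trimming.

    Assumes every read has length `read_len` and every affected read
    loses exactly `bad_len` bases (a simplification of the averages).
    """
    total_bases = n_reads * read_len
    trimmed_bases = n_reads * frac_affected * bad_len
    return trimmed_bases / total_bases


# Numbers from my example: 600M reads, 150 bp, 10% affected, 50 bp trimmed.
n_reads = 600_000_000
read_len = 150
frac_affected = 0.10
bad_len = 50

frac_lost = trimmed_base_fraction(n_reads, read_len, frac_affected, bad_len)
bases_lost = n_reads * read_len * frac_lost

print(f"Fraction of bases trimmed: {frac_lost:.2%}")
print(f"Bases trimmed: {bases_lost:.3g}")
```

So by this count roughly 3.3% of the bases (about 3 billion out of 90 billion) would be removed, but I suspect the real loss of information is not simply proportional to the number of bases.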
Thank you very much.