I am analyzing a NextSeq run from a bacterial data set. As in a few earlier posts here, I also found a straight run of G's at the end of the reads. Can anyone suggest how to overcome this problem? Any script or tool?
I use cutadapt with a poly-G sequence as the adapter. You should allow some errors, because the poly-G stretch sometimes contains an occasional A, C, or T base.
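To illustrate what "poly-G with some errors allowed" means, here is a minimal pure-Python sketch. It is not cutadapt's algorithm, only a rough stand-in: it walks in from the 3' end, adds 1 for every G and subtracts a penalty for every other base, and cuts at the position with the best running score. The function name `trim_poly_g_tail` and the parameters `mismatch_penalty` and `min_tail` are made up for this example, and the default values are assumptions you would want to tune on your own data.

```python
def trim_poly_g_tail(seq, mismatch_penalty=2, min_tail=5):
    """Remove a 3' poly-G tail, tolerating occasional non-G bases.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    best_score = 0
    best_start = len(seq)  # index where the tail to remove begins
    score = 0
    # Scan from the 3' end towards the 5' end.
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] in "Gg":
            score += 1
            # Only a G can start the tail; strict ">" keeps the shorter cut on ties.
            if score > best_score:
                best_score = score
                best_start = i
        else:
            score -= mismatch_penalty
    # Leave the read alone if the G run is too short to be a likely artifact.
    if len(seq) - best_start < min_tail:
        return seq
    return seq[:best_start]


if __name__ == "__main__":
    read = "ACGTACGT" + "GGGGGAGGGGGGGG"  # tail with one stray A
    print(trim_poly_g_tail(read))
```

A higher `mismatch_penalty` makes the trimming stricter (fewer stray bases tolerated), while a larger `min_tail` protects reads that simply end in a few genuine G's.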
When I analyze paired-end data, the second mate is sometimes a poly-G read, and I remove it by testing whether the read has more than 80% G's. If that's the case, I disregard the entire read (or use its mate as single-end, depending on what I do with it later).
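The 80% test can be sketched roughly like this in Python. It works on plain sequence strings so that it stays self-contained; reading and writing FASTQ is left out. The names `g_fraction`, `is_poly_g` and `split_pairs` are hypothetical, and returning the surviving mate as a single-end read is just one of the two options mentioned above.

```python
def g_fraction(seq):
    """Fraction of bases in seq that are G (0.0 for an empty read)."""
    if not seq:
        return 0.0
    return seq.upper().count("G") / len(seq)


def is_poly_g(seq, threshold=0.8):
    """True if more than `threshold` of the read is G."""
    return g_fraction(seq) > threshold


def split_pairs(pairs, threshold=0.8):
    """Sort (mate1, mate2) sequence pairs into intact pairs and single-end reads.

    A pair is kept only if neither mate is poly-G. If exactly one mate is
    poly-G, the other mate is kept as a single-end read. If both are
    poly-G, the pair is dropped.
    """
    paired, singles = [], []
    for mate1, mate2 in pairs:
        bad1 = is_poly_g(mate1, threshold)
        bad2 = is_poly_g(mate2, threshold)
        if not bad1 and not bad2:
            paired.append((mate1, mate2))
        elif not bad1:
            singles.append(mate1)
        elif not bad2:
            singles.append(mate2)
    return paired, singles


if __name__ == "__main__":
    pairs = [
        ("ACGTACGTAC", "TTGACCATGA"),  # both mates fine
        ("ACGTACGTAC", "GGGGGGGGGA"),  # second mate is 90% G
    ]
    paired, singles = split_pairs(pairs)
    print(len(paired), "pairs kept,", len(singles), "single-end reads")
```

If you would rather throw away the whole pair, just ignore the `singles` list.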