Question: How to remove poly-G in NextSeq data
HG (1.1k, Germany) wrote 4.0 years ago:

I am analyzing a NextSeq run from a bacterial data set. As in a few earlier posts here, I also found a straight run of G's at the end of the reads. Can anyone suggest how to overcome this problem? Is there any script or tool?

assembly trimming nextseq • 2.7k views
modified 4.0 years ago by Asaf (5.2k) • written 4.0 years ago by HG (1.1k)
Asaf (5.2k, Israel) wrote 4.0 years ago:

I use cutadapt with a poly-G sequence as the adapter. You should allow some errors, because the poly-G run sometimes contains an occasional A, C or T base.
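To illustrate the idea of error-tolerant poly-G trimming, here is a minimal pure-Python sketch. It is not cutadapt's implementation, just a stand-in for the same idea: scan from the 3' end, reward each G, penalize other bases, and cut at the best-scoring position. The function name `trim_poly_g` and the parameters `min_len` and `max_error_rate` are hypothetical, and the default values are assumptions to tune for your data.

```python
def trim_poly_g(seq, qual, min_len=10, max_error_rate=0.2):
    """Remove a trailing poly-G run that may contain a few non-G bases.

    Returns the (possibly shortened) sequence and quality strings.
    min_len and max_error_rate are illustrative defaults, not cutadapt's.
    """
    n = len(seq)
    best_index = n      # position where the poly-G tail would start
    best_score = 0
    score = 0
    errors = 0
    # Walk backwards from the 3' end: +1 for G, -2 for any other base.
    for i in range(n - 1, -1, -1):
        if seq[i] == "G":
            score += 1
        else:
            score -= 2
            errors += 1
        # Accept a longer tail only if it scores better and stays
        # within the allowed fraction of non-G bases.
        if score > best_score and errors <= max_error_rate * (n - i):
            best_score = score
            best_index = i
    # Leave short G runs alone; they may be genuine sequence.
    if n - best_index < min_len:
        return seq, qual
    return seq[:best_index], qual[:best_index]
```

For example, `trim_poly_g("ACGTACGTAC" + "G" * 8 + "A" + "G" * 6, "I" * 25)` would cut the read back to its first 10 bases, tolerating the single A inside the G run.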

When I analyze paired-end data, the second mate is sometimes a poly-G, and then I remove it by testing whether the read has more than 80% G's. If that's the case, I disregard the entire read (or use the other mate as single-end, depending on what I do with it later).
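The 80% test could be sketched roughly like this in Python. The helper names `g_fraction` and `classify_pair` and the returned labels are made up for illustration; only the 80% threshold comes from the description above.

```python
def g_fraction(seq):
    """Fraction of bases in seq that are G (0.0 for an empty read)."""
    return seq.count("G") / len(seq) if seq else 0.0


def classify_pair(r1_seq, r2_seq, threshold=0.8):
    """Decide what to do with a read pair based on G content.

    Returns one of (hypothetical labels):
      "paired"    - both mates look fine, keep as a pair
      "single_r1" - R2 is mostly G, keep R1 as single-end
      "single_r2" - R1 is mostly G, keep R2 as single-end
      "drop"      - both mates are mostly G
    """
    r1_bad = g_fraction(r1_seq) > threshold
    r2_bad = g_fraction(r2_seq) > threshold
    if r1_bad and r2_bad:
        return "drop"
    if r2_bad:
        return "single_r1"
    if r1_bad:
        return "single_r2"
    return "paired"
```

Whether a "single" mate is kept or thrown away is left to the caller, since that depends on the downstream tool. The important part is that both FASTQ files are filtered together so the remaining pairs stay in the same order.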


Thanks for your reply. Regarding the second part of your comment: could you please suggest how I can do this? Is there any script? I have 4 paired-end read sets for each sample.

Reply written 4.0 years ago by HG (1.1k)

What I do is run FastQC and then check whether poly-G is one of the over-represented sequences (and to what extent).

Then, actually, cutadapt with poly-G as the adapter will remove the read, but you should give it both mates as input (I think it will remove both of them, but I'm not sure).
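If you want a quick number for the "to what extent" part without going through the FastQC report, a rough Python sketch is to count how many reads end in a long run of G's. This is not what FastQC computes; the function names and the `min_tail` cutoff of 20 are assumptions, and the input is assumed to be plain 4-line FASTQ records.

```python
def read_fastq_seqs(handle):
    """Yield the sequence line of each record from a 4-line FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def poly_g_tail_fraction(handle, min_tail=20):
    """Fraction of reads ending in at least min_tail consecutive G's."""
    total = 0
    with_tail = 0
    for seq in read_fastq_seqs(handle):
        total += 1
        # Length of the uninterrupted G run at the 3' end.
        tail = len(seq) - len(seq.rstrip("G"))
        if tail >= min_tail:
            with_tail += 1
    return with_tail / total if total else 0.0
```

`handle` can be any iterable of lines, for example an open file, or `gzip.open(path, "rt")` for compressed FASTQ. Run it on R1 and R2 separately, since the problem often shows up mainly in the second mate.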

Reply written 4.0 years ago by Asaf (5.2k)

I checked with FastQC as you suggested. In my data set there are no over-represented sequences, and the mean quality score is 35. So I hope I can run the assembly directly, without any further processing of the data set. What do you think? I used SPAdes for assembly, which also has some error-correction steps (ion-hammer).

Reply written 4.0 years ago by HG (1.1k)

Sounds good, I can only dream of getting such numbers. Did you run both files (R1 and R2)?

Reply written 4.0 years ago by Asaf (5.2k)

Yes, I did. I also assembled my data set, with good output in terms of N50 value and number of contigs.

Reply written 4.0 years ago by HG (1.1k)