Question: Fastq_quality_filter: found invalid nucleotide sequence
0
2.4 years ago by
Sureshkumar V.0 wrote:

I am trying to filter reads (Illumina RNA-Seq data) at a quality threshold of 30 using the FASTX-Toolkit, but I got the following error:

fastq_quality_filter: found invalid nucleotide sequence (GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGCAGTCGTTGACGAACATCTC) on line 85766.

How can I resolve this error?

Thanks in advance.

rna-seq ngs assembly • 1.0k views
modified 2.4 years ago by Sej Modha (4.7k) • written 2.4 years ago by Sureshkumar V. (0)

The sequence contains 'W' (and possibly other unexpected characters), which might be causing the error. You could try other tools, such as FastQC.

modified 2.4 years ago • written 2.4 years ago by Sej Modha (4.7k)
1

'W' should be an allowed character, since it is the IUPAC code for the weak bases (A or T). I've never seen an 'E', and it is not an IUPAC nucleotide code; this might be the problem.
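To make the distinction concrete, here is a minimal Python sketch that sorts the non-ACGT characters of the reported read into IUPAC ambiguity codes and characters that are not nucleotide codes at all. The function name `classify_sequence` is made up for illustration, and the check only reflects the IUPAC alphabet; it says nothing about which characters fastq_quality_filter itself accepts.

```python
# Plain bases (U is the RNA counterpart of T).
PLAIN_BASES = set("ACGTU")
# IUPAC ambiguity codes, including N (any base).
IUPAC_AMBIGUOUS = set("RYSWKMBDHVN")


def classify_sequence(seq):
    """Return (ambiguity codes, non-IUPAC characters) found in seq, each sorted."""
    seq = seq.upper()
    ambiguous = sorted({c for c in seq if c in IUPAC_AMBIGUOUS})
    invalid = sorted(
        {c for c in seq if c not in PLAIN_BASES and c not in IUPAC_AMBIGUOUS}
    )
    return ambiguous, invalid


# The read quoted in the error message above.
read = (
    "GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGCAGTCGTTGACGAACATCTC"
)
ambiguous, invalid = classify_sequence(read)
print("IUPAC ambiguity codes:", ambiguous)  # ['W']
print("Not IUPAC at all:", invalid)         # ['E']
```

Running the same check over every sequence line of the file would show whether this read is an isolated case or whether the file was altered by some earlier processing step.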

There are many more recent alternatives to the rather old FASTX-Toolkit. Just to name a few: BBDuk, Trim Galore, or Trimmomatic.

written 2.4 years ago by michael.ante (3.6k)

How did you get ambiguity codes in your raw RNA-Seq data? What technology is this data from, and has it been pre-processed in some fashion?

modified 2.4 years ago • written 2.4 years ago by genomax (87k)
0
2.4 years ago by
egeulgen (980), Istanbul, wrote:

Your sequence seems to contain ambiguity codes. Simply remove the affected reads and you should be fine.
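One way to do that removal is sketched below in Python. It is only an illustration under stated assumptions: every record is exactly four lines (no wrapped sequences), and the allowed alphabet is taken to be A, C, G, T and N, which is a guess at what the downstream tool accepts rather than something confirmed in this thread. The function name `filter_fastq` and the example reads are made up.

```python
# Assumption: the downstream tool accepts only these characters.
ALLOWED = set("ACGTN")


def filter_fastq(lines, allowed=ALLOWED):
    """Yield 4-line FASTQ records whose sequence uses only allowed characters."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # record[1] is the sequence line of the record.
            if set(record[1].upper()) <= allowed:
                yield record
            record = []


# Tiny in-memory example: the second read contains 'W' and 'E' and is dropped.
example = [
    "@read1", "ACGTN", "+", "IIIII",
    "@read2", "ACGWE", "+", "IIIII",
]
kept = list(filter_fastq(example))
for rec in kept:
    print("\n".join(rec))
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of the list. For paired-end data, the mate of each dropped read would also have to be removed so that the two files stay in sync. Since characters such as 'E' are not expected in raw Illumina output, it is also worth checking whether the file was corrupted or modified before filtering.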

written 2.4 years ago by egeulgen (980)

You can have a look at the ambiguity codes here

written 2.4 years ago by Nandini (840)

Ok, egeulgen, I will try and let you know.

written 2.4 years ago by Sureshkumar V. (0)