Question: Fastq_quality_filter: found invalid nucleotide sequence
20 months ago, Sureshkumar V. wrote:

I am trying to filter reads (Illumina RNA-Seq data) with a quality cutoff of 30 using the FASTX-Toolkit, but I get the following error:

fastq_quality_filter: found invalid nucleotide sequence (GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGCAGTCGTTGACGAACATCTC) on line 85766.

How can I resolve this error?

Thanks in advance.

Tags: rna-seq, ngs, assembly • 807 views
modified 20 months ago by Sej Modha • written 20 months ago by Sureshkumar V.

The sequence contains 'W' (and possibly other unexpected characters), which might be causing the error. You can try other tools such as FastQC.
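To see which unexpected characters occur and on which lines, a quick scan of the file can help. The following is only a minimal sketch: the function name `find_invalid_reads` is made up for illustration, it assumes strict 4-line FASTQ records (no wrapped sequences), and it assumes that only A, C, G, T and N are acceptable, which you may need to adjust.

```python
from collections import Counter

# Assumption: only these characters are acceptable in a sequence line.
ALLOWED = frozenset("ACGTN")


def find_invalid_reads(lines, allowed=ALLOWED):
    """Yield (line_number, read_id, bad_char_counts) for offending reads.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumes every record is exactly 4 lines: header, sequence, '+', quality.
    """
    read_id = None
    for line_no, line in enumerate(lines, start=1):
        position = line_no % 4
        if position == 1:          # header line, remember the read ID
            read_id = line.strip()
        elif position == 2:        # sequence line, check its characters
            bad = Counter(c for c in line.strip() if c not in allowed)
            if bad:
                yield line_no, read_id, dict(bad)


# Small in-memory demo using the read from the error message
# (the read ID and quality string are placeholders).
seq = ("GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGC"
       "AGTCGTTGACGAACATCTC")
demo = ["@example_read", seq, "+", "I" * len(seq)]
for line_no, read_id, bad in find_invalid_reads(demo):
    print(line_no, read_id, bad)
```

For a real file, pass an open handle instead of the list, e.g. `find_invalid_reads(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. If many reads are reported, or the line numbers do not fall on sequence lines, the file itself may be damaged rather than containing a few odd bases.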

modified 20 months ago • written 20 months ago by Sej Modha

'W' should be an allowed character, since it is the IUPAC code for the weak bases (A or T). I've never seen an 'E', which is not an IUPAC nucleotide code; this might be the problem.

There are many more recent alternatives to the rather old FASTX-Toolkit. To name a few: BBDuk, Trim Galore, or Trimmomatic.

written 20 months ago by michael.ante

How did you get ambiguous codes in your raw RNAseq data? What technology is this data from and has it been pre-processed in some fashion?

modified 20 months ago • written 20 months ago by genomax
20 months ago, egeulgen (Istanbul) wrote:

Your sequence seems to contain ambiguity codes. Simply remove those reads and you should be fine.
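One way to do the removal is to drop whole records (header, sequence, '+' and quality together) rather than deleting single characters, since editing only the sequence would leave the quality string out of sync. A minimal sketch follows; the function names and the file names in the usage note are hypothetical, records are assumed to be exactly 4 lines each, and the allowed alphabet A, C, G, T, N is an assumption you can change.

```python
# Assumption: only these characters are acceptable in a sequence line.
ALLOWED = frozenset("ACGTN")


def keep_valid_records(lines, allowed=ALLOWED):
    """Yield 4-tuples (header, sequence, plus, quality) for clean reads.

    Assumes strict 4-line FASTQ records; an incomplete trailing record
    is silently ignored.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) == 4:
            # Keep the record only if every sequence character is allowed.
            if set(record[1]) <= allowed:
                yield tuple(record)
            record = []


def filter_fastq(in_path, out_path, allowed=ALLOWED):
    """Write clean records from in_path to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for record in keep_valid_records(src, allowed):
            dst.write("\n".join(record) + "\n")
            kept += 1
    return kept
```

Usage would look like `filter_fastq("reads.fastq", "reads.clean.fastq")`, with both file names being placeholders. For paired-end data the same reads would have to be dropped from both files to keep the pairs in sync, which this sketch does not handle.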

written 20 months ago by egeulgen

You can have a look at the ambiguity codes here
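For quick reference, the standard IUPAC nucleotide codes can be written down as a small lookup table. This is only an illustrative sketch (the names `IUPAC_NUCLEOTIDES` and `expand` are made up here); it shows that 'W' is a defined ambiguity code while 'E' is not a nucleotide code at all.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG",   # purine
    "Y": "CT",   # pyrimidine
    "S": "CG",   # strong
    "W": "AT",   # weak
    "K": "GT",   # keto
    "M": "AC",   # amino
    "B": "CGT",  # not A
    "D": "AGT",  # not C
    "H": "ACT",  # not G
    "V": "ACG",  # not T
    "N": "ACGT", # any base
}


def expand(symbol):
    """Return the bases a symbol can represent, or None if it is not an IUPAC code."""
    return IUPAC_NUCLEOTIDES.get(symbol.upper())


for symbol in "WE":
    print(symbol, expand(symbol))
```

Being a valid IUPAC code does not guarantee that a given tool accepts the character, so it is worth checking which alphabet your filter actually supports.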

written 20 months ago by Nandini

Ok, egeulgen, I will try and let you know.

written 20 months ago by Sureshkumar V.