Question: Fastq_quality_filter: found invalid nucleotide sequence
14 months ago by
Sureshkumar V. wrote:

I am trying to filter reads (Illumina RNA-Seq data) at a minimum quality score of 30 using the FASTX-Toolkit, but I get this error:

fastq_quality_filter: found invalid nucleotide sequence (GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGCAGTCGTTGACGAACATCTC) on line 85766.

How can I resolve this error?

Thanks in advance.

rna-seq ngs assembly • 591 views
modified 14 months ago by Sej Modha • written 14 months ago by Sureshkumar V.

The sequence contains 'W' (and maybe other unexpected characters), which might be causing the error. You can try other tools such as FastQC.

modified 14 months ago • written 14 months ago by Sej Modha

'W' should be an allowed character, since it is the IUPAC ambiguity code for the weak bases (A or T). I've never seen an 'E', which is not an IUPAC nucleotide code; this might be the problem.
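To make the distinction concrete, here is a minimal Python sketch that sorts the characters of the read from the error message into IUPAC ambiguity codes and characters that are not IUPAC nucleotide codes at all. The function names are made up for illustration, and the check says nothing about which characters fastq_quality_filter itself accepts.

```python
# Standard IUPAC nucleotide codes: the bases plus the ambiguity codes.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def non_iupac_characters(sequence):
    """Return the sorted characters that are not IUPAC nucleotide codes."""
    return sorted(set(sequence.upper()) - IUPAC_NUCLEOTIDES)


def ambiguity_characters(sequence):
    """Return the sorted IUPAC ambiguity codes (anything except A, C, G, T, U)."""
    return sorted((set(sequence.upper()) & IUPAC_NUCLEOTIDES) - set("ACGTU"))


# The read quoted in the error message from the question.
read = (
    "GCGGAGWAACCGTTCGGCEACCAGGTGGCATCGCCGCCGAGGGWGCTCCCGTGGCGCGGGCAGTCGTTGACGAACATCTC"
)

print(non_iupac_characters(read))   # ['E']
print(ambiguity_characters(read))   # ['W']
```

For this read, 'W' is the only ambiguity code and 'E' is the only character outside the IUPAC alphabet.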

There are many more recent alternatives to the rather old FASTX-Toolkit. Just to name a few: BBDuk, Trim Galore, or Trimmomatic.

written 14 months ago by michael.ante

How did you get ambiguity codes in your raw RNA-Seq data? What technology is this data from, and has it been pre-processed in some fashion?

modified 14 months ago • written 14 months ago by genomax
14 months ago by
egeulgen (Istanbul) wrote:

Your sequence seems to contain ambiguity codes. Simply remove the reads that contain them and you should be fine.
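One way to do that removal, sketched in Python under a few assumptions: the FASTQ file is unwrapped (exactly four lines per record), sequences are uppercase, and only A, C, G, T and N should be kept. The function names and the file names in the usage note are hypothetical. Dropping the reads does not explain how the unexpected characters got into the file in the first place.

```python
import re

# Assumption: only A, C, G, T and N are acceptable to the downstream tool.
VALID_READ = re.compile(r"[ACGTN]+")


def filter_fastq(lines):
    """Yield 4-line FASTQ records whose sequence contains only A, C, G, T, N.

    Assumes unwrapped FASTQ, i.e. every record is exactly four lines.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # record[1] is the sequence line of the record.
            if VALID_READ.fullmatch(record[1]):
                yield record
            record = []


def filter_fastq_file(in_path, out_path):
    """Copy the clean records from in_path to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for record in filter_fastq(src):
            dst.write("\n".join(record) + "\n")
            kept += 1
    return kept


# Small in-memory demonstration with two made-up records.
demo = [
    "@read1", "ACGTN", "+", "IIIII",
    "@read2", "ACGWE", "+", "IIIII",
]
print([record[0] for record in filter_fastq(demo)])  # ['@read1']
```

On a real file this would be called as, for example, `filter_fastq_file("reads.fastq", "reads.clean.fastq")`. For paired-end data the mate of each dropped read would also have to be removed to keep the two files in sync, which this sketch does not do.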


You can have a look at the ambiguity codes here

written 14 months ago by Nandini

Ok, egeulgen, I will try and let you know.

written 14 months ago by Sureshkumar V.