Question: Producing the reverse complement of each sequence in FASTQ files
2.2 years ago, nakanomasayuki265 wrote:

I want to produce the reverse complement of each sequence in FASTQ files. I tried fastx_reverse_complement from the FASTX-Toolkit, but I got the following error message:

fastx_reverse_complement: Invalid quality score value (char '#' ord 35 quality value -29) on line 4

Is something wrong here? Is there any other software that produces the reverse complement of each sequence in FASTQ files?


I think you can write your own script to get the reverse complement using Biopython.
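
To illustrate the scripting idea, here is a minimal sketch. It uses only the Python standard library rather than Biopython, so it runs without extra installs; the function name reverse_complement_fastq and the complement table are assumptions made for this example, and it only handles strict 4-line FASTQ records. The key point is that the quality string must be reversed along with the sequence.

```python
import io

# Complement table for unambiguous DNA bases plus N (an assumption of this
# sketch; other IUPAC codes pass through unchanged).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_fastq(in_handle, out_handle):
    """Reverse-complement each record of a strict 4-line FASTQ stream."""
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = in_handle.readline().rstrip("\r\n")
        plus = in_handle.readline().rstrip("\r\n")
        qual = in_handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("not a 4-line FASTQ record: " + header.strip())
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch: " + header.strip())
        out_handle.write(header.rstrip("\r\n") + "\n")
        # Complement the bases, then reverse the string.
        out_handle.write(seq.translate(COMPLEMENT)[::-1] + "\n")
        out_handle.write(plus + "\n")
        # Qualities are only reversed so each score stays with its base.
        out_handle.write(qual[::-1] + "\n")


# Usage with the toto.fq record quoted elsewhere in this thread.
source = io.StringIO("@test\nAAACCTGG\n+\nIII#IIEE\n")
result = io.StringIO()
reverse_complement_fastq(source, result)
print(result.getvalue(), end="")
```

For real files you would pass handles from open() or gzip.open(path, "rt") instead of the in-memory strings. Because the sketch never decodes the quality characters, it does not care which Phred offset the file uses.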

written 2.2 years ago by bharata1803

Maybe your version of FASTX-Toolkit is too old?

$ cat toto.fq 
@test
AAACCTGG
+
III#IIEE

$ fastx_reverse_complement -i toto.fq 
@test
CCAGGTTT
+
EEII#III

$ fastx_reverse_complement -h
usage: fastx_reverse_complement [-h] [-r] [-z] [-v] [-i INFILE] [-o OUTFILE]
Part of FASTX Toolkit 0.0.14 by A. Gordon (assafgordon@gmail.com)

Edit: link to the commit that deprecated -Q33.

modified 2.2 years ago • written 2.2 years ago by Charles Plessy

For a possible solution, see the thread titled: bowtie "Saw ASCII character -54 but expected 33-based Phred qual." after -Q 33 fastx reverse complement

modified 2.2 years ago • written 2.2 years ago by Michael Dondrup

Can you explain to me why you are after the reverse complement of the FASTQ sequences please? Thanks.

written 2.2 years ago by theobroma22
2.2 years ago, shenwei356 (China) wrote:

You may try seqkit (v0.4.5 or later; run "seqkit version" to check), which provides prebuilt executables for Linux, Windows, and OS X. Just download, decompress, and use it right away.

$ seqkit seq t.fq.gz 
@K00137:236:H7NLVBBXX:6:1126:29721:23241 1:N:0
TGGTAGGGAGTTGAGTAGCATGGGTATAGTATAGTGTCATGATGCCAGATTTTAAAAAAAATACTGGAGA
+
```eeiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

$ seqkit seq -r -p  t.fq.gz 
@K00137:236:H7NLVBBXX:6:1126:29721:23241 1:N:0
TCTCCAGTATTTTTTTTAAAATCTGGCATCATGACACTATACTATACCCATGCTACTCAACTCCCTACCA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiee```

All in one:

seqkit seq -r -p t.fq.gz | gzip -c  > new.fq.gz  # faster

or

seqkit seq -r -p t.fq.gz -o new.fq.gz

That said, it seems nobody reverse-complements FASTQ sequences.


Best solution, and it worked.

written 4 weeks ago by SmallChess
2.2 years ago, theobroma22 wrote:

The error says you have an invalid quality score, so perhaps negative quality values, such as the -29 in the example you provided, are not allowed. Hope this helps.
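
To make the arithmetic behind that -29 concrete, here is a small sketch. It assumes the tool decoded quality characters with an offset of 64 (Phred+64), which is consistent with the numbers in the error message: '#' has ASCII code 35, and 35 - 64 = -29. With an offset of 33 (Phred+33) the same character is a valid score of 2. The helper names phred_score and guess_offset are made up for this example, and the thresholds in guess_offset are a common rule of thumb, not a guarantee.

```python
def phred_score(char, offset):
    """Decode one FASTQ quality character for a given ASCII offset."""
    return ord(char) - offset


def guess_offset(quality_line):
    """Rough guess at the ASCII offset of a quality string.

    Characters below ';' (ASCII 59) do not occur in +64 encodings, and
    characters above 'J' (ASCII 74) are unusual in Illumina +33 data.
    Returns 33, 64, or None when the line is ambiguous.
    """
    codes = [ord(c) for c in quality_line]
    if min(codes) < 59:
        return 33
    if max(codes) > 74:
        return 64
    return None


print(phred_score("#", 64))      # -29, the value in the error message
print(phred_score("#", 33))      # 2, a valid low-quality score
print(guess_offset("III#IIEE"))  # 33
```

If a file turns out to be Phred+33, either a tool version that defaults to that offset or an explicit offset option (where the tool has one) should avoid the negative value.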
