Question

fastx issue with reverse_complement: change in overrepresented sequences

8.6 years ago

gufernandez10 ▴ 10

Hi, I'm working with fastx to get the reverse complement of a FASTQ file downloaded from SRA and split with sra-toolkit. My problem is that after running fastx_reverse_complement, the overrepresented sequences identified by FastQC change. I would expect the overrepresented sequences to have the same read counts in both files, and the sequences in the second file to be the reverse complements of those in the first. The command used was:

fastx_reverse_complement -Q 33 -i FILE2.fastq -o FILE2_rev_com.fastq

The first two overrepresented sequences that FastQC detects in the original file (sequence, count, percentage):

AGGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAA    143240    1.7818072306991304
GGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAAA    76434     0.9507864693609142

The first two overrepresented sequences that FastQC detects after running fastx_reverse_complement on the original file:

CTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGGA    136155    1.6936746962848372
CCTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGG    70491     0.8768596306842531
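
To make the expectation concrete, here is a minimal Python sketch that reverse-complements the first sequence from the original report and compares it with the first sequence from the second report. The helper names `reverse_complement` and `suffix_prefix_overlap` are made up for illustration and are not part of fastx or FastQC. The two strings are not exact reverse complements of each other, but the end of the second-report sequence does overlap the start of the expected reverse complement. That might mean the two reports cover different parts of longer reads, though that is only a guess.

```python
# Hypothetical helpers for checking the FastQC output by hand;
# nothing here calls fastx or FastQC.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# First overrepresented sequence from each FastQC report above.
original = "AGGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAA"
after_rc = "CTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGGA"

expected = reverse_complement(original)

# Is the reported sequence the exact reverse complement of the original one?
print("exact match:", expected == after_rc)

# How many bases at the end of the reported sequence match the start
# of the expected reverse complement?
print("overlap length:", suffix_prefix_overlap(after_rc, expected))
```

The same check can be repeated for the second pair of sequences by swapping in the other two strings.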

Any idea what is happening here or what I'm doing wrong?

RNA-Seq fastx fastq • 1.5k views

updated 19 months ago by Ram 43k • written 8.6 years ago by gufernandez10 ▴ 10