Problem with N sequences in fastqc file
4.5 years ago
carina2817 ▴ 20

Hello,

I am trying to filter a fastq file. I ran FastQC to get a quality report, and it flags an overrepresented sequence:

sequence: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 39317 percentage: 0.13862182817994162

The fastq file has 28362777 sequences and the read length is 125.

I used cutadapt to remove it:

gunzip -c SRR9667734_S_sp_2.fastq.gz |  cutadapt -m 20 -e 0.1 -z -a NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN - -o SRR9667734_S_sp_cutadapt_2.fastq.gz

However, the resulting file still contains those overrepresented sequences, and the number of sequences in the fastq file dropped to 68122 after running cutadapt.

Overrepresented sequences:

sequence: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 39317 percentage: 57.71556912597986

sequence: ANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 1172 percentage: 1.7204427350929214

sequence: GNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 1014 percentage: 1.488505915856845

sequence: CNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 895 percentage: 1.3138193241537244

sequence: TNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN count: 864 percentage: 1.268312733037785

Any idea of what's happening?


Not answering your question, but you can try bbduk.sh from the BBMap suite to remove reads with Ns. Its help text describes the relevant option as: maxns=-1 "If non-negative, reads with more Ns than this (after trimming) will be discarded".
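
For illustration, the effect of a setting like maxns=0 (drop every read that still contains an N) can be approximated with plain awk. This is only a sketch of the filtering idea, not BBDuk's implementation: the function name drop_n_reads is made up here, it assumes standard 4-line FASTQ records, and it counts only uppercase N. The bbduk.sh call in the comment uses placeholder file names.

```shell
# Roughly what one would ask BBDuk to do (placeholder file names):
#   bbduk.sh in=reads.fastq.gz out=clean.fastq.gz maxns=0

# Hypothetical helper: read FASTQ on stdin, keep only records whose
# sequence line contains at most $1 "N" characters.
drop_n_reads() {
    awk -v max_n="$1" '
        { rec[NR % 4] = $0 }                                   # buffer the current record line
        NR % 4 == 2 { seq = $0; n_count = gsub(/N/, "", seq) } # count Ns on a copy of the sequence
        NR % 4 == 0 && n_count <= max_n {                      # quality line closes the record
            print rec[1]; print rec[2]; print rec[3]; print rec[0]
        }
    '
}

# Example use (placeholder file names):
#   gunzip -c reads.fastq.gz | drop_n_reads 0 | gzip > clean.fastq.gz
```

With a threshold of 0 this removes the all-N reads and also any read with a single N, so a larger threshold may be preferable if isolated Ns are acceptable.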


For starters, maybe put the -o option before the input. And I'm pretty sure cutadapt can handle gzipped files, so no need to decompress.
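
A minimal sketch of that suggestion, assuming the goal is only the minimum-length filter from the original command: options first, the input file last, and the gzipped file passed directly instead of piping it through gunzip. The wrapper name trim_min_length and the file names are placeholders, not anything from the original post.

```shell
# Hypothetical wrapper around cutadapt; $1 = input FASTQ, $2 = output FASTQ.
trim_min_length() {
    in_fastq="$1"
    out_fastq="$2"
    # -m 20 discards reads shorter than 20 bases after processing.
    # Input and output may end in .gz; cutadapt handles the compression.
    cutadapt -m 20 -o "$out_fastq" "$in_fastq"
}

# Example use (placeholder file names):
#   trim_min_length SRR9667734_S_sp_2.fastq.gz SRR9667734_S_sp_cutadapt_2.fastq.gz
```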

4.5 years ago
Jianyu ▴ 580

See the documentation about wildcard interpretation in cutadapt: https://cutadapt.readthedocs.io/en/stable/guide.html#wildcards
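
As I read that section, an N in the adapter matches any base in the read, while N characters in the reads themselves are not treated as wildcards unless --match-read-wildcards is given. If that is right, an adapter made only of Ns would match inside nearly every normal read, which is then trimmed below the -m 20 cutoff and discarded, while the all-N reads are left untouched. The counts in the question fit that picture: the same 39317 reads are still there, only the total shrank. A quick check with the numbers from the post (the helper name share is made up for this sketch):

```shell
# Hypothetical helper: print part/whole as a percentage with 4 decimals.
share() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.4f\n", 100 * part / whole }'
}

share 39317 28362777   # all-N reads before cutadapt -> 0.1386
share 39317 68122      # all-N reads after cutadapt  -> 57.7156
share 68122 28362777   # reads that survived at all  -> 0.2402
```

The first two values reproduce the percentages FastQC reported before and after the cutadapt run.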

The right way to remove N bases from a fastq file: https://cutadapt.readthedocs.io/en/stable/guide.html#dealing-with-n-bases
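
A sketch of what that section points to, assuming a cutadapt version that provides --trim-n and --max-n. The wrapper name filter_n_bases, the threshold of 0, and the file names are illustrative choices, not taken from the original post.

```shell
# Hypothetical wrapper around cutadapt; $1 = input FASTQ, $2 = output FASTQ.
filter_n_bases() {
    in_fastq="$1"
    out_fastq="$2"
    # --trim-n   trims N bases from the ends of each read
    # --max-n 0  discards reads that still contain any N afterwards
    # -m 20      discards reads shorter than 20 bases (as in the original command)
    cutadapt --trim-n --max-n 0 -m 20 -o "$out_fastq" "$in_fastq"
}

# Example use (placeholder file names):
#   filter_n_bases SRR9667734_S_sp_2.fastq.gz SRR9667734_S_sp_cutadapt_2.fastq.gz
```

No -a adapter is given here, so nothing is matched as a wildcard; the all-N reads are removed by the N and length filters instead.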
