N base in RNA-seq data
4.4 years ago

Hi, I am aligning published RNA-seq data with the STAR aligner. FastQC shows that for some samples the N content rises to 12% at bases 4 and 22 (orange warning). Should I trim out the N positions?

Thanks

RNA-Seq alignment • 836 views

It looks like this data is not good, especially if you consistently have N's at specific cycles. Such data should not have been released by the sequencing facility. You can't simply delete the N's from within the reads, since that would shift the downstream bases and mess up the reading frame.
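To confirm which cycles are affected outside of FastQC, the per-position N fraction can be tallied directly from the FASTQ file. The following is only a minimal sketch: the function name `n_fraction_per_position` is hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequence lines) and uncompressed text input.

```python
import io
from collections import Counter


def n_fraction_per_position(handle):
    """Return a list where item i is the fraction of reads with 'N' at position i (0-based).

    Assumes each FASTQ record is exactly 4 lines (header, sequence, '+', quality).
    """
    n_counts = Counter()  # reads with an N at each position
    totals = Counter()    # reads long enough to cover each position
    for line_no, line in enumerate(handle):
        if line_no % 4 != 1:  # the sequence is the 2nd line of each record
            continue
        seq = line.strip().upper()
        for pos, base in enumerate(seq):
            totals[pos] += 1
            if base == "N":
                n_counts[pos] += 1
    return [n_counts[pos] / totals[pos] for pos in range(len(totals))]


# Toy input: two reads, one of which has an N at the 4th base.
example = io.StringIO(
    "@r1\nACGNACGT\n+\nIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
)
fractions = n_fraction_per_position(example)
```

With a real file, the same function could be called on `open("your_read.fq")`; positions where the fraction spikes would correspond to the problem cycles.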

You can filter out reads containing N's with reformat.sh from the BBMap suite:

reformat.sh in=your_read.fq out=filtered.fq maxns=0

or, if the reads are paired-end (note the separate input and output files for each mate):

reformat.sh in1=your_read_R1.fq in2=your_read_R2.fq out1=filtered_R1.fq out2=filtered_R2.fq maxns=0
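For anyone who wants to see what this kind of filter amounts to, here is a rough Python sketch of dropping read pairs that contain N's. It is not how BBMap is implemented; the helper names `read_fastq` and `filter_pairs` are hypothetical, the input is assumed to be uncompressed 4-line FASTQ with mates in the same order in both files, and dropping the whole pair when either mate fails is an assumption made here to keep the two files in sync.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line means end of file
            return
        yield tuple(record)


def filter_pairs(r1_handle, r2_handle, max_ns=0):
    """Keep a pair only if neither mate has more than max_ns N bases."""
    kept = []
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        n1 = rec1[1].upper().count("N")
        n2 = rec2[1].upper().count("N")
        if n1 <= max_ns and n2 <= max_ns:
            kept.append((rec1, rec2))
    return kept


# Toy input: the second pair has an N in mate 1, so only the first pair survives.
r1 = io.StringIO("@p1/1\nACGT\n+\nIIII\n@p2/1\nACNT\n+\nIIII\n")
r2 = io.StringIO("@p1/2\nTTTT\n+\nIIII\n@p2/2\nGGGG\n+\nIIII\n")
kept = filter_pairs(r1, r2, max_ns=0)
```

For real data the BBMap command is the more practical route, since it handles compressed files and large inputs efficiently.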


Thank you for the tool suggestion, I will test it.
