How to delete all FASTQ reads that include a potential 50 bp Illumina Single End PCR Primer 1
10.7 years ago
Tonyzeng ▴ 310

Hi,

I ran FastQC and found a potential 50 bp Illumina Single End PCR Primer 1 sequence in my reads:

AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA (100% over 30bp)

I checked my reads and found that this 50 bp sequence sits at the 5' end of the affected reads, which account for 0.25% of all reads. (In some of these reads, a short sequence such as GCGCA, GCTCAG, AACCG or AACAAAAGG precedes the 50 bp sequence.)

Since my reads are all 88 bp long, I do not want to keep these reads even after cutting off the 50 bp sequence.

Does anyone know of a tool that can discard reads containing this 50 bp sequence? Or does anyone have a script or another way to do this?

EDIT:

My aim is to get rid of the reads that contain the sequence AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA. Since the reads containing this 50 bp sequence account for only 0.25% of the total, the Fastq toolkit trimmer or other trimming tools cannot help.

I need to remove the reads that contain this 50 bp noise sequence from my library before I map them with BWA.

trimming • 3.5k views

Please edit your original question rather than adding 'answers'.

10.7 years ago

You could use Cutadapt or Trim Galore (a wrapper around Cutadapt that simplifies, e.g., the handling of paired-end reads) and specify that reads shorter than a certain length after trimming should be discarded. You can pass the primer sequence to the program as an option.
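If a small script is preferred over a trimming tool, the following is a minimal sketch in plain Python that drops every read containing the primer instead of trimming it. It assumes an uncompressed FASTQ file with exactly four lines per record and looks only for an exact, forward-strand match; the function names `read_fastq` and `filter_reads` and the file names in the comment are made up for illustration.

```python
# Primer sequence reported by FastQC in the question.
PRIMER = "AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA"


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:            # end of file
            return
        if not header.strip():    # tolerate stray blank lines
            continue
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, plus, qual


def filter_reads(in_handle, out_handle, probe=PRIMER):
    """Copy reads that do NOT contain `probe`; return (kept, dropped) counts."""
    kept = dropped = 0
    for header, seq, plus, qual in read_fastq(in_handle):
        if probe in seq.upper():  # exact match only, no mismatches allowed
            dropped += 1
            continue
        out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
        kept += 1
    return kept, dropped


# Example use (hypothetical file names):
# with open("reads.fastq") as fin, open("reads.filtered.fastq", "w") as fout:
#     print(filter_reads(fin, fout))
```

Because the match is exact, reads whose primer copy carries a sequencing error will slip through; passing a shorter `probe` (for example a 30 bp stretch of the primer) is one way to loosen the filter, at the cost of a higher chance of removing genuine reads. Cutadapt also has a `--discard-trimmed` option that discards reads in which the adapter was found and tolerates mismatches, so it may be the more robust choice.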
