Question

What can I use to identify primer sequences in my fasta files?

0

6.8 years ago

c.older • 0

I have NGS data from the V4 region of the 16S rRNA gene, but I do not know which primers were used. I'm fairly sure the primer sequences are still in the files because the data are unprocessed, and I was able to remove primers of known sequence from another data set from the same sequencer. Are there tools that can help me work out what these sequences are so that I can remove them appropriately? I was considering just trimming 20 bp from each end, but would prefer to remove the specific sequences. I have tried aligning multiple sequences to look for a common sequence; I think I have found the reverse primer but am not confident about the forward one.

next-gen sequencing sequence • 4.1k views

updated 6.8 years ago by Friederike 9.0k • written 6.8 years ago by c.older • 0

0

I am not an expert, but the Trim Galore documentation says that it detects and removes adapters automatically and can then run FastQC. You need to have Cutadapt and FastQC installed to be able to run it. https://github.com/FelixKrueger/TrimGalore/blob/master/Docs/Trim_Galore_User_Guide.md

6.8 years ago by piyushjo ▴ 710

0

If you have paired-end reads, and enough of them have inserts shorter than the read length, you can do this with BBTools:

bbmerge.sh in1=r1.fq in2=r2.fq outa=adapters.fa

The outa file will contain the adapter sequences.

6.8 years ago by GenoMax 152k

score 0 · Answer 1 · 2018-09-04

0

6.8 years ago

Friederike 9.0k

Have you tried cutadapt? It sounds like it should be up to the job.
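Cutadapt needs to be given the sequence to remove, so if the primers are unknown you first need a candidate. One rough way to get one, along the lines of the alignment approach described in the question, is to look at how conserved each position is at the start of the reads. The Python sketch below is only an illustration under some assumptions: every read starts directly with the primer (no barcode or variable-length spacer in front), and the function names `read_fasta` and `start_consensus`, the 25-base window, and the 0.9 threshold are arbitrary choices of mine, not part of any tool.

```python
from collections import Counter
import io


def read_fasta(handle):
    """Yield sequences (uppercase, headers dropped) from a FASTA file handle."""
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks).upper()
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def start_consensus(seqs, length=25, min_fraction=0.9):
    """Consensus of the first `length` bases; 'N' where no single base dominates."""
    counts = [Counter() for _ in range(length)]
    for seq in seqs:
        for i, base in enumerate(seq[:length]):
            counts[i][base] += 1
    consensus = []
    for counter in counts:
        total = sum(counter.values())
        if total == 0:  # no read is long enough to reach this position
            break
        base, n = counter.most_common(1)[0]
        consensus.append(base if n / total >= min_fraction else "N")
    return "".join(consensus)


# Toy data: three reads sharing a made-up 8-base prefix, then diverging.
demo = io.StringIO(">r1\nACGTACGTTTGA\n>r2\nACGTACGTCCAT\n>r3\nACGTACGTGGCA\n")
print(start_consensus(read_fasta(demo), length=12))  # prints ACGTACGTNNNN
```

On real data you would pass an open file instead of the toy `io.StringIO` object, for example `start_consensus(read_fasta(open("reads.fasta")))` with your own file name. A long run of conserved bases at the start that then breaks down into N is a plausible primer candidate. Degenerate primer positions will also show up as N inside the conserved run, so it is worth comparing the result against published V4 primer sets before trimming. For the reverse primer, the same check can be run on the reverse reads, or on the reverse complement of the read ends if the reads are merged.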
