Question: What can I use to identify primer sequences in my fasta files?
9 months ago
c.older wrote:

I have NGS data of the V4 region of 16S rRNA, but I don't know which primers were used. I'm fairly sure the primer sequences are still in the files because the data is unprocessed, and I was able to remove the primers (of known sequence) from another data set from the same sequencer. Are there tools that can help me work out what these sequences are so that I can remove them properly? I was considering just trimming 20 bp from each end, but would prefer to remove the specific sequences. I have tried aligning multiple sequences to look for a common sequence, and I think I have found the reverse primer but am not confident about the forward one.
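The manual approach described above (looking for a sequence shared by the read starts) can be sketched in Python. This is only a rough illustration, assuming the primer sits at the very start of each read with no variable-length spacer in front of it. The function names `read_fasta` and `prefix_consensus`, the `length` of 20 and the `min_frac` threshold are all made up for this example. Positions where more than one base is common are reported as IUPAC ambiguity codes, since 16S primers are often degenerate.

```python
from collections import Counter

# IUPAC ambiguity codes, keyed by the set of bases each code stands for
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def read_fasta(path):
    """Yield the sequences of a FASTA file (multi-line records are joined)."""
    seq = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seq:
                    yield "".join(seq)
                seq = []
            elif line:
                seq.append(line.upper())
    if seq:
        yield "".join(seq)


def prefix_consensus(seqs, length=20, min_frac=0.1):
    """Consensus of the first `length` bases across all sequences.

    At each position, every base seen in at least `min_frac` of the reads
    is kept, and the kept bases are merged into one IUPAC code.
    """
    counts = [Counter() for _ in range(length)]
    for s in seqs:
        for i, base in enumerate(s[:length]):
            if base in "ACGT":
                counts[i][base] += 1
    consensus = []
    for c in counts:
        total = sum(c.values())
        if total == 0:
            consensus.append("N")  # no usable base observed at this position
            continue
        kept = frozenset(b for b, n in c.items() if n / total >= min_frac)
        consensus.append(IUPAC.get(kept, "N"))
    return "".join(consensus)


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python prefix_consensus.py reads.fasta
    print(prefix_consensus(read_fasta(sys.argv[1]), length=25))
```

Where the consensus stops being clean and turns into a run of highly ambiguous codes, the primer has probably ended and the variable part of the amplicon has begun. For the reverse primer, the same function could be run on the read 2 file, or on the reverse complement of the read ends if the reads are merged.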

modified 9 months ago by Friederike4.3k • written 9 months ago by c.older

I am not an expert, but the Trim Galore documentation says that it detects and removes adapters automatically and can then run FastQC. You need to have Cutadapt and FastQC installed to be able to run it.

written 9 months ago by piyushjo110

If you have paired-end reads, and enough of them have inserts shorter than the read length, you can do this with BBMerge from BBTools: bbmerge.sh in1=r1.fq in2=r2.fq outa=adapters.fa

The outa file will contain the adapter sequences.

written 9 months ago by genomax68k
9 months ago
United States
Friederike wrote:

Have you tried cutadapt? Sounds like it should be up to the job.
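Once a candidate primer is known, removing it amounts to an anchored match at the 5' end of each read. The Python sketch below shows that idea only; it is not how cutadapt is implemented. It counts substitutions but does not handle insertions or deletions, which cutadapt allows by default. The function name `trim_5prime_primer` and the `max_mismatches` default are invented for the example, and the primer shown is the widely used 515F V4 primer, given purely as an illustrative value that may not be the one in this data set.

```python
# Bases that each IUPAC code in a primer is allowed to match
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def trim_5prime_primer(read, primer, max_mismatches=2):
    """Return the read with the primer removed if the read starts with it.

    Returns None when the read is shorter than the primer or when the
    number of mismatching positions exceeds `max_mismatches`.
    """
    if len(read) < len(primer):
        return None
    mismatches = sum(
        base not in IUPAC_SETS[code]
        for base, code in zip(read.upper(), primer.upper())
    )
    if mismatches > max_mismatches:
        return None
    return read[len(primer):]


if __name__ == "__main__":
    # Illustrative primer only (515F); substitute the sequence found in your data
    primer = "GTGCCAGCMGCCGCGGTAA"
    read = "GTGCCAGCAGCCGCGGTAATACGGAGG"
    print(trim_5prime_primer(read, primer))
```

Reads for which the function returns None did not start with the primer and are worth inspecting separately before deciding whether to discard them.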

written 9 months ago by Friederike4.3k
