Clip Adapters In Sanger Sequencing Traces
9.0 years ago
Alice ▴ 300

Hello biostars! I downloaded FASTA files from http://www.ncbi.nlm.nih.gov/Traces/trace.cgi (mouse genome traces). There are also files with a 'clip' prefix; I'm not sure, but do they describe primers/adapters? I can't find any documentation about these files. I want to clip and trim my traces according to the coordinates from the 'clip' files. After googling, I didn't find any tool for that; all the tools I found are for trimming NGS data. My question: is there a tool for this kind of clipping, or do I need to write my own script? I'm a newbie in programming (a beginner in Python) and have absolutely no idea how to write such a script.

Summary: I have a 'clip' file, which looks like this:

TI    CLIP_LEFT    CLIP_RIGHT
1101188317    0    576
1101188318    19    734
1101188319    6    742
1101188320    16    809


And a 'trace' file, which is plain FASTA:

>1101188317
ATGCAT...all reads are ~1660 b.p. long.
>1101188318
...
>1101188319
...
>1101188320
...


The problem is the following:

• I don't understand the numbers in the clip file (e.g. CLIP_RIGHT is the right coordinate of what?).
• It's not clear to me what 'clip' means.
• If the numbers are something like adapter coordinates, I need to trim the sequences in the FASTA file accordingly.
• All the tools are for NGS data, but these datasets are from Sanger sequencing, so I don't know the adapter sequence; I only know the coordinates (if these numbers are coordinates).
fasta • 4.2k views
0

I am not sure about your first question. For the second question, there are many tools available for clipping adapters. Use the search box in Biostar and search for 'clip adapters' or 'remove adapters'. You will find a lot of informative posts.

0

Is NGS clipping different from clipping Sanger sequencing traces? I think you can use the same tools, like cutadapt or fastx, to clip your traces.

0

I edited my question. As I understand NGS clipping, the tools need adapter sequences. But in the case of traces I have only their coordinates (if these numbers are coordinates).

1

You can use string slicing (substring extraction) in Python.

1) Read each FASTA sequence into a string.

2) Extract the desired substring. If x stores the sequence, then x[y:z] should give the clipped sequence, where y and z are the left and right coordinates that you already have in the other file.
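
The two steps above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the function names (read_clip_table, read_fasta, clip_reads) are made up for illustration, the clip file is assumed to be whitespace-separated with a header row as in the question, and the coordinates are assumed to work directly as Python slice bounds (x[y:z]). If the coordinates turn out to be 1-based and inclusive, the slice would need to be seq[left - 1:right] instead, so it is worth checking a few reads by eye first.

```python
import io


def read_clip_table(handle):
    """Parse a whitespace-separated clip table into {TI: (left, right)}."""
    clips = {}
    for line in handle:
        fields = line.split()
        # Skip blank lines and the header row (TI CLIP_LEFT CLIP_RIGHT).
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        clips[fields[0]] = (int(fields[1]), int(fields[2]))
    return clips


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the identifier.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def clip_reads(fasta_handle, clip_handle):
    """Yield (identifier, clipped sequence) using x[left:right] slicing."""
    clips = read_clip_table(clip_handle)
    for name, seq in read_fasta(fasta_handle):
        # Assumption: reads without a clip entry are kept unchanged.
        left, right = clips.get(name, (0, len(seq)))
        yield name, seq[left:right]


if __name__ == "__main__":
    # Tiny made-up example; real input would come from open("...") calls.
    clip_text = "TI CLIP_LEFT CLIP_RIGHT\n1 2 6\n2 0 3\n"
    fasta_text = ">1\nAACCGG\nTT\n>2\nACGT\n>3\nGGG\n"
    for name, seq in clip_reads(io.StringIO(fasta_text), io.StringIO(clip_text)):
        print(f">{name}\n{seq}")
```

With real data, the two io.StringIO objects would be replaced by the opened trace and clip files, and the printed output redirected to a new FASTA file.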

0

Thank you! I will try it.