Python Script to map reads to reference sequence
3.4 years ago
Fid_o ▴ 40

I have three sequence reads (1, 2 and 3) that I want to map to a reference sequence, L2_cat.fasta. I must use a Python script to map those reads to the reference iteratively. I have used mapping programs before, but I am not conversant with writing Python scripts, although I have basic knowledge of Python programming.

sequence python mapping script reference sequence • 2.4k views

Is this an assignment? Why must you use python? There's really no reason to add a layer of complexity with a python wrapper over normal CLI programs for something like this.

It was given to me as an exercise to make me do some research. I am not in a formal school; I am learning with the intent of moving into bioinformatics. I am a wet lab technician.

You mean you were given this task to solve, and told to use python to do it, so you could learn some python?

True, you could learn something about Python at a very basic level by doing this, but essentially you'd just be leaning on the subprocess module. It adds a 'middle man' layer of complexity that the task doesn't need.

There are many other jobs you'll encounter that are properly solved with something like Python, but this really isn't one of them.
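
For reference, the wrapper approach described above might look roughly like the sketch below. It assumes Bowtie2 is installed and on the PATH, that the reads are FASTQ files, and that they are named read1.fastq, read2.fastq and read3.fastq; the read file names and the index prefix are hypothetical and not taken from the question.

```python
import subprocess
from pathlib import Path


def build_commands(reference, reads, index_prefix="L2_cat_index"):
    """Return the command lines for indexing the reference and mapping each read file.

    The index prefix and the output naming scheme are illustrative assumptions.
    """
    # Step 1: build the Bowtie2 index from the reference FASTA.
    commands = [["bowtie2-build", str(reference), index_prefix]]
    # Step 2: one mapping command per read file, each writing its own SAM file.
    for read_file in reads:
        sam_out = Path(read_file).with_suffix(".sam").name
        commands.append(
            ["bowtie2", "-x", index_prefix, "-U", str(read_file), "-S", sam_out]
        )
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical read file names; replace them with the real ones.
    cmds = build_commands(
        "L2_cat.fasta", ["read1.fastq", "read2.fastq", "read3.fastq"]
    )
    for cmd in cmds:
        print(" ".join(cmd))
    # run_all(cmds)  # uncomment once bowtie2 is installed
```

All the Python does here is loop over the files and shell out, which is the 'middle man' point: a three-line shell loop would do the same job.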

Looks like there are several hoops you would need to jump through to do this exercise.

Are you expected to use Python to do the actual alignment, or are you going to use Python as a simple wrapper around a proper external alignment program? What kind of sequences are you working with, and what is in L2_cat.fasta?

As @jrj.healey noted above, if you are going to do this as a simple string-matching exercise, that would be one thing, but anything else would be a complex task.
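
To make the "simple string matching" case concrete, here is a minimal sketch that uses only built-in Python. It assumes the reads must match the reference exactly (no mismatches or gaps) and that both are plain DNA strings; the function names are made up for illustration.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict of {header: sequence}."""
    records, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                name, chunks = line[1:].strip(), []
            else:
                chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_exact(read, reference):
    """Return the 0-based start of every exact occurrence (overlaps allowed)."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def map_reads(reads, reference):
    """Map each read to the reference on both strands by exact matching."""
    hits = {}
    for name, seq in reads.items():
        hits[name] = {
            "+": find_exact(seq, reference),
            "-": find_exact(reverse_complement(seq), reference),
        }
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTTGA"  # toy data, not the real L2_cat.fasta
    reads = {"read1": "ACGT", "read2": "AACG"}
    for name, strands in map_reads(reads, reference).items():
        print(name, strands)
```

A single sequencing error or variant makes an exact search return nothing, which is why anything beyond this quickly turns into a real alignment problem.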

@genomax, I am supposed to use the code to do the actual mapping, not as a wrapper. In simple terms, I have been given three sequences and asked to map them to a reference sequence in L2_cat.fasta (this is the name of the file that contains the reference sequence). I tried to find out how to do this from Python programming books, but I must admit I found it extremely difficult, so I thought I would ask experts who have worked with these things for a long time.

I see. Are they expecting you to code the algorithm from scratch, or just use native python functions to do it?

How long are the sequences you've been given?

@jrj.healey Use native Python functions. The reads/sequences are a million bases long.

I really don't know how well Python is going to cope with trying to align a million characters. The closest pre-existing Python method would be Biopython's pairwise2, but I don't fancy its chances with strings that long.
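
To illustrate the scaling concern, below is a minimal pure-Python sketch of dynamic-programming local alignment (Smith-Waterman scoring with a linear gap penalty), the general family of method that pairwise2 belongs to. The scoring values are arbitrary assumptions, and it returns only the best score rather than the alignment itself. Assuming the reference is of comparable length to the reads, the table for a million-base read has on the order of 10^12 cells.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a against b.

    Only two rows of the table are kept, so memory is O(len(b)),
    but the running time is still O(len(a) * len(b)).
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best = curr[j]
        prev = curr
    return best


def dp_cells(read_len, ref_len):
    """Number of table cells a full dynamic-programming alignment must fill."""
    return read_len * ref_len


if __name__ == "__main__":
    print(local_alignment_score("ACGT", "TTACGTTT"))
    print(dp_cells(10**6, 10**6))
```

Each cell costs several Python operations, so even a fast interpreter loop would need a very long time for inputs of that size; dedicated mappers avoid the full table by indexing the reference first.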

This problem is not something one would assign to a novice. It sounds rather odd.

Like genomax said, this sort of task makes no sense. If someone truly wanted you to learn bioinformatics, they would not ask you to solve a complex, high-volume-processing problem that has already been solved dozens of times using a tool that is the opposite of what you should be using (unless they mean for you to use Cython in which case they're being ridiculous). Honestly, ask them why they want you to reinvent the wheel and unless they have very good reason, find someone else to teach you.

Why don't you use Bowtie2 or a similar tool for this?

"3: Only write code if you absolutely have to" https://simpleprogrammer.com/11-rules-all-programmers-should-live-by/

@flogin I have been asked to write code and not to use any existing program. I am an informal student who wants to get into bioinformatics. I work in a wet lab, and the technician who is helping me with coding set me this task.

Ah OK, so... good luck!
