Question: Python Script to map reads to reference sequence
4 weeks ago, Fido wrote:

I have sequence reads 1, 2 and 3. I want to map them to a reference sequence, L2_cat.fasta. I must use a Python script to iteratively map those reads to the reference sequence. I have used programs to map sequences before, but I am not conversant with writing Python scripts, though I have basic knowledge of Python programming.

written 4 weeks ago by Fido

Is this an assignment? Why must you use python? There's really no reason to add a layer of complexity with a python wrapper over normal CLI programs for something like this.

written 4 weeks ago by jrj.healey

It was given as a tool to make me research. I am not in a formal school, I am learning with the intent to join Bioinformatics. I am a Wet Lab Technician.

written 4 weeks ago by Fido

You mean you were given this task to solve, and told to use python to do it, so you could learn some python?

True, you could stand to learn something about python at a very basic level by doing this, but essentially you'd just be leaning on the subprocess module. It is adding a 'middle man' layer of complexity that the task completely doesn't need.

There are many other jobs you'll encounter which are solved properly with something like python, but this really isn't one of them.
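For reference, the "middle man" wrapper described above might look roughly like the sketch below. It assumes Bowtie2 is installed and on the PATH, that the reads are unpaired FASTA files, and that they are named `read1.fasta` to `read3.fasta`; those file names and the index prefix `L2_cat_index` are made up for illustration. Only the standard `bowtie2-build` call and the `-f`, `-x`, `-U` and `-S` options of `bowtie2` are used.

```python
import subprocess
from pathlib import Path


def build_bowtie2_command(index_prefix, reads_path, sam_path, reads_are_fasta=True):
    """Return the argument list for one bowtie2 run on unpaired reads."""
    cmd = ["bowtie2", "-x", str(index_prefix), "-U", str(reads_path), "-S", str(sam_path)]
    if reads_are_fasta:
        # -f tells bowtie2 that the reads are FASTA rather than FASTQ.
        cmd.insert(1, "-f")
    return cmd


def map_all(reference, read_files, index_prefix="L2_cat_index", dry_run=False):
    """Index the reference once, then map each read file in turn.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = [["bowtie2-build", str(reference), index_prefix]]
    for reads in read_files:
        sam = Path(reads).with_suffix(".sam")  # read1.fasta -> read1.sam
        commands.append(build_bowtie2_command(index_prefix, reads, sam))
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the tool exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical read file names; dry_run only prints the commands.
    reads = ["read1.fasta", "read2.fasta", "read3.fasta"]
    for cmd in map_all("L2_cat.fasta", reads, dry_run=True):
        print(" ".join(cmd))
```

All the Python does here is assemble and launch shell commands, which is why a three-line shell loop would do the same job.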

modified 4 weeks ago • written 4 weeks ago by jrj.healey

Looks like there are several hoops you would need to jump through to do this exercise.

Are you expected to use python to do the actual alignment, or are you going to use python as a simple wrapper for a proper external alignment program? What kind of sequences are you working with, and what is in L2_cat.fasta?

As @jrj.healey noted above, if you are going to do this as a simple string-matching exercise, that would be one thing, but anything else would be a complex task.
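To make the "simple string matching" case concrete, here is a minimal sketch using only built-in string methods. It assumes the reads occur in the reference exactly (no mismatches or gaps), which real sequencing reads usually do not, and it only looks at the first record of the FASTA file. All function names are made up for illustration.

```python
def read_fasta(path):
    """Return the first sequence in a FASTA file as one uppercase string."""
    parts = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if parts:
                    break  # a second record starts here; stop after the first
                continue
            parts.append(line.upper())
    return "".join(parts)


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_exact_hits(reference, read):
    """Return every 0-based start position where read occurs in reference."""
    hits = []
    if not read:
        return hits
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Resume one base further on so overlapping hits are also found.
        start = reference.find(read, start + 1)
    return hits


def map_reads(reference, reads):
    """Map each named read on the forward (+) and reverse (-) strand."""
    results = {}
    for name, seq in reads.items():
        results[name] = {
            "+": find_exact_hits(reference, seq),
            "-": find_exact_hits(reference, reverse_complement(seq)),
        }
    return results
```

A call such as `map_reads(read_fasta("L2_cat.fasta"), {"read1": "ACGT"})` returns a dictionary of hit positions per strand. A read with even a single mismatch gets no hit at all, which is where the complex part of the task begins.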

modified 4 weeks ago • written 4 weeks ago by genomax

@genomax, I am supposed to use the code to do the actual mapping and not as a wrapper. In simple terms, I am given three sequences, which I am asked to map to a reference sequence in L2_cat.fasta (this is the name of the file that contains the reference sequence). I tried to find out from Python programming books how I can do this, but I must admit I found it extremely difficult, so I thought I could at last ask experts who have worked with these things for a long time how to do it.

written 4 weeks ago by Fido

I see. Are they expecting you to code the algorithm from scratch, or just use native python functions to do it?

How long are the sequences you've been given?

written 4 weeks ago by jrj.healey

@jrj.healey Just use native python functions. The reads/sequences are a million bases long.

written 4 weeks ago by Fido

I really don't know how well python is going to cope with trying to align a million characters. The closest pre-existing python method will be Biopython's pairwise2 module, but I don't fancy its chances with strings that long.
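A rough illustration of why the length is the problem: classic dynamic-programming alignment fills one cell for every pair of positions, so two million-base sequences need about 10^12 cells. The sketch below is a plain-Python Smith-Waterman local alignment score, not what pairwise2 does internally, and the scoring values (match 2, mismatch -1, linear gap -2) are arbitrary assumptions. It keeps only two rows in memory, but the running time is still proportional to the product of the two lengths.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a against b.

    Only two rows of the DP matrix are kept, so memory is O(len(b)),
    but every one of the len(a) * len(b) cells is still computed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


def dp_cells(len_a, len_b):
    """Number of DP cells a full pairwise alignment has to fill."""
    return len_a * len_b


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
    print(dp_cells(10**6, 10**6))
```

This is fine for sequences of a few thousand bases. Dedicated mappers avoid the quadratic cost by indexing the reference and only aligning around candidate locations.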

written 4 weeks ago by jrj.healey

This problem is not something one would assign a novice. Sounds rather odd.

written 4 weeks ago by genomax

Like genomax said, this sort of task makes no sense. If someone truly wanted you to learn bioinformatics, they would not ask you to solve a complex, high-volume-processing problem that has already been solved dozens of times using a tool that is the opposite of what you should be using (unless they mean for you to use Cython in which case they're being ridiculous). Honestly, ask them why they want you to reinvent the wheel and unless they have very good reason, find someone else to teach you.

written 4 weeks ago by RamRS

Why don't you use Bowtie2 or a similar tool for this?

"3: Only write code if you absolutely have to" https://simpleprogrammer.com/11-rules-all-programmers-should-live-by/

written 4 weeks ago by flogin

@flogin I am asked to use code and not any program. I am an informal student who wants to join bioinformatics. I work in a Wet Lab, and the technician who is helping me with coding asked me to do it that way.

written 4 weeks ago by Fido

Ah OK, so... good luck!

written 4 weeks ago by flogin