gene lists and fasta files
9.0 years ago
kxd419 ▴ 10

Hello,

I have been trying to learn Python to help with some tasks:

I have a list of gene names in one file and a large set of FASTA sequences in another file.

I need to pull out the gene name and the sequence from the FASTA file whenever the record's name also appears in the gene name list.

Can anyone help me with this code?

Kind regards

gene blast sequencing genome alignment • 2.6k views

Can you show us what you have tried so far?

9.0 years ago
Ram 43k

Use regular file I/O for the file with the names. Use Bio.SeqIO (from Biopython) for the FASTA file. Read all names into a list. For each record read from the FASTA file, compare an appropriate attribute of the record (for example, its ID) to the names, either by checking for equality or by using regular expressions. If a match is found, print the record and move on to the next FASTA record. If you've scanned the entire list of names and did not find a match, move on to the next record.

The approach above is vague on purpose. Also, you can optimize the approach in multiple ways, but this is one of the more basic approaches.
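For readers who want something concrete, here is a minimal sketch of those steps. It is not the answerer's code. To keep it runnable without Biopython installed, it swaps in a small hand-written FASTA reader where the answer suggests Bio.SeqIO. The function names (`read_gene_names`, `parse_fasta`, `extract_matching`) are hypothetical, and it assumes one gene name per line and that the gene name is the first whitespace-delimited token of each FASTA header. It also uses a set for the names, which is one of the optimizations hinted at above.

```python
import io


def read_gene_names(handle):
    """Return the set of non-empty, stripped lines (assumed: one name per line)."""
    return {line.strip() for line in handle if line.strip()}


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip()
        if line.startswith(">"):
            # A new header starts; emit the record collected so far, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_matching(names_handle, fasta_handle):
    """Return the FASTA records whose ID appears in the gene name list."""
    wanted = read_gene_names(names_handle)
    matches = []
    for header, sequence in parse_fasta(fasta_handle):
        # Assumption: the record ID is the first token of the header line.
        tokens = header.split()
        record_id = tokens[0] if tokens else ""
        if record_id in wanted:  # set lookup, no inner loop over names
            matches.append((header, sequence))
    return matches


if __name__ == "__main__":
    # Small in-memory stand-ins for the two input files.
    names = io.StringIO("geneA\ngeneC\n")
    fasta = io.StringIO(
        ">geneA some description\nATGC\nGGTA\n>geneB\nTTTT\n>geneC\nCCCC\n"
    )
    for header, sequence in extract_matching(names, fasta):
        print(f">{header}\n{sequence}")
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` handles. If Biopython is available, `parse_fasta` could be replaced by iterating over `Bio.SeqIO.parse(handle, "fasta")` and comparing `record.id` against the set of names.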
