Reading a multi-FASTA, slicing sequences and outputting several different files
8.0 years ago
djd17 • 0

Hello, I am running a simple script with Biopython that slices a nucleotide sequence into 40-nucleotide chunks:

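sequence = 'AAAGGG...some sequence'

# Start positions 100, 200, 300, 400 (range excludes the stop value 500)
x = range(100, 500, 100)

# 'with' closes the output file even if a write fails
with open('_SLQuery.fasta', 'w') as export_file:
    for i in x:
        name = 'query%d' % i
        query = sequence[i:i+40]  # 40-nt chunk starting at position i
        export_file.write('>' + name + '\n' + query + '\n\n')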

Currently I have to enter each sequence from a multi-FASTA file (500+ records) manually, one at a time, and specify a new output file name each time. Any ideas how I could get this script to read the whole FASTA file at once and export the queries for each record to a different output file (23_SLQuery.fasta, 74_SLQuery.fasta, etc., where 23 and 74 are the record IDs)? I have tried SeqIO.parse, but it still only returns one sequence, and I could not figure it out using SeqIO.index. Any help would be appreciated, thanks.

sequence bio python • 2.5k views

Thanks, but I am not trying to split the original FASTA file. I want to read a multi-FASTA file (contig 1, 2, ... 501, etc.) and, for each sequence in the file, run the script to slice it into 40-nt chunks from 100 upstream to 500 upstream, then write those query sequences to a separate FASTA file for each sequence in the original file. I apologize if I am missing something in your answer.
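For what it's worth, here is a rough sketch of the per-record loop described above, using only the standard library so it runs without Biopython. The helper names `read_fasta` and `write_queries`, the `out_dir` parameter, and the choice to skip empty slices are my own assumptions for illustration, not anything from the original script. The record ID is taken as the first whitespace-delimited token of the header line.

```python
import os


def read_fasta(path):
    """Yield (record_id, sequence) for each record in a FASTA file.

    Hypothetical helper: a minimal reader, not a full FASTA parser.
    """
    record_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any
                if record_id is not None:
                    yield record_id, "".join(chunks)
                header = line[1:].split()
                record_id = header[0] if header else "unnamed"
                chunks = []
            else:
                chunks.append(line)
    # Emit the last record in the file
    if record_id is not None:
        yield record_id, "".join(chunks)


def write_queries(fasta_path, out_dir=".", starts=range(100, 500, 100), length=40):
    """Write one <id>_SLQuery.fasta per input record; return the paths written."""
    written = []
    for record_id, sequence in read_fasta(fasta_path):
        # The output file is opened inside the loop, once per record
        out_path = os.path.join(out_dir, "%s_SLQuery.fasta" % record_id)
        with open(out_path, "w") as export_file:
            for i in starts:
                query = sequence[i:i + length]
                if not query:
                    # Slicing past the end gives '' rather than an error,
                    # so short contigs are skipped here (an assumption)
                    continue
                export_file.write(">query%d\n%s\n" % (i, query))
        written.append(out_path)
    return written
```

Calling `write_queries("contigs.fasta")` (the file name is just a placeholder) should produce files such as 23_SLQuery.fasta and 74_SLQuery.fasta in the current directory. If Biopython is available, looping with `for record in SeqIO.parse(path, "fasta")` and using `record.id` and `str(record.seq)` should be able to stand in for `read_fasta`. `SeqIO.parse` returns an iterator, so the slicing and the `open()` call both need to sit inside that loop; doing them after the loop may be why only one sequence came out. Record IDs containing characters such as `/` or `|` would need cleaning before being used as file names.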


I see. Should it look like this post-solution?

Python: Slicing Sequences In Fasta File
