Question: Extract several FASTA sequences using a list of IDs (in order)
Darill wrote:

I have a file with several sequence names, such as:

seq1 seq9 seq3 seq7 seq5 seqi seqn....

I also have a FASTA file containing all my sequences. I need to reorder those sequences to follow the order of the list above, like this:

>seq1
aaaaa
>seq9
aaaaa
>seq3
aaaaa
>seq7
aaaaa
>seq5
aaaaa
...

I tried this:

from Bio import SeqIO

input_file = open('concatenate_0035_0042_aa2.fa','r')
output_file = open('result.fasta','a')

liste = ['seq1', 'seq5', 'seq8']  # etc.
print(len(liste))
compteur=1
for i in liste:
    record_dict = SeqIO.to_dict(SeqIO.parse("concatenate_0035_0042_aa2.fa", "fasta"))
    print(">",record_dict[i].id,file=output_file,sep="")
    print(record_dict[i].seq,file=output_file)
    compteur+=1
    print(compteur)

output_file.close()
input_file.close()

but it takes far too long to run.

bio python fasta • 246 views

There are many solutions, and as Pierre said, an even larger number of previous posts asking the same question. You can use samtools faidx, as described in my answer here.

written 12 months ago by h.mon
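
For anyone who wants to try the samtools faidx route from Python, here is a minimal sketch. It assumes samtools is installed and on the PATH. The helper names build_faidx_command and extract_in_order are hypothetical, and the file names in the usage comment are placeholders. It relies on samtools faidx accepting sequence names as positional arguments after the FASTA file and printing them in the order given.

```python
import subprocess


def build_faidx_command(fasta_path, ids):
    """Return the argument list for: samtools faidx <fasta> <id1> <id2> ...

    Hypothetical helper. The IDs are passed as positional region
    arguments, in the order they should appear in the output.
    """
    return ["samtools", "faidx", fasta_path, *ids]


def extract_in_order(fasta_path, ids, out_path):
    """Run samtools faidx and write its output to out_path.

    Assumes samtools is installed and on the PATH. check=True raises
    CalledProcessError if samtools exits with an error.
    """
    with open(out_path, "w") as out:
        subprocess.run(build_faidx_command(fasta_path, ids), stdout=out, check=True)


# Usage (placeholder file names, not run here):
# extract_in_order("all_sequences.fa", ["seq1", "seq9", "seq3"], "result.fasta")
```

With a very long ID list the command line may exceed the operating system's argument length limit, so it may be necessary to call this in batches.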

Thank you for your help.

written 12 months ago by Darill

This question has been asked many times here; please search for it. Nevertheless, regarding your code: instead of re-reading the FASTA file once for every entry in liste, how about scanning the FASTA file only once and checking whether the current sequence name is in your liste?

written 12 months ago by Pierre Lindenbaum
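
The single-scan idea suggested above can be sketched in plain Python without Biopython. This is only an illustration under a few assumptions: the function names read_fasta and extract_in_list_order are hypothetical, the sequence ID is taken to be the first word of each header line, and the matched records are small enough to hold in memory. Because a single scan meets the records in file order, the matches are stored first and then emitted in the order of liste.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the ID is the first word of the header line.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_in_list_order(lines, liste):
    """Scan the FASTA once, then return matches in the order of liste."""
    wanted = set(liste)  # set membership tests are fast
    found = {}
    for name, seq in read_fasta(lines):
        if name in wanted:
            found[name] = seq
    # IDs that are missing from the FASTA are silently skipped.
    return [(name, found[name]) for name in liste if name in found]


# Small in-memory example; a real run would pass an open file instead.
example = """>seq3
cccc
>seq1
aaaa
aaaa
>seq9
gggg
""".splitlines()

for name, seq in extract_in_list_order(example, ["seq1", "seq9", "seq3", "seq7"]):
    print(f">{name}")
    print(seq)
```

Here seq7 is in the list but not in the example data, so it is skipped, and the remaining records come out as seq1, seq9, seq3.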

OK, thank you, now it works :)

from Bio import SeqIO

# liste is the ordered list of IDs defined in the question above.
# Parse the FASTA file only once and index the records by ID.
record_dict = SeqIO.to_dict(SeqIO.parse("concatenate_0035_0042_aa2.fa", "fasta"))

# Write the records in the order given by liste, skipping IDs not in the file.
with open('result.fasta', 'a') as output_file:
    for i in liste:
        if i in record_dict:
            print(">", record_dict[i].id, file=output_file, sep="")
            print(record_dict[i].seq, file=output_file)

written 12 months ago by Darill