Question: extract several FASTA records with a list of IDs (in order)
Darill wrote, 21 months ago:

I have a file with several sequence names, such as:

seq1 seq9 seq3 seq7 seq5 seqi seqn....

and another FASTA file with all my sequences. What I need to do is write out my sequences in the order of the list above, such as:

>seq1
aaaaa
>seq9
aaaaa
>seq3
aaaaa
>seq7
aaaaa
>seq5
aaaaa
...

I tried this:

input_file = open('concatenate_0035_0042_aa2.fa','r')
output_file = open('result.fasta','a')


liste=['seq1','seq5','seq8' etc]
print(len(liste))
compteur=1
for i in liste:
    record_dict = SeqIO.to_dict(SeqIO.parse("concatenate_0035_0042_aa2.fa", "fasta"))
    print(">",record_dict[i].id,file=output_file,sep="")
    print(record_dict[i].seq,file=output_file)
    compteur+=1
    print(compteur)

output_file.close()
input_file.close()

But it takes far too much time.

bio python fasta • 331 views

There are many solutions and, as Pierre said, even more previous posts with the same question. You may use samtools faidx, as per my answer here.

written 21 months ago by h.mon

Thank you for your help.

written 21 months ago by Darill

This question has been asked many times here; please search for it. Nevertheless, regarding your code: instead of parsing the FASTA file once per entry in liste, how about scanning it only one time and checking whether the current FASTA name is in your liste?
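The single-pass idea can be sketched as follows. This is a minimal illustration in plain Python rather than Biopython, and the helper names read_fasta and extract_in_order are hypothetical, not taken from the thread. It assumes each record's ID is the first word after ">", that the selected sequences fit in memory, and that if an ID occurs twice in the FASTA file the later record wins.

```python
def read_fasta(path):
    """Yield (identifier, sequence) pairs from a FASTA file, one record at a time."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Assumption: the ID is the first whitespace-delimited word after ">"
                fields = line[1:].split()
                name, chunks = (fields[0] if fields else ""), []
            elif name is not None:
                chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def extract_in_order(fasta_path, wanted_ids, out_path):
    """Scan the FASTA file once, then write the wanted records in list order.

    Returns the IDs from wanted_ids that were not found in the FASTA file.
    """
    wanted = set(wanted_ids)  # set membership tests are fast
    found = {}
    for name, seq in read_fasta(fasta_path):
        if name in wanted:
            found[name] = seq  # keep only the records that were asked for
    with open(out_path, "w") as out:
        for name in wanted_ids:  # iterate the list to preserve its order
            if name in found:
                out.write(f">{name}\n{found[name]}\n")
    return [name for name in wanted_ids if name not in found]
```

With the names from the question, a call would look like extract_in_order("concatenate_0035_0042_aa2.fa", liste, "result.fasta"). Each sequence is written on a single line, and the returned list shows which requested IDs were absent.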

written 21 months ago by Pierre Lindenbaum

OK, thank you, now it works :)

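from Bio import SeqIO

# liste is the list of IDs in the desired output order, defined as in the question.
# Parse the FASTA file into a dictionary once, outside the loop.
record_dict = SeqIO.to_dict(SeqIO.parse("concatenate_0035_0042_aa2.fa", "fasta"))

# The file is opened in append mode, as in the original attempt.
with open('result.fasta', 'a') as output_file:
    for i in liste:
        # IDs that are missing from the FASTA file are skipped silently.
        if i in record_dict:
            print(">", record_dict[i].id, file=output_file, sep="")
            print(record_dict[i].seq, file=output_file)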
written 21 months ago by Darill
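The working version above takes liste as given and skips IDs that are not in the FASTA file without saying so. As a small sketch, assuming the names file from the question is whitespace-separated, the list could be read from that file and the missing IDs reported. The helper names read_id_list and missing_ids are hypothetical.

```python
def read_id_list(path):
    """Return the IDs from a whitespace-separated names file, in file order."""
    with open(path) as handle:
        # split() with no argument splits on any run of spaces, tabs, or newlines
        return handle.read().split()


def missing_ids(wanted_ids, available_ids):
    """Return the wanted IDs that are absent from available_ids, in list order."""
    available = set(available_ids)
    return [name for name in wanted_ids if name not in available]
```

For example, liste = read_id_list("names.txt") (the file name is an assumption) would replace the hard-coded list, and missing_ids(liste, record_dict) would list the IDs that the loop skipped, since iterating a dictionary yields its keys.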