Question: is there any solution to convert a multifasta file to csv?
projetoic • 0 wrote:
I am trying to convert a multi-FASTA file (a FASTA file with many sequences) to a CSV table, but I got an error.
Is there any solution to convert a FASTA file to CSV? So far I have tried:
from Bio import SeqIO
# Print the full header and the id of every record, tab-separated
for re in SeqIO.parse('CHIKV1-X-gb_AB455493.fasta', 'fasta'):
    print('>{}\t{}'.format(re.description, re.id))
and
import fastatocsv
fastatocsv.converter.convert("CHIKV1-X-gb_AB455493.fasta","zikanovo.csv")
So far, I have only found solutions that return two columns (just the id and the sequence), or that return the whole header as a single field.
Expected output:
column1 column2..
gb:AB455493 Organism:Chikungunya virus Strain Name:SL11131 Segment:nul Host:Human AATGG
gb:AB455493 Organism:Chikungunya virus Strain Name:SL11131 Segment:nul Host:Human AATGG
gb:AB455493 Organism:Chikungunya virus Strain Name:SL11131 Segment:nul Host:Human AATGG
myseq:
>gb:KX262887|Organism:Zika virus|Strain Name:103451|Segment:null|Subtype:Asian|Host:Human
GTTGTTGATCTGTGTGAATCAGACTGCGACAGTTCGAGTTTGAAGCGAAAGCTAGCAACAGTATCAACAG
GTTTTATTTTGGATTTGGAAACGAGAGTTTCTGGTCATGAAAAACCCAAAAAAGAAATCCGGAGGATTCC
>gb:KX262887|Organism:Zika virus|Strain Name:103451|Segment:null|Subtype:Asian|Host:Human
GTTGTTGATCTGTGTGAATCAGACTGCGACAGTTCGAGTTTGAAGCGAAAGCTAGCAACAGTATCAACAG
GTTTTATTTTGGATTTGGAAACGAGAGTTTCTGGTCATGAAAAACCCAAAAAAGAAATCCGGAGGATTCC
>gb:KX262887|Organism:Zika virus|Strain Name:103451|Segment:null|Subtype:Asian|Host:Human
GTTGTTGATCTGTGTGAATCAGACTGCGACAGTTCGAGTTTGAAGCGAAAGCTAGCAACAGTATCAACAG
GTTTTATTTTGGATTTGGAAACGAGAGTTTCTGGTCATGAAAAACCCAAAAAAGAAATCCGGAGGATTCC
modified 4 weeks ago by Mensur Dlakic • 9.0k • written 4 weeks ago by projetoic • 0
What on earth is a multi-folder file? Given that your header is in a custom format, you're going to have to parse it using custom code.
Multifasta: a FASTA file with many sequences. Sorry, I'm going to edit the question. Is there any solution?
You're halfway there. re.description should give you the whole header (re.id is only the part of it before the first whitespace), so make sure of that. Once you have it, you can split it by | and then split each element by : into key-value pairs. Then, these key-value pairs, along with re.seq, will give you the columns for each re. Write these attributes separated by , and you'll have your CSV.
Thanks!!!
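The approach suggested in the answer above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the function names (read_fasta, header_to_fields, fasta_to_csv) and the "sequence" column name are hypothetical, every header is assumed to follow the key:value|key:value pattern shown in the question, and a small pure-Python FASTA reader is used so the example runs without extra packages. With Biopython, the (header, sequence) pairs could instead come from re.description and str(re.seq) for each record returned by SeqIO.parse.

```python
import csv
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def header_to_fields(header):
    """Split 'key:value|key:value|...' into a dict (each item split on its first ':')."""
    fields = {}
    for item in header.split("|"):
        key, _, value = item.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def fasta_to_csv(in_handle, out_handle):
    """Write one CSV row per record: header fields as columns, sequence last."""
    rows, columns = [], []
    for header, seq in read_fasta(in_handle):
        row = header_to_fields(header)
        row["sequence"] = seq  # assumes no header key is itself named 'sequence'
        rows.append(row)
        for key in row:
            if key not in columns:
                columns.append(key)
    # Keep the sequence as the final column; missing fields are left empty.
    columns = [c for c in columns if c != "sequence"] + ["sequence"]
    writer = csv.DictWriter(out_handle, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Small in-memory demonstration using a shortened record from the question.
example = (
    ">gb:KX262887|Organism:Zika virus|Strain Name:103451|"
    "Segment:null|Subtype:Asian|Host:Human\n"
    "GTTGTTGATCTGTGTGAATC\n"
    "AGACTGCGACAGTTCGAGTT\n"
)
out = io.StringIO()
fasta_to_csv(io.StringIO(example), out)
print(out.getvalue())
```

Here the keys (gb, Organism, Strain Name, ...) become the column names and only the values go into the cells. If you would rather keep the key:value text in each cell, as in the expected output in the question, you could write the items from header.split("|") directly with csv.writer instead. For real files, pass open file handles, opening the output with newline="" as the csv module documentation recommends.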