Question: How To Convert A Plain Sequence In A Perl String Variable To Fasta Format
Curious Mind wrote:

Hi,

I have a couple of plain sequences (without any formatting or annotation) saved in an SQLite3 database. After reading them into Perl strings, I need to convert them to FASTA format using Perl. I see methods in Bio::SeqIO for converting one file format to another, but my sequences are in Perl string variables, not in files.

Thanks

perl fasta bioperl • 2.9k views
modified 5.9 years ago by Jelena Aleksic • written 5.9 years ago by Curious Mind
Jelena Aleksic wrote:

I don't see why a library is necessarily helpful here. FASTA format is very simple, so all you need to do is come up with some IDs, then print something like: print ">$NewId\n$sequence\n";
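As a minimal sketch of that approach, assuming the sequences have already been read from the database into a hash keyed by made-up IDs (the hash contents and the fasta_record helper are hypothetical, for illustration only):

```perl
use strict;
use warnings;

# Hypothetical data: in practice these strings would come from the database.
my %sequences = (
    seq1 => 'acaaaatcttgagagatt',
    seq2 => 'ggctaagcttacc',
);

# Build one FASTA record: a '>' header line followed by the sequence line.
sub fasta_record {
    my ($id, $sequence) = @_;
    return ">$id\n$sequence\n";
}

# Print the records in a stable (sorted by ID) order.
print fasta_record($_, $sequences{$_}) for sort keys %sequences;
```

Returning the record as a string rather than printing it directly keeps the option of writing it to a file, a socket, or another variable later.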

I'd be interested to know why people use libraries in this context, as I might be missing something.

written 5.9 years ago by Jelena Aleksic

Quite right; printing out Fasta is simple without libraries (you might want to include a line wrap since sequence lines are in principle not supposed to exceed 80 characters). I just explained the Bioperl solution because the OP was using Bio::SeqIO and seemed confused by it.

written 5.9 years ago by Neilfws

Simple line wrapping can be done with unpack:

# Header line, then the sequence split into 60-character lines.
print '>', $seqId, "\n";
foreach my $seqLine (unpack('(a[60])*', $seqStr)) {
    print $seqLine, "\n";
}
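Since the original question is about sequences held in string variables, the same unpack idiom could also be wrapped in a small subroutine that returns the whole record as a string. This is only a sketch; format_fasta and its optional width argument are hypothetical names, not part of any library:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a FASTA record as a string, with the
# sequence wrapped to $width characters per line (60 by default).
sub format_fasta {
    my ($id, $sequence, $width) = @_;
    $width ||= 60;
    # unpack splits the string into chunks of $width; the last chunk may be shorter.
    my @lines = unpack("(a$width)*", $sequence);
    return join("\n", ">$id", @lines) . "\n";
}

# Wrap an 18-base sequence at 10 characters per line.
print format_fasta('mySeq1', 'acaaaatcttgagagatt', 10);
```

With a width of 10, the example prints the header, then "acaaaatctt" and "gagagatt" on separate lines.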

written 5.9 years ago by Hamish

I cannot see any reason why a library would be useful. This solution is simple, fast and scalable.

written 5.9 years ago by BruceB
Neilfws wrote:

The Bio::SeqIO HOWTO might be helpful.

Creating sequence objects with BioPerl requires Bio::Seq in addition to Bio::SeqIO.

Assuming that your sequence string is $string and you want to write to file myFile.fa:

#!/usr/bin/perl -w

use strict;
use Bio::SeqIO;
use Bio::Seq;

my $string   = "acaaaatcttgagagatt";
my $seq      = Bio::Seq->new(-display_id => "mySeq1", -seq => $string);
my $outseq   = Bio::SeqIO->new(-format => "fasta", -file => ">myFile.fa");

$outseq->write_seq($seq);

Obviously, without annotation you will have to devise a sensible method to generate sequence IDs for the Fasta header.
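One possible way to generate IDs, sketched here in plain Perl, is a prefix plus a zero-padded running counter. The make_ids helper, the "seq" prefix and the example sequences are all hypothetical; if the database table has a primary key, using that instead would probably keep the IDs traceable back to their rows.

```perl
use strict;
use warnings;

# Hypothetical input: sequences in the order they were read from the database.
my @sequences = ('acaaaatcttgagagatt', 'ggctaagcttacc');

# Build IDs such as seq_0001, seq_0002 from a prefix and a running counter.
sub make_ids {
    my ($prefix, $count) = @_;
    return map { sprintf('%s_%04d', $prefix, $_) } 1 .. $count;
}

my @ids = make_ids('seq', scalar @sequences);

# Pair each generated ID with its sequence and print a FASTA record.
print ">$ids[$_]\n$sequences[$_]\n" for 0 .. $#sequences;
```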

written 5.9 years ago by Neilfws