Question: Add sequence lengths to headers in a fasta file
12 weeks ago, jan wrote:

Hi all. I have a FASTA file and would like to add the sequence lengths to the headers while keeping the sequences unchanged. Thank you!

My file:

>Seq_1
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTPQSKIAWISETLCIGCGI
KILAGKQKPNLGKYDDPPDWQEILTYFRGSELQNYFTKILEDDLKAIIKPQYVDQIPKAA
KGTVGSILDRKDETKTQAIVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQK

>Seq_2
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTSQSKIAWISETLCIGCGI
CIKKCPFGALSIVNLPSNLEKETTHRYCANAFKLHRLPIPRPGEVLGLVGTNGIGKSTAL
KGTVGSILDRKDETKTQTVVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQKADIFMF
DEPSSYLDVKQRLKAAITIRSLINPDRYIIV

Desired output (the length info can follow after any delimiter):

>Seq_1[174]
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTPQSKIAWISETLCIGCGI
KILAGKQKPNLGKYDDPPDWQEILTYFRGSELQNYFTKILEDDLKAIIKPQYVDQIPKAA
KGTVGSILDRKDETKTQAIVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQK

>Seq_2[211]
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTSQSKIAWISETLCIGCGI
CIKKCPFGALSIVNLPSNLEKETTHRYCANAFKLHRLPIPRPGEVLGLVGTNGIGKSTAL
KGTVGSILDRKDETKTQTVVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQKADIFMF
DEPSSYLDVKQRLKAAITIRSLINPDRYIIV

Could you say what you want to use this for and what you have tried? Are you looking for a Unix, Perl, or other programming approach? Is scaling an issue (millions of sequences, or extremely long ones)? Does the FASTA header always have that format, or should other formats be expected as well?

written 12 weeks ago by ALchEmiXt

12 weeks ago, shenwei356 wrote:

seqkit + awk on Linux/OS X

cat seqs.fa | seqkit fx2tab --length | awk -F "\t" '{print $1"["$4"]\t"$2}' | seqkit tab2fx > seqs2.fa

Example

$ echo -e ">seq\nacgtn\nACGTN"
>seq
acgtn
ACGTN

$ echo -e ">seq\nacgtn\nACGTN" | seqkit fx2tab --length | awk -F "\t" '{print $1"["$4"]\t"$2}' | seqkit tab2fx
>seq[10]
acgtnACGTN
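
As the example output shows, the pipeline above joins the sequence lines before writing them back out. If the original line wrapping should be kept, or seqkit is not installed, a plain awk pass could do the same job. This is only a minimal sketch, not part of seqkit: the function name `add_len` is hypothetical, and it assumes every record starts with a `>` line and that the length should be appended to the end of the whole header line.

```shell
# Hypothetical helper: append "[length]" to each FASTA header line,
# leaving sequence lines (and blank lines between records) as they were.
# Usage: add_len seqs.fa > seqs2.fa
add_len() {
  awk '
    # Print the buffered record: header with length, then its original body.
    function emit() {
      if (header != "") {
        printf "%s[%d]\n%s", header, len, body
      }
    }
    # A new header: output the previous record and start a new one.
    /^>/ { emit(); header = $0; len = 0; body = ""; next }
    # Any other line: keep it verbatim, count only non-whitespace characters.
    {
      body = body $0 "\n"
      gsub(/[[:space:]]/, "", $0)
      len += length($0)
    }
    END { emit() }
  ' "$@"
}
```

Each record is buffered until the next header is seen, so memory use grows with the longest single sequence rather than with the file size.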

Great, that's exactly what I was looking for. Thank you!

written 12 weeks ago by jan