Question: Add sequence lengths to headers in a fasta file
Asked 5 months ago by jan:

Hi all. I have a FASTA file and would like to add each sequence's length to its header while keeping the sequences themselves unchanged. Thank you!

My file:

>Seq_1
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTPQSKIAWISETLCIGCGI
KILAGKQKPNLGKYDDPPDWQEILTYFRGSELQNYFTKILEDDLKAIIKPQYVDQIPKAA
KGTVGSILDRKDETKTQAIVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQK

>Seq_2
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTSQSKIAWISETLCIGCGI
CIKKCPFGALSIVNLPSNLEKETTHRYCANAFKLHRLPIPRPGEVLGLVGTNGIGKSTAL
KGTVGSILDRKDETKTQTVVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQKADIFMF
DEPSSYLDVKQRLKAAITIRSLINPDRYIIV

Desired output (the length info can follow after any delimiter):

>Seq_1[174]
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTPQSKIAWISETLCIGCGI
KILAGKQKPNLGKYDDPPDWQEILTYFRGSELQNYFTKILEDDLKAIIKPQYVDQIPKAA
KGTVGSILDRKDETKTQAIVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQK

>Seq_2[211]
MADKLTRIAIVNHDKCKPKKCRQECKKSCPVVRMGKLCIEVTSQSKIAWISETLCIGCGI
CIKKCPFGALSIVNLPSNLEKETTHRYCANAFKLHRLPIPRPGEVLGLVGTNGIGKSTAL
KGTVGSILDRKDETKTQTVVCQQLDLTHLKERNVEDLSGGELQRFACAVVCIQKADIFMF
DEPSSYLDVKQRLKAAITIRSLINPDRYIIV

Could you say what you want to use this for and what you have tried? Are you looking for a Unix, Perl, or other programming approach? Is scaling an issue (millions of sequences, or sequences of extreme length)? Is the FASTA header always in that format, or should other formats be expected as well?

Comment written 5 months ago by ALchEmiXt

Answer written 5 months ago by shenwei356 (China):

seqkit + awk on Linux/OS X

cat seqs.fa | seqkit fx2tab --length | awk -F "\t" '{print $1"["$4"]\t"$2}' | seqkit tab2fx > seqs2.fa

Example

$ echo -e ">seq\nacgtn\nACGTN"
>seq
acgtn
ACGTN

$ echo -e ">seq\nacgtn\nACGTN" | seqkit fx2tab --length | awk -F "\t" '{print $1"["$4"]\t"$2}' | seqkit tab2fx
>seq[10]
acgtnACGTN
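
For readers who do not have seqkit installed, the same idea can be sketched with awk alone. This is not part of the answer above, only a minimal sketch under a few assumptions: the function name add_lengths is hypothetical, each record is small enough to buffer in memory, and blank lines between records can be dropped. Unlike the seqkit pipeline, it leaves the original line wrapping of each sequence as it was.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append "[length]" to every
# FASTA header. Reads the files given as arguments, or stdin if none.
add_lengths() {
    awk '
        # Print the buffered record: header plus length, then its sequence lines.
        function flush_record() {
            if (header != "") {
                printf "%s[%d]\n%s", header, len, body
            }
        }
        /^>/ {                        # a header line starts a new record
            flush_record()
            header = $0; len = 0; body = ""
            next
        }
        {
            gsub(/[ \t\r]/, "")       # drop stray whitespace and carriage returns
            if ($0 == "") next        # skip blank lines (assumption: not needed)
            len += length($0)         # count residues on this line
            body = body $0 "\n"       # keep the original wrapping
        }
        END { flush_record() }        # emit the last record
    ' "$@"
}
```

Usage would look like `add_lengths seqs.fa > seqs2.fa`. For the small example above, the output header is again `>seq[10]`, but the two sequence lines `acgtn` and `ACGTN` stay on separate lines.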

Great, that's exactly what I was looking for. Thank you!

Reply written 5 months ago by jan