Fasta length count format
4.7 years ago
baurumon ▴ 30

Hi, I have used awk '/^>/{if (l!="") print l; print; l=0; next}{l+=length($0)}END{print l}' seq.fasta to count sequence lengths. How can I get the output in the format below?

Thanks in advance

head length

NC_035897.1:11929374-11930116 742

NC_035897.1:11929384-11930116 732



Use the tr command to change the newline characters to tabs. It is unclear whether you want to go in the other direction; if so, swap the tab and the newline.


Hi genomax, I want a head column and a length column. The tr command prints the values one after another. Sorry for the mistake.


See RamRS's solution on: Sequence length from Fasta


I asked my colleague to install seqkit, but it is going to take time. Is there any way to do it with sed or awk?


It is advisable to search Google for this sort of problem before posting it on this platform; please take a look at Brief Reminder On How To Ask A Good Question.

A simple Google search turned up this page, which could be the solution you're after: https://www.danielecook.com/generate-fasta-sequence-lengths/

4.7 years ago
kloetzl ★ 1.1k

This should do it:

awk '/^>/{if (l!="") print l; print; l=0; next}{l+=length($0)}END{print l}' seq.fasta | paste - -
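
If the leading > should be dropped from the header, the same result can probably be had from a single awk call without paste. The following is only a minimal sketch: the function name fasta_lengths and the variable names are illustrative and not from this thread, and it assumes a plain-text FASTA file with Unix line endings. Output is tab-separated; change "\t" to " " for a space.

```shell
# fasta_lengths (hypothetical name): print "header<TAB>length" per FASTA record.
fasta_lengths() {
  awk '
    /^>/ {
      if (seen) print name "\t" len   # flush the previous record, if any
      name = substr($0, 2)            # header line without the leading ">"
      len = 0
      seen = 1
      next
    }
    { len += length($0) }             # add up the lengths of sequence lines
    END { if (seen) print name "\t" len }
  ' "$@"
}
```

Calling fasta_lengths seq.fasta then prints one line per record, with the header first and the total sequence length second.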
