Question

awk, count and print the number of lines following a separator then add value of header to that column

2.9 years ago

Tash ▴ 20

Hello. I've been trying to use awk to solve the following problem for some time with no luck. Was hoping someone here could give me some clues as to what I'm doing wrong.

I have a file that looks like this:

INPUT:

>chr8:76290516-76290880
   578  T
   579  G
   580  A
>chr14:22131464-22132025
   468  T
   469  G
   470  A
>chr12:33695439-33695441
   468  T
   469  G
   470  A

Each record in the file has a header that starts with >. I would like to print a new column that is essentially a line counter: it restarts after each header and begins counting from the number that follows the : in that header. Each record (the lines after a > header) should be counted independently.

I have tried to do this in steps: first adding the counts as the third column, then adding the header value to those counts. I have had no luck getting the counts into the third column with the command below:

awk '{FS = "/n"}{RS = ">"}{if(!/^>/){print $1, $2, NF, $3 }}' input.txt

DESIRED OUTPUT:

>chr8:76290516-76290518
   578  T   76290516
   579  G   76290517
   580  A   76290518
>chr14:22131464-22131466
   468  T   22131464
   469  G   22131465
   470  A   22131466
>chr12:33695439-33695441
   321  T   33695439
   322  G   33695440
   333  A   33695441

Any pointers would be very much appreciated. Thank you so much!!

EDIT: As Mensur Dlakic pointed out, I did not originally include my failed attempt to solve this problem, so I have edited the post to include it. Thank you.

Updated 2.9 years ago by zx8754 11k • written 2.9 years ago by Tash ▴ 20

I think you are under a wrong impression that this is a coding service where you simply state your problem and expect someone to solve it for you, without any effort on your part. I am surprised you didn't ask for fries with it.

2.9 years ago by Mensur Dlakic ★ 27k

Thank you for taking the time to explain it to me, Mensur. This is my first post here. My apologies for not understanding how this works; I really didn't mean to cause offence or seem entitled. I have been attempting this myself for some time with no luck, because I'm not very experienced. I was so far from the mark that I didn't think it was worth posting my failed attempts, but I will edit the post to include my latest attempt as a starting point.

2.9 years ago by Tash ▴ 20

score 3 · Accepted Answer · 2021-06-11

2.9 years ago

Pierre Lindenbaum 161k

  awk -F '[:]'  '/^>/ {print;P=$2;X=0;next;} {print $0, (P+X);X++;}' input.txt
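
For anyone who wants the one-liner unpacked: it splits each line on `:`, so on a header line `$2` is something like `76290516-76290880`, and awk's string-to-number conversion keeps only the leading digits when that value is used in `P+X`. Below is a rough sketch of the same idea with the start coordinate extracted explicitly. The function name `add_coords` and the variable names `parts` and `pos` are my own for illustration, not part of the original answer, and the sketch assumes every header has the form `>name:start-end` with no extra `:` or `-` in the name.

```shell
# Hypothetical helper (name is illustrative): append a running coordinate
# to every non-header line, restarting at each ">" header.
add_coords() {
  awk '
    /^>/ {
      print                        # pass the header through unchanged
      split($0, parts, /[:-]/)     # parts[2] holds the start coordinate
      pos = parts[2] + 0           # force numeric; counter restarts per record
      next
    }
    { print $0, pos++ }            # print the line plus the current position, then advance
  ' "$@"
}

# Example usage on a small inline sample (reads stdin when no file is given):
printf '>chr8:76290516-76290880\n   578  T\n   579  G\n' | add_coords
```

Called as `add_coords input.txt`, this should behave like the one-liner above. Note that `print $0, pos++` joins the two values with a single space (awk's default output separator), so the new column will not be aligned with the original spacing.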