Question: awk command to count specific field
11 months ago by
saadleeshehreen60 wrote:

Hi, I have a file with the following content. I want to count how many lines have 1 in the second field.

Bacteroides fragilis,0
Bacteroides fragilis,0
Salmonella enterica,1
Salmonella enterica,1
Salmonella enterica,1
Bacteroides fragilis,0
.................................
..................................

I used the following command :

cat f1.txt | awk {'$2 == 1'} | wc -l

But it doesn't give me the answer. Please help!

command • 480 views
ADD COMMENTlink modified 11 months ago by cpad011211k • written 11 months ago by saadleeshehreen60

I am not able to understand the input file format. You can use the following command to count the occurrences of 1 in field 2:

grep -o ",1" input.txt  | wc -l
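
One caveat with the pattern above: ,1 is not anchored, so it would also match values such as ,10 or ,12. The following is only a sketch on made-up data (the "Escherichia coli,10" line is a hypothetical decoy), assuming the value of interest is the last field on the line:

```shell
# Build a small sample file (hypothetical data, including a ",10" decoy line).
tmp=$(mktemp)
printf '%s\n' 'Bacteroides fragilis,0' 'Salmonella enterica,1' \
              'Salmonella enterica,1' 'Escherichia coli,10' > "$tmp"

# Unanchored: counts every occurrence of ",1", so the ",10" line is counted too.
loose=$(grep -o ',1' "$tmp" | wc -l)

# Anchored to end of line: counts only lines whose last field is exactly 1.
strict=$(grep -c ',1$' "$tmp")

echo "loose=$loose strict=$strict"
rm -f "$tmp"
```

The anchored form counts only the two genuine matches, while the unanchored form also picks up the decoy line.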
ADD REPLYlink written 11 months ago by MSM5590
11 months ago by
Vijay Lakhujani4.0k
India
Vijay Lakhujani4.0k wrote:

The format of your file, if it really looks like this, is hard to make sense of. But I'll explain how awk would work here, given a cleanly formatted tab-separated table.

Consider your file (say file.txt) laid out as below. The <tab> and <space> markers are only for illustration; your actual file would contain real tab and space characters at those positions.

Bacteroides<space>fragilis<tab>0
Bacteroides<space>fragilis<tab>0
Salmonella<space>enterica<tab>1
Salmonella<space>enterica<tab>1
Salmonella<space>enterica<tab>1
Bacteroides<space>fragilis<tab>0

Now if you say

awk '$2==1{print}' file.txt | wc -l

This will not work, because by default awk splits each line on any run of whitespace (spaces as well as tabs). The space in "Bacteroides fragilis" therefore ends the first field, so $2 is "fragilis" and the 0/1 value ends up in $3.

Hence, you must set the field separator explicitly with -F:

awk -F "\t" '$2==1{print}' file.txt | wc -l
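
To see why the separator matters, this small sketch prints the number of fields (NF) and the second field under both settings. The sample line is made up to mirror the layout shown above.

```shell
# One tab-separated line with a space inside the species name.
line=$(printf 'Salmonella enterica\t1')

# Default splitting: any run of spaces or tabs separates fields -> 3 fields.
default_view=$(printf '%s\n' "$line" | awk '{print NF ":" $2}')

# Tab as the only separator -> 2 fields, and $2 is the value column.
tab_view=$(printf '%s\n' "$line" | awk -F '\t' '{print NF ":" $2}')

echo "default: $default_view"   # 3:enterica
echo "tab:     $tab_view"       # 2:1
```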
ADD COMMENTlink written 11 months ago by Vijay Lakhujani4.0k
awk -F "," '$2==1' file.txt | wc -l
ADD REPLYlink modified 11 months ago • written 11 months ago by Friederike3.6k
11 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum119k wrote:

Set the field separator (-F) to a comma and increment a counter N each time column 2 is 1. At the end, print the value of N.

awk -F, '($2==1){N++;}END{print N;}' file.txt

but I think most people would use

cut -d, -f 2 file.txt | grep -c -w 1
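
A small caveat, sketched here on made-up data with no matching lines: when nothing matches, N is never assigned, so print N emits an empty line rather than 0. Printing N+0 forces a numeric result. The cut/grep pipeline already reports 0 in that case, although grep then exits with a non-zero status.

```shell
# Sample file in which no line has 1 in field 2 (hypothetical data).
tmp=$(mktemp)
printf '%s\n' 'Bacteroides fragilis,0' 'Bacteroides fragilis,0' > "$tmp"

# N is unset when nothing matches, so this prints an empty string.
plain=$(awk -F, '$2==1{N++} END{print N}' "$tmp")

# N+0 coerces the unset variable to the number 0.
forced=$(awk -F, '$2==1{N++} END{print N+0}' "$tmp")

# grep -c always prints a count; "|| true" absorbs its non-zero exit on 0 matches.
viagrep=$(cut -d, -f 2 "$tmp" | grep -c -w 1 || true)

echo "plain='$plain' forced='$forced' grep='$viagrep'"
rm -f "$tmp"
```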
ADD COMMENTlink written 11 months ago by Pierre Lindenbaum119k

Hi, thanks. I have another, related problem. My file looks like this:

10 Lachnoclostridium sp.   0       0       0       0       1
11 Haemophilus ducreyi     0       0       0       0       1
12 Clostridiales bacterium 0       0       0       0       1
13 Escherichia albertii    0       1       0       0       1

It has 8 fields. I just want to count the lines where both field 7 and field 8 equal 1. How can I do that? I used the following, but it doesn't give the expected output.

awk '$4 == 0; $5 == 0; $6 == 0; $7 == 1; $8 ==1' file.txt

ADD REPLYlink modified 11 months ago by Pierre Lindenbaum119k • written 11 months ago by saadleeshehreen60

Hi, you can use the following command:

awk  '$7==1 && $8==1 {print}' input.txt
ADD REPLYlink written 11 months ago by MSM5590

or just awk '$7==1 && $8==1' input.txt
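
Both commands above print the matching lines. Since the question asks for a count, one option is to increment a counter and print it in END. The sketch below uses made-up rows in the same eight-column layout (the "Example species" row is hypothetical). It also shows why the semicolon version from the question misbehaves: each pattern followed by a semicolon is a separate rule, so a line is printed once for every rule it satisfies.

```shell
tmp=$(mktemp)
# Hypothetical rows: index, two-word name, then five 0/1 columns ($4..$8).
printf '%s\n' '10 Lachnoclostridium sp. 0 0 0 0 1' \
              '13 Escherichia albertii 0 1 0 0 1' \
              '14 Example species 0 0 0 1 1' > "$tmp"

# Count lines where both field 7 and field 8 equal 1.
both=$(awk '$7==1 && $8==1 {n++} END{print n+0}' "$tmp")

# Separate rules: every rule that matches prints the line again.
repeats=$(awk '$7 == 1; $8 == 1' "$tmp" | wc -l)

echo "both=$both repeats=$repeats"
rm -f "$tmp"
```

Here only one row satisfies both conditions, yet the two-rule version emits four lines for a three-line file.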

ADD REPLYlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum119k

Thanks. It works for me. :)

ADD REPLYlink written 11 months ago by saadleeshehreen60

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

ADD REPLYlink written 11 months ago by Pierre Lindenbaum119k

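Try this as well. Note that $7 on its own is true for any nonzero value, so this matches the commands above only when column 7 holds just 0 or 1: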

awk '$7 && $8==1'  input.txt
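
The difference between the truthiness test and the explicit comparison only shows up when column 7 can hold something other than 0 or 1. A minimal sketch with a made-up row where field 7 is 2:

```shell
# Hypothetical row where field 7 is 2, not 1.
row='15 Example species 0 0 0 2 1'

# Truthiness test: any nonzero $7 passes, so this row is printed.
loose=$(printf '%s\n' "$row" | awk '$7 && $8==1' | wc -l)

# Explicit comparison: $7 must equal 1, so nothing is printed.
strict=$(printf '%s\n' "$row" | awk '$7==1 && $8==1' | wc -l)

echo "loose=$loose strict=$strict"
```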
ADD REPLYlink written 11 months ago by cpad011211k
11 months ago by
cpad011211k
India
cpad011211k wrote:

Though the OP wants a solution in awk, here is a datamash solution.

Counts of 0s and 1s per species (organism):

$ datamash -s -t "," -g 1,2 count 2 < test.txt | sed 's/,/\t/g'
Bacteroides fragilis    0   3
Salmonella enterica 1   3

Only 0's and 1's count:

$ datamash -s -t "," -g 2 count 2 < test.txt | sed 's/,/\t/g'
0   3
1   3

input (from OP):

$ cat test.txt
Bacteroides fragilis,0
Bacteroides fragilis,0
Salmonella enterica,1
Salmonella enterica,1
Salmonella enterica,1
Bacteroides fragilis,0
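
For readers without datamash installed, roughly the same per-value tally can be sketched in awk with an associative array. This is an alternative illustration, not a description of how datamash works internally; the array name c is arbitrary.

```shell
# Recreate the OP's input in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Bacteroides fragilis,0' 'Bacteroides fragilis,0' \
              'Salmonella enterica,1' 'Salmonella enterica,1' \
              'Salmonella enterica,1' 'Bacteroides fragilis,0' > "$tmp"

# Tally each distinct value of field 2 in an associative array.
# for-in order is unspecified in awk, so sort the result.
counts=$(awk -F, '{c[$2]++} END{for (v in c) print v "\t" c[v]}' "$tmp" | sort)

printf '%s\n' "$counts"
rm -f "$tmp"
```

Using c[$1 FS $2]++ as the key instead would give the per-species breakdown.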
ADD COMMENTlink modified 11 months ago • written 11 months ago by cpad011211k