Question: Sum up values for a specific column of multiple files
7 months ago, dazhudou1122 wrote:

Dear Biostar community,

I have ~200 files of mapping results from BBMap. Let's say I have a file A.txt that looks like this:

#name   %unambiguousReads   unambiguousMB   %ambiguousReads ambiguousMB unambiguousReads    ambiguousReads  assignedReads   assignedBases
CP048304    0.00133 0.021248    0   0   147 0   158 22855
CP048305    0.00122 0.019355    0   0   135 0   146 20964
CP048306    0.00063 0.009802    0   0   69  0   81  11554
CP048307    0.0006  0.009519    0   0   66  0   78  11271
CP048308    0.00056 0.008937    0   0   62  0   76  10980
CP048309    0.00046 0.007286    0   0   51  0   57  8157
CP048310    0.00031 0.004859    0   0   34  0   38  5441
CP048311    0.00026 0.004082    0   0   29  0   38  5393
CP048312    0.00022 0.003489    0   0   24  0   34  4945
CP048313    0.00016 0.002588    0   0   18  0   22  3170
CP048314    0.00016 0.002498    0   0   18  0   30  4250

I want to sum the "unambiguousReads" column (the sixth column) for each file and write the results to a file in which the first column is the file name and the second column is the sum, like this:

A 653
B 550
C 375
...

I tried many methods after googling but none worked so far. Can you please help?

Thank you!

Best,

Wenhan


There are plenty of quick and dirty ways to do it. What have you tried?

written 7 months ago by geek_y
7 months ago, lakhujanivijay wrote:

awk -F '\t' '{sum += $6} END {print sum}' A.txt

Explained here
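
A minimal sketch of how this kind of one-liner behaves, assuming a tab-delimited file with a single header line. The file name demo_A.txt, the shortened header names, and the three rows are made up for illustration. The field number after `$` picks the column; the sketch uses `$6`, the column the question asks about.

```shell
# Hypothetical sample file: a header line plus three tab-separated rows.
demo_dir=$(mktemp -d)
printf '#name\tpctUnamb\tunambMB\tpctAmb\tambMB\tunambiguousReads\n' > "$demo_dir/demo_A.txt"
printf 'CP048304\t0.00133\t0.021248\t0\t0\t147\n' >> "$demo_dir/demo_A.txt"
printf 'CP048305\t0.00122\t0.019355\t0\t0\t135\n' >> "$demo_dir/demo_A.txt"
printf 'CP048306\t0.00063\t0.009802\t0\t0\t69\n' >> "$demo_dir/demo_A.txt"

# -F '\t' splits each line on tabs; $6 is the sixth field.
# NR > 1 skips the header explicitly instead of relying on awk
# treating the non-numeric header text as 0.
# "sum + 0" prints 0 rather than an empty string if there are no data rows.
col6_sum=$(awk -F '\t' 'NR > 1 {sum += $6} END {print sum + 0}' "$demo_dir/demo_A.txt")
echo "$col6_sum"
```

Without the `NR > 1` guard the result is usually the same, because awk converts a non-numeric field such as the header text to 0 when adding.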


Dear lakhujanivijay,

Thank you! It worked great! I tried a lot of other solutions, but none of them worked. Thank you also for the explanation. The complete code is below:

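# Use non-.txt names for the list and the summary so reruns do not treat them as inputs.
ls *.txt > samples.list
while read -r file; do
    sum=$(awk -F '\t' '{sum += $6} END {print sum}' "$file")  # column 6 = unambiguousReads; the header line adds 0
    echo "${file%.txt} $sum"  # drop the .txt suffix to get "A 653"
done < samples.list > Summary.out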
written 7 months ago by dazhudou1122
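
The per-file loop can also be collapsed into a single awk call, since awk exposes the current input file as `FILENAME` and restarts `FNR` at 1 for each file. This is only a sketch under assumptions: the two tiny input files and their values are invented, and every file is assumed to be tab-delimited with one header line.

```shell
# Hypothetical inputs: two small tab-separated files in a scratch directory.
work_dir=$(mktemp -d)
printf '#name\tc2\tc3\tc4\tc5\tunambiguousReads\nr1\t0\t0\t0\t0\t10\nr2\t0\t0\t0\t0\t5\n' > "$work_dir/A.txt"
printf '#name\tc2\tc3\tc4\tc5\tunambiguousReads\nr1\t0\t0\t0\t0\t7\n' > "$work_dir/B.txt"

summary=$(
  cd "$work_dir" &&
  awk -F '\t' '
    # First line of each file: register the file (so header-only files
    # still appear with a sum of 0) and skip the header.
    FNR == 1 { total[FILENAME] += 0; next }
    # Every other line: add column 6 to the running total for this file.
    { total[FILENAME] += $6 }
    END {
      for (f in total) {
        name = f
        sub(/\.txt$/, "", name)   # report "A" instead of "A.txt"
        print name, total[f]
      }
    }
  ' *.txt | sort                  # awk array order is unspecified, so sort
)
echo "$summary"
```

Because the summary is captured in a variable here instead of being redirected into the same directory, it cannot be picked up by the `*.txt` glob on a later run.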