Question: (Closed) Perl to calculate average in csv based on information at end of file
3.4 years ago by
cara7810
EU
cara7810 wrote:

I have a CSV file with three columns, in this order: Mb_size, tax_id, and parent_id. The tax_id and parent_id columns are related. For example, in the row near the end of the file where Mb_size is 22.2220658537, the tax_id is 5820 and the parent_id is 5819. Moving up the file, the parent_id 5819 appears in the tax_id column. A parent_id can be repeated, but each tax_id is unique in its column.

Starting at the end of the file, where the rows have values in Mb_size, I need to work up to the top, calculating the average every time a parent_id appears as a tax_id.

Below is a sample input CSV file:

Mb_size,tax_id,parent_id
,1,1
,131567,1
,2759,131567
,5819,2759
,147429,2759
22.2220658537,5820,5819
184.801317,4557,147429
748.66869,4575,147429
555.55,1234,5819

Below is the expected output for that sample:

Mb_size,tax_id,parent_id
377.810518214,1,1
377.810518214,131567,1
377.810518214,2759,131567
288.886032927,5819,2759
466.7350035,147429,2759
22.2220658537,5820,5819
184.801317,4557,147429
748.66869,4575,147429
555.55,1234,5819
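
The parent values in the sample output are consistent with taking, for each parent, the plain mean of the rows that list it as parent_id. The short check below works through that arithmetic for three of the rows. It is only a sketch under that assumption (the sample would look the same if a parent were instead averaged over all leaves beneath it), and the `mean` helper and variable names are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: arithmetic mean of a list of numbers.
sub mean { return sum(@_) / scalar(@_) }

# Leaf rows from the sample input, grouped by their parent_id.
my $avg_5819   = mean( 22.2220658537, 555.55 );     # children 5820 and 1234
my $avg_147429 = mean( 184.801317,    748.66869 );  # children 4557 and 4575

# 2759 is the parent of 5819 and 147429, so it averages their averages.
my $avg_2759 = mean( $avg_5819, $avg_147429 );

printf "5819:   %.4f\n", $avg_5819;      # prints 288.8860
printf "147429: %.4f\n", $avg_147429;    # prints 466.7350
printf "2759:   %.4f\n", $avg_2759;      # prints 377.8105
```

Rows 131567 and 1 each have a single real child (row 1 lists itself as its own parent), so under this reading they simply inherit the value of 2759.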

This is the code I have so far. I can't get it to continue the calculation up the file, or to include the original rows below in the printed output.

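use strict;
use warnings;

# Input CSV is the first argument; output goes to "<input>_sized.csv".
my $in_file  = $ARGV[0];
my $out_file = "${in_file}_sized.csv";

open my $taxa_fh,  '<', $in_file  or die qq{Failed to open "$in_file" for input: $!\n};
open my $match_fh, '>', $out_file or die qq{Failed to open "$out_file" for output: $!\n};

my %data;

my $header = <$taxa_fh>;    # skip the "Mb_size,tax_id,parent_id" header line

while ( my $line = <$taxa_fh> ) {

    chomp( $line );

    my @fields    = split( /,/, $line );
    my $Mb_size   = $fields[0];
    my $tax_id    = $fields[1];
    my $parent_id = $fields[2];

    # Only rows that already have a size can contribute to a parent's average.
    next unless length $Mb_size;

    $data{$parent_id}{sum} += $Mb_size;
    $data{$parent_id}{count}++;
}

# This only averages one level (the parents of rows that already have a size);
# it does not yet carry the result further up or print the original rows.
for my $parent_id ( sort keys %data ) {
    my $avg = $data{$parent_id}{sum} / $data{$parent_id}{count};
    print $match_fh "$parent_id,$avg\n";
}

close $taxa_fh;
close $match_fh;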
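
One possible way to carry the averages all the way up is to record each row's children first and then resolve every empty Mb_size recursively. The sketch below does that under a few assumptions that the post does not confirm: a row without a size gets the mean of its direct children, the rows form a tree whose only loop is the root listing itself as its own parent, and the input has no quoted fields. `fill_sizes` and `node_size` are hypothetical names, not part of any module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: size of one node, computed on demand and cached.
# Assumption: a node with no size of its own gets the mean of its children.
sub node_size {
    my ( $id, $size, $children ) = @_;
    return $size->{$id} if defined $size->{$id};
    my @vals;
    for my $kid ( @{ $children->{$id} || [] } ) {
        my $v = node_size( $kid, $size, $children );
        push @vals, $v if defined $v;
    }
    return undef unless @vals;    # no sized descendants: leave it empty
    return $size->{$id} = sum(@vals) / scalar(@vals);
}

# Hypothetical helper: takes CSV lines (header first) and returns the same
# rows, in the original order, with every empty Mb_size filled in.
sub fill_sizes {
    my @lines = @_;
    chomp @lines;
    my $header = shift @lines;

    my ( %size, %children, @order );
    for my $line (@lines) {
        next unless length $line;
        my ( $mb, $tax_id, $parent_id ) = split /,/, $line;
        push @order, [ $tax_id, $parent_id ];
        $size{$tax_id} = $mb if defined $mb && length $mb;
        # The root row lists itself as its parent, so skip self-links.
        push @{ $children{$parent_id} }, $tax_id if $parent_id ne $tax_id;
    }

    my @out = ($header);
    for my $row (@order) {
        my $v = node_size( $row->[0], \%size, \%children );
        push @out, join ',', ( defined $v ? $v : '' ), @$row;
    }
    return @out;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die qq{Failed to open "$ARGV[0]" for input: $!\n};
    print "$_\n" for fill_sizes(<$in>);
    close $in;
}
```

Saved under an illustrative name such as fill_sizes.pl, this could be run as `perl fill_sizes.pl input.csv > input_sized.csv`. Perl prints the computed averages with more digits than the sample output shows, so the final digits may differ slightly; wrapping the value in `sprintf '%.12g'` would shorten it if that matters.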

average csv perl • 990 views

Hello cara78!

We believe that this post does not fit the main topic of this site.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree, please tell us why in a reply below; we'll be happy to talk about it.

Cheers!