Off topic: Perl to calculate averages in a CSV based on information at the end of the file
8.5 years ago
cara78 ▴ 10

I have a CSV file with three columns, in order: Mb_size, tax_id, and parent_id. The tax_id and parent_id columns are related. For example, in the row near the end of the file with an Mb_size of 22.2220658537, 5820 is the tax_id and 5819 is the parent_id. Moving up the file, that parent_id (5819) appears in the tax_id column of an earlier row. A parent_id can be repeated, but each tax_id is unique in its column.

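Only the rows at the end of the file have values in Mb_size. Starting from those rows, I need to work up to the top of the file, calculating the average each time a parent_id appears as a tax_id. For example, 5819 gets the average of its children 5820 and 1234, and 2759 then gets the average of 5819 and 147429.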

Below is the sample csv file input:

Mb_size,tax_id,parent_id
,1,1
,131567,1
,2759,131567
,5819,2759
,147429,2759
22.2220658537,5820,5819
184.801317,4557,147429
748.66869,4575,147429
555.55,1234,5819

Below is the sample csv file output:

Mb_size,tax_id,parent_id
377.810518214,1,1
377.810518214,131567,1
377.810518214,2759,131567
288.886032927,5819,2759
466.7350035,147429,2759
22.2220658537,5820,5819
184.801317,4557,147429
748.66869,4575,147429
555.55,1234,5819

This is the code I have so far. I can't get it to continue the calculation up the file, or to include the original rows in the printed output.

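use strict;
use warnings;

open my $taxa_fh, '<', $ARGV[0] or die qq{Failed to open "$ARGV[0]" for input: $!\n};
open my $match_fh, '>', "$ARGV[0]_sized.csv" or die qq{Failed to open "$ARGV[0]_sized.csv" for output: $!\n};

my %data;

my $header = <$taxa_fh>;    # skip the Mb_size,tax_id,parent_id header line

while ( my $line = <$taxa_fh> ) {
    chomp( $line );

    my @fields    = split( /,/, $line );
    my $Mb_size   = $fields[0];
    my $tax_id    = $fields[1];
    my $parent_id = $fields[2];

    # Only rows that already have a size contribute to their parent's sum
    next unless defined $Mb_size && length $Mb_size;

    $data{$parent_id}{sum} += $Mb_size;
    $data{$parent_id}{count}++;
}

# This averages only one level up; it does not continue toward the root
for my $parent_id ( sort keys %data ) {
   my $avg = $data{$parent_id}{sum} / $data{$parent_id}{count};
   print {$match_fh} "$parent_id,$avg\n";
}

close $taxa_fh;
close $match_fh;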
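
A minimal sketch of one way the bottom-up averaging described above might be done, not a definitive implementation. It assumes the tax_id/parent_id links form a tree whose only loop is the root listing itself as its own parent, and that a missing Mb_size should become the plain average of the direct children's values. The names fill_sizes and resolve_size are hypothetical helpers introduced for illustration, and the sample rows are inlined rather than read from $ARGV[0].

```perl
use strict;
use warnings;

# Hypothetical helper: return the size for $tax, computing it (and caching it
# in %$size) as the mean of its direct children's sizes when it is missing.
sub resolve_size {
    my ( $tax, $size, $children ) = @_;
    return $size->{$tax} if exists $size->{$tax};

    # A bare return yields an empty list inside map, so a childless node
    # without a size is simply left out of its parent's average.
    my @vals = map { resolve_size( $_, $size, $children ) }
               @{ $children->{$tax} || [] };
    return unless @vals;

    my $sum = 0;
    $sum += $_ for @vals;
    return $size->{$tax} = $sum / @vals;
}

# Hypothetical helper: takes an array ref of [Mb_size, tax_id, parent_id] rows
# and returns a hash ref of tax_id => size with the gaps filled in.
sub fill_sizes {
    my ($rows) = @_;
    my ( %size, %children );
    for my $row (@$rows) {
        my ( $mb, $tax, $parent ) = @$row;
        $size{$tax} = $mb if defined $mb && length $mb;
        # Assumption: the root row lists itself as parent (1,1); skip that self-link.
        push @{ $children{$parent} }, $tax if $parent ne $tax;
    }
    resolve_size( $_->[1], \%size, \%children ) for @$rows;
    return \%size;
}

# The sample input from the question, without the header line.
my @rows = map { [ split /,/ ] } (
    ',1,1', ',131567,1', ',2759,131567', ',5819,2759', ',147429,2759',
    '22.2220658537,5820,5819', '184.801317,4557,147429',
    '748.66869,4575,147429', '555.55,1234,5819',
);

my $size = fill_sizes( \@rows );

# Print every original row, in its original order, with the size filled in.
print "Mb_size,tax_id,parent_id\n";
for my $row (@rows) {
    my ( undef, $tax, $parent ) = @$row;
    my $mb = exists $size->{$tax} ? $size->{$tax} : '';
    print join( ',', $mb, $tax, $parent ), "\n";
}
```

Because each node is resolved through its children rather than by row position, this sketch should not depend on the rows being sorted from root to leaf. The filled-in values should agree with the sample output to within rounding of the last printed digits, since Perl prints more significant digits by default than the sample shows.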