Question

Converting to decimal values

3.4 years ago

genomes_and_MGEs ▴ 10

Hi everyone,

I have a table like this

A   B   2.9711e-01  6.8662e-10  2/2048
A   C   2.3343e-03  0.0000e+00  1861/2048
A   D   2.9711e-01  6.8666e-10  2/2048
A   E   2.9711e-01  6.8658e-10  2/2048

I would like to convert the fractions in column 5 into decimal values, like this:

    A   B   2.9711e-01  6.8662e-10  0,0009765625
    A   C   2.3343e-03  0.0000e+00  0,90869140625
    A   D   2.9711e-01  6.8666e-10  0,0009765625
    A   E   2.9711e-01  6.8658e-10  0,0009765625

Could someone help me out?

Cheers!

updated 3.4 years ago by h.mon 35k • written 3.4 years ago by genomes_and_MGEs ▴ 10

How is this related to bioinformatics?

3.4 years ago by Ram 43k

score 3 · Answer 1 · 2020-11-27

This can do the trick:

$ perl -lane '($n,$d)=split(/\//,$F[4]); $F[4]=$n/$d; print join "\t", @F' < table.txt
A       B       2.9711e-01      6.8662e-10      0.0009765625
A       C       2.3343e-03      0.0000e+00      0.90869140625
A       D       2.9711e-01      6.8666e-10      0.0009765625
A       E       2.9711e-01      6.8658e-10      0.0009765625

score 3 · Answer 2 · 2020-11-27

You can try awk, too, assuming that your file's format is consistent (no empty cells; tab-delimited) and that there is indeed no header:

awk -F "\t" '{
  split($5, div, "/");
  print $1"\t"$2"\t"$3"\t"$4"\t"div[1]/div[2]}' table.txt

A   B   2.9711e-01  6.8662e-10  0.000976562
A   C   2.3343e-03  0.0000e+00  0.908691
A   D   2.9711e-01  6.8666e-10  0.000976562
A   E   2.9711e-01  6.8658e-10  0.000976562
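
Note that awk converts the computed number with its default format of six significant digits (`%.6g`), which is why the values above are rounded (0.000976562 rather than 0.0009765625). A minimal sketch that keeps more digits is shown here; the sample rows are copied from the question, and the output file name `table_decimal.txt` and the array name `frac` are placeholders chosen for illustration:

```shell
# Sample input copied from the question (tab-delimited); file names are placeholders
printf 'A\tB\t2.9711e-01\t6.8662e-10\t2/2048\n' > table.txt
printf 'A\tC\t2.3343e-03\t0.0000e+00\t1861/2048\n' >> table.txt

awk 'BEGIN { FS = OFS = "\t" }
{
  split($5, frac, "/")
  # %.12g keeps up to 12 significant digits instead of the default 6
  $5 = sprintf("%.12g", frac[1] / frac[2])
  print
}' table.txt > table_decimal.txt

cat table_decimal.txt
```

Assigning to `$5` makes awk rebuild the line with `OFS`, so the other columns do not have to be listed one by one.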

Kevin

score 2 · Answer 3 · 2020-11-27

3.4 years ago

Devon Ryan 104k

$ awk 'BEGIN {OFS="\t"} {split($5, v, "/"); print $1,$2,$3,$4,v[1]/v[2]}' foo.txt
A       B       2.9711e-01      6.8662e-10      0.000976562
A       C       2.3343e-03      0.0000e+00      0.908691
A       D       2.9711e-01      6.8666e-10      0.000976562
A       E       2.9711e-01      6.8658e-10      0.000976562

score 2 · Answer 4 · 2020-11-27

Just to keep up the tradition of questions with no clear relation to bioinformatics attracting lots of answers (and because it is Friday night during a pandemic), here are some alternative solutions:

Using several bash variables:

while IFS=$'\t' read -r val1 val2 val3 val4 val5
do
    val5=$(echo "scale=10; $val5" | bc -l)
    printf "%s\t%s\t%s\t%s\t%s\n" "$val1" "$val2" "$val3" "$val4" "$val5"
done < table.tsv

With an array, slicing it to drop the last element and then appending the result of the division in its place:

while IFS=$'\t' read -r -a array
do
    val=$(echo "scale=10; ${array[-1]}" | bc -l)
    array=("${array[@]::${#array[@]}-1}")
    array+=("$val")
    # Join the fields with tabs in a subshell so IFS is left untouched for the loop
    (IFS=$'\t'; printf "%s\n" "${array[*]}")
done < table.tsv

For recent bash (4.3 or later, I think), there is a nicer syntax for deleting the last array element:

while IFS=$'\t' read -r -a array
do
    val=$(echo "scale=10; ${array[-1]}" | bc -l)
    unset 'array[-1]'  # quoted so the shell does not treat [-1] as a glob pattern
    printf "%s\t" "${array[@]}"
    printf "%s\n" "$val"
done < table.tsv
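
Since the array versions above only care about the last field, the same idea can probably be sketched in awk with `$NF`, which works for any number of columns and avoids starting one `bc` process per line. This is only a sketch: the sample rows, the output file name `table_lastcol.tsv`, and the choice to pass lines without a valid fraction through unchanged are assumptions made for illustration.

```shell
# Sample input (tab-delimited); the second line has no fraction in its last field
printf 'A\tB\t2.9711e-01\t6.8662e-10\t2/2048\n' > table.tsv
printf 'A\tX\tnot_a_fraction\n' >> table.tsv

awk 'BEGIN { FS = OFS = "\t" }
{
  # Convert only when the last field splits into exactly two parts
  # and the denominator is a non-zero number; otherwise leave the line as is
  if (split($NF, frac, "/") == 2 && frac[2] + 0 != 0)
    $NF = sprintf("%.10f", frac[1] / frac[2])
  print
}' table.tsv > table_lastcol.tsv

cat table_lastcol.tsv
```

Unlike `bc`, which prints values below 1 without a leading zero, `%.10f` always writes the leading zero and a fixed ten decimal places.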