Scientific Notation to Decimal
2
2
7 weeks ago
am29 ▴ 30

Hi,

I have a huge list of p-values ranging from E-05 to E-324. I want to use a tool (Cropper), but it requires decimal numbers as input. How can I convert the values from scientific notation to decimals, all at once?

notation decimal scientific • 698 views
0

What is the tool (Cropper)? What format are the numbers in (text file, Excel, etc.)? What sort of tools are you familiar with (Linux, R, Python, shell)?

0

I have a txt file, and I work in both Linux and R. The tool for which I need decimal-format numbers is Cropper: https://genomics.ut.ee/en/tools

5
7 weeks ago

I'd first question whether your program really doesn't accept scientific notation. If it doesn't, that creates problems for working with these values, and you should consider using a properly implemented program. The following example shows what I mean. In principle, this bash code does the conversion just fine:

while read -r line
do
    # print as fixed-point; change the format to %.308f as the minimum sensible representation
    printf "%.325f\n" "$line"
done < testfile.txt


You need to decide on a sensible precision cutoff; I am using 325 decimal places just to show the problem. With this input in testfile.txt:

1E-5
1E-100
1E-324


Output:

0.00001000000000000000081803053914031309545862313825637102127075195312500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000199918998026028836196477607885341594201826030059365956992555434676176762886132929895827460748109118507985282705397496540222684360419612636083562831412787179427249289424690806658916305930004345786023014502507945

0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000


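Note: 1E-324 comes out as exactly zero here because it is smaller than the smallest positive non-zero double-precision value. The smallest normalized double (DBL_MIN) is about 2.2e-308 on 64-bit platforms **; subnormal values reach down to about 4.9e-324, and 1E-324 is below even that.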

Of course, you could do some trickery with arbitrary-precision floating-point numbers, but you would rather not go there.
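
For what it's worth, if the tool truly insists on a plain decimal string, one way around the double-precision limit is to skip floating point entirely and shift the decimal point in the text itself. The following is only a rough sketch, assuming one value per line written as a mantissa with an optional E exponent; the function name `sci_to_decimal` is made up for illustration, and nothing here has been checked against Cropper.

```shell
# sci_to_decimal: hypothetical helper; reads one value per line on stdin and
# prints it as a plain decimal string by shifting the decimal point in the text.
sci_to_decimal() {
    awk '
    function zeros(n,    z) { z = ""; while (n-- > 0) z = z "0"; return z }
    NF == 0 { next }                      # skip blank lines
    {
        n = split(toupper($1), parts, "E")
        mant = parts[1]
        e = (n > 1) ? parts[2] + 0 : 0    # power of ten; 0 if there is no exponent
        sign = ""
        if (mant ~ /^[-+]/) {             # peel off an explicit sign
            if (substr(mant, 1, 1) == "-") sign = "-"
            mant = substr(mant, 2)
        }
        dot = index(mant, ".")
        if (dot > 0) { ip = substr(mant, 1, dot - 1); fp = substr(mant, dot + 1) }
        else         { ip = mant; fp = "" }
        digits = ip fp                    # all mantissa digits, point removed
        point = length(ip) + e            # where the point belongs within digits
        if (point <= 0)
            out = "0." zeros(-point) digits
        else if (point >= length(digits))
            out = digits zeros(point - length(digits))
        else
            out = substr(digits, 1, point) "." substr(digits, point + 1)
        print sign out
    }'
}

printf '1E-5\n2.5e-3\n' | sci_to_decimal
# prints 0.00001 and 0.0025
```

For a whole file this would be something like `sci_to_decimal < testfile.txt > decimals.txt` (the output file name is just an example). Because the value is never converted to a double, 1E-324 keeps its trailing 1 instead of collapsing to zero.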

** Checked with this C program:

#include <float.h>
#include <stdio.h>

int main(void)
{
    /* smallest positive normalized double; "%.e" prints it with no decimals: 2e-308 */
    printf("%.e\n", DBL_MIN);
    return 0;
}

2

Something to consider: Cropper makes Manhattan plots, so it plots the -log-transformed p-values on the y-axis. It could be that the required input is the transformed values (I don't know), which would be much more reasonable than a decimal value with 324 zeros! :)
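
If the transformed values do turn out to be what is needed, they can be computed from the text without ever forming the tiny number, since -log10(m × 10^e) = -(log10(m) + e). A minimal sketch, assuming base-10 logs (customary for Manhattan plots, but not confirmed for Cropper), strictly positive p-values, and one value per line; `neg_log10` is a hypothetical name.

```shell
# neg_log10: hypothetical helper; reads p-values in scientific notation on stdin
# and prints -log10(p), splitting mantissa and exponent so nothing underflows.
neg_log10() {
    awk '
    NF == 0 { next }                      # skip blank lines
    {
        n = split(toupper($1), parts, "E")
        m = parts[1] + 0                  # mantissa, assumed > 0
        e = (n > 1) ? parts[2] + 0 : 0    # power of ten; 0 if there is no exponent
        # -log10(m * 10^e) = -(log10(m) + e); awk log() is the natural log
        printf "%.4f\n", 0 - e - log(m) / log(10)
    }'
}

printf '1E-5\n1E-324\n' | neg_log10
# prints 5.0000 and 324.0000
```

Here 1E-324 comes out as 324 rather than infinity, which is what a naive -log10 of the underflowed double would give.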

2

Makes perfect sense, but I have no clue about that software. One can make the plots in R, which would be my first choice; even including the number parsing and transformation steps, it will likely be much less of a hassle.

0

Thank you all for the quick and useful answers!

1
7 weeks ago

Microsoft Excel can be useful. You can import the data into it and change the number format.

0

Unfortunately, I have tens of thousands of rows, which exceeds Excel's limit.

2

It's easy to do with awk's printf, see e.g. here
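
The linked example is not reproduced here, but an awk printf conversion could look roughly like the sketch below. It assumes one value per line; the wrapper name `to_fixed` and its default of 20 decimal places are made up for illustration. awk works in double precision, so values below about 1e-308 run into the same limits as the bash printf loop in the other answer.

```shell
# to_fixed: hypothetical wrapper; prints each input value in fixed-point notation
# with the requested number of decimal places (default 20).
to_fixed() {
    awk -v digits="${1:-20}" '
    NF {                                  # skip blank lines
        fmt = "%." digits "f\n"           # e.g. "%.10f\n"
        printf fmt, $1
    }'
}

printf '1E-5\n' | to_fixed 10
# prints 0.0000100000
```

On a file, that would be `to_fixed 308 < testfile.txt`.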