Question: assigning the values in matrix in bash
6 months ago by
smrutimayipanda10 wrote:

I have a matrix consisting of a gene name column and a log fold change column. I want to convert the log fold change values to -1, 1 and 0: if log fold change > 0, it should be 1; if logFC < 0, it should be -1; and if there is no value, it should be NA. How can I assign these values (-1, 1, 0, NA) in this matrix? I am comfortable with bash, so please answer in bash only. Thanks in advance.

bash microarray • 263 views

While this is doable in bash, arithmetic is really not its strong suit, so I would really advise you to get comfortable with a more versatile language such as R or python.

Please also provide some example input data for people to test with at the very least.

modified 6 months ago • written 6 months ago by Joe18k
A3GALT2 9.80e-02  0.295935  3.58e-02
A4GALT 5.58e-02  0.2759222  4.21e-01
A4GNT -5.50e-03  -1.1805802  2.09e-01
AAAS -4.29e-01  -0.122598  2.22e-01
AACS -1.82e-02  -0.0618869  8.14e-02
AADAC 3.22e-02  0.6967785  -4.37e-01
AADACL2 2.97e-02  -1.8886345  -9.67e-03
AADACL3 -1.26e-01  2.3524335  -3.17e-02

This is the test file.

modified 6 months ago by GenoMax95k • written 6 months ago by smrutimayipanda10

Joe, please provide the bash script for scientific notation. I understand your logic.

written 6 months ago by smrutimayipanda10

I've given you a functional skeleton; it shouldn't be hard for you to adapt it to scientific notation.

written 6 months ago by Joe18k

But I didn't understand what you said about the E needing to be replaced with a *10^ string. That is what I am asking about. What should I add to the code?

written 6 months ago by smrutimayipanda10

You should experiment for yourself and try things out. There are answers on StackOverflow for this.

But as I said, you'd be better off in a different language.

written 6 months ago by Joe18k
6 months ago by
Joe18k
United Kingdom
Joe18k wrote:

Question needs more information (such as the format of the input data etc.) as per my comment, but here's the basics of something functional:

#!/bin/bash

# Read comma-separated "gene,value" lines from the file given as $1
while IFS=',' read -r -a array ; do
  if [ -z "${array[1]}" ] ; then
    fc="NA"        # no value present
  elif (( "${array[1]}" > 0 )); then
    fc=1
  elif (( "${array[1]}" < 0 )); then
    fc=-1
  else
    fc=0           # value is exactly zero
  fi
  echo "${array[0]},$fc"
done < "$1"

Assuming an input file like:

$ cat test.csv
Gene1,1
Gene2,10
Gene3,
Gene4,-1
Gene5,-30

bash scriptname.sh test.csv will yield:

Gene1,1
Gene2,1
Gene3,NA
Gene4,-1
Gene5,-1

NOTE:

bash cannot do floating point arithmetic natively (unless you shell out to bc or similar), which is why, if you have floating point data, I would strongly urge using Python or R, as per my comment. This is also why it's important to provide example input data.

Floating point calculations would require something like the following:

#!/bin/bash

# Same loop as before, but each comparison is delegated to bc,
# which prints 1 when the relation holds and 0 otherwise
while IFS=',' read -r -a array ; do
  if [ -z "${array[1]}" ] ; then
    fc="NA"        # no value present
  elif (( $(echo "${array[1]} > 0" | bc) )); then
    fc=1
  elif (( $(echo "${array[1]} < 0" | bc) )); then
    fc=-1
  else
    fc=0           # value is exactly zero
  fi
  echo "${array[0]},$fc"
done < "$1"

Which for the input:

Gene1,1
Gene2,10
Gene3,
Gene4,-1
Gene5,-30
Gene6,1.234
Gene7,0.999999999999999999999999999999999
Gene8,-0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

will give:

Gene1,1
Gene2,1
Gene3,NA
Gene4,-1
Gene5,-1
Gene6,1
Gene7,1
Gene8,-1

As a final comment, this will still not work for values written in scientific notation (1E+10 etc.). For that to work with bc, the E would need to be replaced with a *10^ string.
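Since only the sign is needed here, one way to sidestep the substitution altogether is to look at the mantissa alone: the exponent of a number in scientific notation never changes its sign. The sketch below is a swapped-in technique, not the bc approach described above, and it assumes every value is a well-formed decimal number with an optional e/E exponent. The names sign_of and convert_csv are hypothetical helpers introduced for illustration. (If you do go the *10^ route, note that bc's default scale of 0 makes a negative power such as 10^-2 evaluate to 0, so scale would also need to be set.)

```shell
#!/bin/bash

# sign_of VALUE: print 1, -1, 0 or NA for a number given as a string.
# Hypothetical helper; assumes VALUE is empty or a well-formed number.
sign_of() {
  local value=$1
  if [ -z "$value" ]; then
    printf '%s\n' "NA"                 # missing value
    return
  fi
  # Drop the exponent (e-02, E+10, ...); it never changes the sign.
  local mantissa=${value%%[eE]*}
  # Delete sign, decimal point and zeros; anything left is a non-zero digit.
  local digits=${mantissa//[-+.0]/}
  if [ -z "$digits" ]; then
    printf '%s\n' "0"
  elif [[ $mantissa == -* ]]; then
    printf '%s\n' "-1"
  else
    printf '%s\n' "1"
  fi
}

# convert_csv: read "gene,value" lines on stdin, write "gene,sign" lines.
convert_csv() {
  while IFS=',' read -r gene value _ ; do
    printf '%s,%s\n' "$gene" "$(sign_of "$value")"
  done
}

# Demo on a few made-up rows; for a real file use: convert_csv < test.csv
printf '%s\n' 'Gene1,9.80e-02' 'Gene2,-5.50E-03' 'Gene3,' 'Gene4,0.0e+00' | convert_csv
```

Because the check is plain string handling, it needs no bc subprocess per line, but it will not catch malformed input such as stray text in the value column.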

6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum133k wrote:
$ echo -e "gene1 100.2\ngene2 -100.4\ngene3" | awk '{OFS="\t";V=($2==""?"NA":($2>=0?1:-1)); print $1,V;}' 
gene1   1
gene2   -1
gene3   NA
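The test file posted in the comments has three numeric columns per gene, written in scientific notation, so the same awk idea can be stretched to cover every column. The following is only a sketch under a few assumptions: fields are whitespace-separated, a line with a gene name and nothing else gets a single NA, and an exact zero maps to 0 (the one-liner above maps it to 1). The function name signs_all_columns is hypothetical.

```shell
#!/bin/bash

# signs_all_columns: hypothetical helper. Reads whitespace-separated rows
# on stdin (gene name first, then numeric columns) and replaces every
# numeric column with 1, -1 or 0.
signs_all_columns() {
  awk 'BEGIN { OFS = "\t" }
  NF == 0 { next }                       # skip blank lines
  NF == 1 { print $1, "NA"; next }       # gene name with no values
  {
    for (i = 2; i <= NF; i++) {
      v = $i + 0                         # force a number; 9.80e-02 becomes 0.098
      $i = (v > 0) ? 1 : ((v < 0) ? -1 : 0)
    }
    print
  }'
}

# Demo on three rows taken from the test file posted in the comments
signs_all_columns <<'EOF'
A3GALT2 9.80e-02  0.295935  3.58e-02
AADAC 3.22e-02  0.6967785  -4.37e-01
AADACL3 -1.26e-01  2.3524335  -3.17e-02
EOF
```

awk converts strings such as 9.80e-02 to numbers on its own, so no rewriting of the exponent is needed here.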