Question: How do I write an if-then statement that checks whether a string exists in column 3 of a .csv file?
3.1 years ago, umn_bist wrote:

I have a set of 10 TCGA files: 5 are unique tumor tissues and the other 5 are the matching normal tissues, i.e. {tumor_A, normal_A, tumor_B, normal_B, …, tumor_E, normal_E}.

Each file has a random ID, which is also the name of its directory. I keep an Excel spreadsheet that tracks which file is tumor, which is normal, and which two belong together:

    column A (barcode)    column B (filetype)    column C (file ID, directory name)

    14124421              TP                     1j1iulhkassdalkshdka
    12564122              NT                     900110d109jd109jdasd

    64343343              TP                     01920912i409asdaojoj
    85546455              NT                     901i901i2901i2049i12

    46346464              TP                     0910912091409109klka
    46435435              NT                     091i0dkajakakajkjh2a

So far, I have found that awk, sed, and grep can work with a CSV file. How can I build an if-then statement that does the following?

  • If the directory name exists in the 3rd column of the CSV file, store the string from the 2nd column of the same row in a variable.
  • If this string is TP, store the file inside the directory as $TUMOR, then take the string directly below the directory name (the ID of the matching normal), find that directory, and store the file inside it as $NORMAL.
  • If the string from the 2nd column is not TP, do nothing and move on.

Example

if "1j1iulhkassdalkshdka" exists in column 3 of CSV file
     store string of column 2 of the same row
     if stored string is TP
          store file in ~/foo/bar/1j1iulhkassdalkshdka/ as $TUMOR
          store string right below 1j1iulhkassdalkshdka (which is 900110d109jd109jdasd)
          in ~/foo/bar/${string} assign file inside to $NORMAL
          run my tool using $TUMOR and $NORMAL
          erase $TUMOR and $NORMAL links
Tags: bash, tcga

While you could do this in bash, it'd probably be a bit simpler in python or perl.
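
As a rough illustration of the Python route, here is a minimal sketch. It assumes the spreadsheet has been exported as a plain comma-separated file with the three columns from the question, and that each normal (NT) row sits directly below its tumor (TP) row. The function name `tumor_normal_pairs` is hypothetical, and the sample rows are copied from the question.

```python
import csv
import io

def tumor_normal_pairs(rows):
    """Yield (tumor_id, normal_id) for each TP row directly followed by an NT row."""
    # Keep only rows with at least three cells; this drops blank separator lines.
    rows = [[cell.strip() for cell in r] for r in rows if len(r) >= 3]
    for current, following in zip(rows, rows[1:]):
        if current[1] != "TP":
            continue  # not a tumor row: do nothing and move along
        if following[1] == "NT":
            yield current[2], following[2]

# Sample data copied from the question (assumed CSV export, no header row).
sample = """14124421,TP,1j1iulhkassdalkshdka
12564122,NT,900110d109jd109jdasd

64343343,TP,01920912i409asdaojoj
85546455,NT,901i901i2901i2049i12
"""

pairs = list(tumor_normal_pairs(csv.reader(io.StringIO(sample))))
for tumor_id, normal_id in pairs:
    print(tumor_id, normal_id)
```

With a real file, `csv.reader(open("pairs.csv", newline=""))` would replace the `io.StringIO` sample (the file name is an assumption), and each ID could then be joined onto the base directory, e.g. `~/foo/bar/<ID>`, to locate the tumor and normal files.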
 

written 3.1 years ago by Devon Ryan

I would like to use bash because our HPC cluster uses a SLURM scheduler, and pipelines are run on it via bash scripts.

I am new to Python, so I am wondering how similar bash and Python scripting are. What I like about bash is that the code in a script can also be used directly in the terminal (i.e., it's easy), and I am afraid I would have to recode that.

Would Python/Perl allow me to script using similarly easy commands? Or is it more like C++? (I am comfortable with C++; I'd just prefer not to use such a heavy-duty solution if it isn't required.)

written 3.1 years ago by umn_bist

Python is a more robust language to work with than bash and is generally preferable for scripts longer than about 10 lines. It's similar to bash in that you don't need to compile it beforehand, but it's also a bit closer to C++ in that it supports things like objects and libraries and has a saner structure.
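
To give a feel for how terminal-style commands carry over, here is a short sketch using the standard-library `subprocess` module. The command is only a stand-in (the Python interpreter echoing its arguments), and `run_tool` and the two file names are hypothetical, since the actual tool isn't named in this thread.

```python
import subprocess
import sys

def run_tool(tumor_path, normal_path):
    """Run an external command on a tumor/normal pair and return its output."""
    # Placeholder command: replace this list with the real tool and its options.
    cmd = [
        sys.executable, "-c",
        "import sys; print(' '.join(sys.argv[1:]))",
        tumor_path, normal_path,
    ]
    # check=True raises CalledProcessError if the command exits non-zero.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Hypothetical file names, for illustration only.
output = run_tool("tumor.bam", "normal.bam")
print(output)
```

A script like this can still be launched from a bash script submitted to SLURM, so the scheduler side does not need to change.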

Anyway, you might also want to look at Snakemake. I use it to run jobs with SLURM on our cluster, and it makes pipelining things relatively painless.

written 3.1 years ago by Devon Ryan