Question: TCGA level 3 gene expression dataset
4.2 years ago by
nikkitta.sa10
United States
nikkitta.sa10 wrote:

Hello,

I am looking at Breast cancer gene expression level 3 data on TCGA.

I downloaded a dataset of 139 text files and sifted through some files manually to look for my gene of interest and its expression value. However, I haven't had any luck finding it so far. 

I am thinking I ought to write a script that would give me the gene of interest, its expression value, and the name of the file holding these values.

Is there a better way to go about this? Are there any online tools I could use? I'd appreciate any suggestions, or a discussion to bounce some ideas around.

Please and thank you.

-N

 

ADD COMMENTlink modified 4.2 years ago by dario.garvan430 • written 4.2 years ago by nikkitta.sa10

Have you tried just using grep?

ADD REPLYlink written 4.2 years ago by Devon Ryan88k

Thanks Devon, I tried grep. 

Worked like a charm. 

How do I write my results to a new file? 

I used 

grep -r "GeneName" .

This gave me a list of lines in the form FileName:GeneName ExpressionValue.

Thanks again. I'm working on writing these results to a file for future use.  

 

ADD REPLYlink written 4.2 years ago by nikkitta.sa10

Just use redirection, so grep -r "GeneName" some_file > the_output.txt
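As a minimal sketch of that redirection, the snippet below builds a throwaway directory with two tiny files standing in for the level 3 files. The directory, file names, and values are made up for illustration, and the files are assumed to hold one gene name and one value per line. The output is written outside the searched directory.

```shell
# Hypothetical demo data: two small files standing in for the TCGA level 3 files.
demo=$(mktemp -d)
printf 'FBXL10\t-0.985\nTP53\t0.12\n' > "$demo/sample_a.txt"
printf 'FBXL10\t-1.131\nTP53\t0.40\n' > "$demo/sample_b.txt"

# With more than one input file, grep prefixes each matching line with the file name.
# ">" sends those lines to a file instead of the console.
out=$(mktemp)
grep "FBXL10" "$demo"/*.txt > "$out"

# Show what was saved.
cat "$out"
```

Use ">>" instead of ">" to append to an existing file rather than overwrite it.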

ADD REPLYlink written 4.2 years ago by Devon Ryan88k

I'm looking at the whole dataset as opposed to a single file, and here's part of my output on the console:

./US82800149_251976013065_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.985666666666667
./US82800149_251976013053_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.13183333333333
./US82800149_251976012883_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.796166666666667
./US82800149_251976013058_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.155
./US82800149_251976012925_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.25033333333333
./US82800149_251976012919_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.1105
./US82800149_251976012940_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.405333333333333
./US82800149_251976012950_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.557
./US82800149_251976013090_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.74166666666667
./US82800149_251976013068_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.72983333333333
./US82800149_251976013047_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.02
./US82800149_251976012874_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.997
./US82800149_251976013082_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.573666666666667
./US82800149_251976012867_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.1965
./US82800149_251976013052_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.16416666666667
./US82800149_251976012945_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.549166666666667
./US82800149_251976013076_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -2.30916666666667
./US82800149_251976012908_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.34766666666667
./US82800149_251976012956_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.07033333333333
./US82800149_251976012870_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.51533333333333
./US82800149_251976012913_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -1.0305
./US82800149_251976012879_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.000333333333333338
./US82800149_251976012873_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt:FBXL10    -0.976

 

 

I need to write this to a file.

I do not want to move my files to a different directory, just write this output to a file.

I was looking at this link here: http://alvinalexander.com/unix/edu/examples/grep.shtml

But I've only gotten more confused.

 

 

ADD REPLYlink written 4.2 years ago by nikkitta.sa10

So a full example would be something like

grep FBXL10 *lmean.out.logratio.gene.tcga_level3.data.txt > FBXL10.txt
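A plain grep also matches any longer name that contains the pattern, and its output mixes the file name and gene into one colon-joined field. One possible alternative, sketched here with awk, prints file name, gene, and value as three tab-separated columns and only accepts an exact match on the first column. It assumes each file has the gene name in the first whitespace-separated column and the value in the second; the demo files and the name FBXL10A are hypothetical.

```shell
# Hypothetical demo data; FBXL10A is a made-up name used to show a partial match.
demo=$(mktemp -d)
printf 'FBXL10\t-0.985\nFBXL10A\t0.50\n' > "$demo/sample_a.txt"
printf 'FBXL10\t-1.131\n' > "$demo/sample_b.txt"

# Print file name, gene, and value for lines whose first column is exactly the gene.
table=$(mktemp)
awk -v gene="FBXL10" -v OFS='\t' '$1 == gene { print FILENAME, $1, $2 }' \
    "$demo"/*.txt > "$table"

# Show the resulting three-column table.
cat "$table"
```

If staying with grep is preferred, "grep -w FBXL10" restricts matches to whole words, which also rules out the longer name.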
ADD REPLYlink written 4.2 years ago by Devon Ryan88k

Right. Thanks Devon. 

I was trying to use

grep -r "FBXL10" . > outfile.txt

And I got the message:
grep: input file ‘./outfile.txt’ is also the output

Why is that the case? The "." would ask grep to go through all the subdirectories and files to look for my gene of interest. Why would it consider outfile.txt as input?

 

ADD REPLYlink written 4.2 years ago by nikkitta.sa10

FYI, this is getting rather off-topic, since this is just basic computer usage.

On all Unix-derived systems (Mac OS X, Linux, etc.), "." means the current working directory. The shell creates outfile.txt there before grep even starts, so if you tell grep to recursively search through everything in the directory where the output is, it'll go through the output too. It's actually quite clever that the error is even caught. That's why my example used wildcards. The alternative is to just put the output elsewhere: "> /home/whatever-your-username-is/filename.txt".
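Two ways around this can be sketched as follows, again with a made-up demo directory and file. The first writes the output outside the searched directory. The second keeps outfile.txt in place but skips it with grep's --exclude option, which GNU and BSD grep both provide; that it suits this particular setup is an assumption.

```shell
# Hypothetical demo data standing in for the real dataset directory.
demo=$(mktemp -d)
printf 'FBXL10\t-0.985\n' > "$demo/sample_a.txt"

# Option 1: search "." recursively, but write the output somewhere else.
out=$(mktemp)
( cd "$demo" && grep -r "FBXL10" . > "$out" )

# Option 2: keep the output in the searched directory and exclude it from the search.
( cd "$demo" && grep -r --exclude="outfile.txt" "FBXL10" . > outfile.txt )

# Both files now hold the same single match.
cat "$out" "$demo/outfile.txt"
```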

ADD REPLYlink written 4.2 years ago by Devon Ryan88k

Thanks a lot. Appreciate it. 

ADD REPLYlink written 4.2 years ago by nikkitta.sa10

Right, so just put  > some_file.txt at the end of your command as I did in my example.

ADD REPLYlink written 4.2 years ago by Devon Ryan88k
4.2 years ago by
dario.garvan430
Australia
dario.garvan430 wrote:

TCGA-Assembler imports all of the data you want into R and makes a convenient table which you can use in other analyses.
 

ADD COMMENTlink written 4.2 years ago by dario.garvan430

Thanks Dario,

I'll look into it..

ADD REPLYlink written 4.2 years ago by nikkitta.sa10