TCGA gene expression
8.8 years ago
fatima • 0

Hi

In my project we must use TCGA gene expression data, so I downloaded one batch as a zip file, but I don't know how to use it. I need the gene expression data so that I can convert it to binary code.

Could you help me?

genome • 3.7k views

Can you clarify your problem and question? Where are you getting the TCGA data from--the Data Portal, Firehose data? I think you are saying you have a zip file that you downloaded. Have you unzipped it? What files do you have when you unzip it? Why do you want to change it to binary? What is your end goal? It kind of sounds like a class project from the current description.


I downloaded one batch from https://tcga-data.nci.nih.gov/tcga/dataAccessMatrix.htm?mode=ApplyFilter&diseaseType=COAD. It is a zip file containing expression-genes, METADATA, file_annotation..., file-manifest.txt, file_sampel_M..., readme... I extracted the expression-genes file but I don't know how to use it.

I want to convert it to a binary code for whether each gene is expressed or not, then apply the GIMME algorithm to it, and then compare the biochemical pathways and fluxes of the cancer patients.


What are the options that you have selected for downloading the data? What kind of data is it? RNASeqV2?

If it is RNASeqV2 gene expression data, then after unzipping the file you will get a sub-directory RNASeqV2. Inside it you will find files named with barcodes. Files with the extension rsem.genes.normalized_results contain gene names and their normalized counts. You could use these. Please elaborate on your query if this is not what you are looking for.
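For anyone following along, here is a minimal Python sketch of that step: unpack the downloaded archive, collect the files ending in rsem.genes.normalized_results, and read one into a dictionary. The function names are hypothetical, and the file layout (one header line, then tab-separated gene identifier and normalized count) is an assumption to check against your own files.

```python
import csv
import zipfile
from pathlib import Path

# Extension named in the reply above.
SUFFIX = "rsem.genes.normalized_results"


def extract_archive(zip_path, out_dir):
    """Unpack the archive and return the per-sample result files it contains."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    # Search recursively, because the files sit inside a sub-directory.
    return sorted(p for p in Path(out_dir).rglob("*") if p.name.endswith(SUFFIX))


def read_normalized_counts(path):
    """Return {gene_id: normalized_count} for one sample file.

    Assumes one header line followed by two tab-separated columns.
    """
    counts = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header line
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            counts[row[0]] = float(row[1])
    return counts
```

Calling `extract_archive("batch.zip", "unzipped")` (both names are placeholders) gives a list of paths, and each one can be passed to `read_normalized_counts`.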


Thanks for the reply, but my option was gene expression only. I need gene expression for flux analysis, so I selected only the gene expression option, with all batches and levels. Am I right?

How do I use the unzipped file and convert it to binary so that I can integrate it into MATLAB?


This file is a Level 1 gene expression microarray file. It is unprocessed data. If you want the gene expression levels already processed then you need to look at the Level 3 data (under the "3" column of the data matrix). The Level 3 file associated with the one you posted above is named "US82800149_251976011004_S01_GE2_105_Dec08.txt_lmean.out.logratio.gene.tcga_level3.data.txt". And it looks like the following:

Hybridization REF    TCGA-A6-2670-01A-02R-0821-07
Composite Element REF    log2 lowess normalized (cy5/cy3) collapsed by gene symbol
ELMO2    0.587583333333333
CREB3L1    0.5685
RPS11    0.80825
PNMA1    -1.66125
MMP2    1.0735 
..

Please look at this file and let us know if you are still confused. Also, please read up on TCGA data levels so you can better navigate the files. An explanation of TCGA Data Levels can be found here (https://tcga-data.nci.nih.gov/tcga/tcgaDataType.jsp).
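As a rough illustration of how the Level 3 layout shown above can be read, here is a small Python sketch. It assumes the columns are tab-separated and that the first two rows are the "Hybridization REF" and "Composite Element REF" headers, as in the excerpt. `parse_level3` is a hypothetical helper name, and skipping non-numeric values is an assumption about how missing entries might appear.

```python
def parse_level3(lines):
    """Parse two header rows followed by gene/value rows.

    Returns (sample_barcode, {gene_symbol: log_ratio}).
    """
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    barcode = rows[0][1]  # second field of the "Hybridization REF" row
    values = {}
    for row in rows[2:]:  # rows[1] only describes the measurement
        if len(row) != 2:
            continue  # not a gene/value pair
        gene, value = row
        try:
            values[gene] = float(value)
        except ValueError:
            continue  # assumed: non-numeric entries mark missing values
    return barcode, values


# The excerpt from the post, rewritten with explicit tabs.
sample = (
    "Hybridization REF\tTCGA-A6-2670-01A-02R-0821-07\n"
    "Composite Element REF\tlog2 lowess normalized (cy5/cy3) collapsed by gene symbol\n"
    "ELMO2\t0.587583333333333\n"
    "CREB3L1\t0.5685\n"
    "RPS11\t0.80825\n"
    "PNMA1\t-1.66125\n"
    "MMP2\t1.0735\n"
)
barcode, log_ratios = parse_level3(sample.splitlines())
```

For a real file, passing the open file handle (for example `parse_level3(open(path))`) should work the same way, since the function only iterates over lines.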


Thanks alolex. I tried this link but could not connect to it.


Try it now. I think I corrected it.


Sorry, how do I convert the Level 3 data to binary code?

I have many problems. Thanks for giving me your time and replying to my beginner question.


I'm not sure what you are really trying to do or accomplish by changing it to binary as most programs can process tab-delimited format just fine. I'm thinking my interpretation of what you are saying is not what you mean. Can you elaborate some? What program are you trying to use, are you trying to do differential expression analysis, what is the biological question you are asking of the data?


My project is a comparison of biochemical pathways of colorectal cancer patients, using the tools of systems biology and identifying biochemical pathway gaps.

So I need TCGA data and GIMME to create a model for colorectal cancer. Before building the model, I probably have to convert the gene expression data of 270 patients to binary code, and then integrate it into MATLAB with the COBRA package installed ...


Thanks alolex, you have helped me very much. My teacher is like you and said to go and find out :)

Maybe I must use a cutoff for the labeling ...

I would be pleased to connect with you on LinkedIn.

Fatemeh Nikmanesh


You are on the right track. Check out this paper for ideas on what to do. You might not need to do anything this complicated but hopefully it will give you some more terms to search for and additional references to read.


Thanks for your reply ...

8.8 years ago
alolex ▴ 950

Oh, now I understand what you are doing, and it is VERY obvious that this is a class assignment, an independent study problem, or something similar. You really need to talk to your advisor/teacher, or even a classmate, about any problems you are having with this assignment. Secondly, you are trying to create a mathematical model of a metabolic pathway, which is computational systems biology work. This forum is for bioinformatics, not computational systems biology, which differs from bioinformatics in many ways.

On the conversion to binary code, I figured out what you were trying to do when I found your other post here. Because I think this is an assignment I won't give you a straight-up answer, but rather ask you to think about it in this way: You have a set of continuous data and you want to convert it to 0 and 1, or rather think of the 0 and 1 as "OFF" and "ON", respectively. You need to figure out how to label the genes as "OFF" and "ON" in your system.

Good luck with your project, and please contact your instructor. While I have created mathematical models in the past I have no experience in the specific tools you list (GIMME, COBRA), and currently don't have access to MATLAB.
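For later readers, the ON/OFF labeling described above amounts to thresholding. The Python sketch below only illustrates the mechanics. The `binarize` helper is hypothetical and the cutoff of 0.0 is a placeholder, not a recommendation. Choosing a defensible cutoff is the actual question, and it depends on the data type (here, log ratios relative to a reference) and on what the downstream tool expects.

```python
def binarize(values, cutoff):
    """Label each gene 1 ("ON") if its value is at or above the cutoff, else 0 ("OFF")."""
    return {gene: int(value >= cutoff) for gene, value in values.items()}


# Values copied from the Level 3 excerpt earlier in the thread.
log_ratios = {
    "ELMO2": 0.587583333333333,
    "CREB3L1": 0.5685,
    "RPS11": 0.80825,
    "PNMA1": -1.66125,
    "MMP2": 1.0735,
}

CUTOFF = 0.0  # placeholder assumption; pick and justify your own
states = binarize(log_ratios, CUTOFF)
```

With this placeholder cutoff, only PNMA1 is labeled 0. A different cutoff will change the labels, which is why the choice needs a justification from the project itself.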
