Question: select DE genes
5 weeks ago, mms14013050 wrote:

Hello,

I have normalized gene expression data for 1095 patients with breast cancer. Part of the data is shown below:

    Gene       patient1   patient2   patient3   patient4   patient5   patient6
    AASS        80.8588   135.3218   158.7152    20.7441    187.836   126.2016
    AATF      2012.3344   990.1661   727.4445  1498.5344  1329.6371
    AATK        179.534   209.8278    35.4275    13.5558    99.1263    51.2694
    ABAT      2086.3408    77.9285   600.3779   101.0147  1564.1801   1439.816
    ABCA11P     79.5614    91.9899   152.9806    65.3844    36.7641    85.9551
    ABCA12      43.8556     1.0531     37.317     10.823    82.3253     4.9298
    ABCA13      21.9278     2.6327     0.9447     0.9019      0.672          0

Can I use this data to find differentially expressed genes? Which R package can I use to get the DE genes?

rna-seq R gene • 229 views
modified 5 weeks ago by TriS2.7k • written 5 weeks ago by mms14013050

So you want to find differentially expressed genes, but you only have one group. Do you even know what differential expression means?

In a differential expression analysis you want to compare the expression of one group (e.g. patients) with another group (e.g. healthy controls). You want to find out which genes are differentially expressed (over- or underexpressed) in patients versus the control group.
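To make the two-group idea concrete, here is a minimal sketch for a single gene. It is written in Python with the standard library purely for illustration; the expression values and the function names (`to_log2`, `log2_fold_change`, `welch_t`) are made up for this example and do not come from the thread. A real analysis would use a dedicated package and correct for testing thousands of genes.

```python
import math
from statistics import mean, variance


def to_log2(values, pseudocount=1.0):
    """Return log2(x + pseudocount) for each expression value."""
    return [math.log2(v + pseudocount) for v in values]


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """Mean log2 expression in group_a minus mean log2 expression in group_b."""
    return mean(to_log2(group_a, pseudocount)) - mean(to_log2(group_b, pseudocount))


def welch_t(group_a, group_b, pseudocount=1.0):
    """Welch's t statistic on log2 values (needs at least 2 samples per group)."""
    a = to_log2(group_a, pseudocount)
    b = to_log2(group_b, pseudocount)
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical expression values for ONE gene; not taken from the thread.
tumor = [180.0, 210.0, 150.0, 240.0]
normal = [40.0, 55.0, 35.0, 60.0]

lfc = log2_fold_change(tumor, normal)
t_stat = welch_t(tumor, normal)
print(f"log2 fold change (tumor vs normal): {lfc:.2f}")
print(f"Welch t statistic: {t_stat:.2f}")
```

A positive fold change means the gene is higher in the first group. With only one group (all patients), there is nothing to put in the second argument, which is why the question as posed cannot be answered.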

written 5 weeks ago by WouterDeCoster17k

Sorry, I'm still learning how to analyze genomic data. I understand now that DE should be between two groups (normal, tumor). So I guess my question should be: how do I visualize the distribution of 20,000 genes in breast cancer patients?

written 5 weeks ago by mms14013050

What is the aim of your analysis? What is the biological question you are trying to solve?

written 5 weeks ago by WouterDeCoster17k

I'm trying to see whether there is an association between gene expression and genotype (SNP) data. One assumption of regression is normality, so I'm trying to find a way to visualize the distribution of the gene expression data.

written 5 weeks ago by mms14013050

association between gene expression and genotype snp data

That would be an eQTL analysis. You may want to have a look at this tutorial.
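As a rough illustration of what an eQTL test does for one gene and one SNP, the sketch below regresses log2(TPM + 1) on genotype dosage (0, 1 or 2 copies of the alternate allele) by ordinary least squares. It is plain Python for illustration only; the function name `simple_eqtl_fit` and the data are hypothetical, and this is not the method of any particular eQTL package, which would also handle covariates and multiple testing.

```python
import math
from statistics import mean


def simple_eqtl_fit(dosage, tpm, pseudocount=1.0):
    """Least-squares fit of log2(tpm + pseudocount) on genotype dosage.

    Returns (slope, intercept, residuals). The residuals are what the
    normality assumption of the regression refers to.
    """
    y = [math.log2(v + pseudocount) for v in tpm]
    x_bar, y_bar = mean(dosage), mean(y)
    sxx = sum((x - x_bar) ** 2 for x in dosage)
    sxy = sum((x - x_bar) * (yi - y_bar) for x, yi in zip(dosage, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * x) for x, yi in zip(dosage, y)]
    return slope, intercept, residuals


# Hypothetical data for six patients; not taken from the thread.
dosage = [0, 0, 1, 1, 2, 2]
tpm = [20.0, 25.0, 48.0, 60.0, 110.0, 130.0]

slope, intercept, residuals = simple_eqtl_fit(dosage, tpm)
print(f"slope per allele copy (log2 scale): {slope:.2f}")
print(f"intercept: {intercept:.2f}")
```

The slope is the estimated change in log2 expression per copy of the allele. Plotting the residuals (a histogram or Q-Q plot) is one way to check the regression assumption.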

written 5 weeks ago by WouterDeCoster17k

Does the package apply the normalizing transform (log2(x+1)), or do I have to do that myself before running the eQTL analysis? It uses regression, so we have to validate the assumptions.

written 5 weeks ago by mms14013050

Asking for diff expressed genes was probably not the right question here.

Are you interested in classifying these breast cancer samples into sub-types? Like in this paper?

modified 5 weeks ago • written 5 weeks ago by genomax28k

DEGs between individual patients or groups (cancer vs normal)?

written 5 weeks ago by mbk0asis200

What I want is to find the DE genes and visualize their distribution, so I think the comparison should be between phenotypes, but I'm not sure.

What is the difference between DE genes between samples and DE genes between groups (cancer, normal)?

modified 5 weeks ago • written 5 weeks ago by mms14013050

I think you need to think about a good research question first, before you can get some good answers.

written 5 weeks ago by b.nota3.0k

Well, I'm new to biology and genetics, and I'm trying my best.

written 5 weeks ago by mms14013050

I wasn't trying to dis you or put you down or anything. It's just that I see many scientists who are new to bioinformatics expecting to get answers without formulating a research question first. Bioinformatics is just like any other science: you need to define a research question first.

What is your goal with your data set? Know the differences between subsets of cancer? Or the difference between cancer and healthy? Etc.

written 5 weeks ago by b.nota3.0k

Are these FPKM values?

written 5 weeks ago by poisonAlien2.2k

They are TPM values

written 5 weeks ago by mms14013050
5 weeks ago, TriS2.7k (United States, Buffalo) wrote:

Once you have a clear idea of how you want to compare your groups and have done a good Google search on how to do it, this is one of my favorite tutorials from Bioconductor:

https://www.bioconductor.org/help/workflows/RNAseq123/

However, it requires some knowledge of R.

Also, don't forget to look at the BioStar Handbook.

modified 5 weeks ago • written 5 weeks ago by TriS2.7k
5 weeks ago, mbk0asis200 (Korea, Republic Of) wrote:

If all of your samples have the same phenotype (cancer for example), why would you want to find DEGs?

You should have at least one control sample to compare with.

Or do you want to see the overall expression pattern of your data?

In that case, you are going to need a lot of computer power.

Data with ~1,000 samples x ~30,000 genes is too big to run on an ordinary PC.

written 5 weeks ago by mbk0asis200

I was trying to get DE genes because I need to visualize the distribution of gene expression, and there are 20,000 genes for 1095 cancer patients. So maybe I can reduce the number of genes to get a good plot. Is my thinking correct? Please let me know.

modified 5 weeks ago • written 5 weeks ago by mms14013050
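One way to reduce the number of genes before plotting, without needing a second group, is to keep only the most variable genes. The sketch below ranks genes by the variance of log2(TPM + 1) across patients, using the rows from the question that show six values (AATF is left out because only five of its values appear in the excerpt). It is Python for illustration only; the function name `top_variable_genes` and the cutoff `n` are assumptions, not something proposed in the thread.

```python
import math
from statistics import variance


def log2_variance(values, pseudocount=1.0):
    """Sample variance of log2(x + pseudocount) across patients."""
    return variance([math.log2(v + pseudocount) for v in values])


def top_variable_genes(expression, n):
    """Return the n gene names with the largest log2-scale variance."""
    ranked = sorted(expression, key=lambda g: log2_variance(expression[g]), reverse=True)
    return ranked[:n]


# TPM values copied from the excerpt in the question (six patients each).
expression = {
    "AASS": [80.8588, 135.3218, 158.7152, 20.7441, 187.836, 126.2016],
    "AATK": [179.534, 209.8278, 35.4275, 13.5558, 99.1263, 51.2694],
    "ABAT": [2086.3408, 77.9285, 600.3779, 101.0147, 1564.1801, 1439.816],
    "ABCA11P": [79.5614, 91.9899, 152.9806, 65.3844, 36.7641, 85.9551],
    "ABCA12": [43.8556, 1.0531, 37.317, 10.823, 82.3253, 4.9298],
    "ABCA13": [21.9278, 2.6327, 0.9447, 0.9019, 0.672, 0.0],
}

top = top_variable_genes(expression, n=3)
for gene in top:
    print(gene, round(log2_variance(expression[gene]), 2))
```

The selected genes could then go into a heatmap, boxplots or a density plot. Note that this picks genes that vary among patients; it is not a differential expression test.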