RNAseq data and PAM50 method
9.9 years ago
mriera.pique ▴ 30

Hello everyone,

I'm currently working on my master's final project, analysing RNA-seq data from breast cancer. I need to find groups in the data and, after trying different clustering methods, I would like to try the PAM50 method.

I found many papers that discuss PAM50, as well as the documentation of the "genefu" package, but I can't find any protocol that explains how to perform a PAM50 analysis.

If someone knows where I could find one, please let me know.

Thank you so much,

Maria

rna-seq PAM50 Breast Cancer unsupervised • 7.9k views

Hi Maria, did you solve the problem you had 5 months ago? If so, would you share how? Thanks :)


I also ran into the same issue:

I have a gene expression matrix with samples as rows and genes (symbols, not probes) as columns.

I created an annotation matrix with the gene symbols and the EntrezGene_ID.

When running the intrinsic.cluster function on my data I get:

no probe in common -> annot or mapping parameters are necessary for the mapping process

Looking at the function code, this originates from:

> all(!is.element(dimnames(agilentData)[[2]], pam50))
[1] TRUE

This should be FALSE, as there are genes from PAM50 present in my data, as can be seen this way:

> all(!is.element(dimnames(agilentData)[[2]], rownames(pam50$centroids)))
[1] FALSE

Any help would be greatly appreciated.
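
A note on the two checks quoted in this post: is.element() against the whole pam50 object compares the column names with the top-level elements of a list, so it reports no overlap even when the genes are present. The gene symbols are the row names of the centroid matrix. A minimal sketch with a made-up stand-in for pam50 (pam50_toy, expr and all values are hypothetical, not genefu data):

```r
# Toy stand-in for the genefu `pam50` object: a list whose `centroids`
# matrix has gene symbols as row names (values are made up).
pam50_toy <- list(
  centroids = matrix(
    c(0.5, -0.2, 0.1, -0.4, 0.3, 0.0),
    nrow = 3,
    dimnames = list(c("ACTR3B", "ANLN", "BAG1"), c("subtypeA", "subtypeB"))
  ),
  method.cor = "spearman"
)

# Hypothetical expression matrix: samples in rows, gene symbols in columns.
expr <- matrix(
  1:8, nrow = 2,
  dimnames = list(c("s1", "s2"), c("ACTR3B", "BAG1", "TP53", "MYC"))
)

# Comparing against the whole list finds nothing, because the gene symbols
# are not top-level elements of the list.
no_overlap_with_list <- all(!is.element(colnames(expr), pam50_toy))

# Comparing against the row names of the centroid matrix finds the overlap.
shared_genes <- intersect(colnames(expr), rownames(pam50_toy$centroids))

print(no_overlap_with_list)
print(shared_genes)
```

If shared_genes comes back empty on real data, the identifiers themselves (symbols versus probe IDs, or rows versus columns) are the next thing to check.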


Please post this as a new question.

9.0 years ago
zamalloa ▴ 40

In case anyone is still looking for an answer: I had the same problem and was able to fix it.

First, make sure you have an annotation dataset. These can be found in Bioconductor (e.g., annot.nkis, org.Hs.egALIAS2EG). Then make sure that this annotation dataset has a column named "EntrezGene.ID" (add it if it is not already there), plus whatever other columns are present in the dataset, such as gene_name.

Now, if you are doing predictions (such as for PAM50), make sure you are using intrinsic.cluster.predict() and not intrinsic.cluster().

In the end you should have three objects for your predictions: the pam50 model found in genefu (data(pam50)); your matrix, which SHOULD HAVE SAMPLES IN ROWS AND GENES IN COLUMNS (this seems to be the most confusing part); and lastly the annotation data.frame with a column named "EntrezGene.ID".

Hope it helps

Jose
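
The three objects described in this answer can be sketched as follows. This is only an illustration of the input shapes, under stated assumptions: the four genes and Entrez IDs are the ones quoted elsewhere in this thread, the expression values are random, and a real run needs the full expression matrix. Some genefu versions appear to read probe names from a "probe" column of the annotation rather than from its row names; that is an assumption here, so the sketch supplies both. The genefu call runs only if the package is installed.

```r
# Hypothetical toy inputs: genes and Entrez IDs as quoted in this thread,
# expression values random (for shape only, not a meaningful prediction).
genes  <- c("ACTR3B", "ANLN", "BAG1", "BCL2")
entrez <- c(57180, 54443, 573, 596)

set.seed(1)
# Expression matrix: SAMPLES IN ROWS, GENES IN COLUMNS.
expr <- matrix(
  rnorm(6 * length(genes)), nrow = 6,
  dimnames = list(paste0("sample", 1:6), genes)
)

# Annotation: one row per column of `expr`, row names matching those
# columns. The "probe" column is an assumption (see the lead-in).
annot <- data.frame(
  probe = genes,
  EntrezGene.ID = entrez,
  row.names = genes,
  stringsAsFactors = FALSE
)

# Sanity checks before calling genefu.
stopifnot(
  is.matrix(expr),
  "EntrezGene.ID" %in% colnames(annot),
  identical(rownames(annot), colnames(expr))
)

# Run the prediction only when genefu is available.
if (requireNamespace("genefu", quietly = TRUE)) {
  data("pam50", package = "genefu")
  res <- genefu::intrinsic.cluster.predict(
    sbt.model = pam50, data = expr, annot = annot, do.mapping = TRUE
  )
  print(table(res$subtype))
}
```

With do.mapping = TRUE the genes are matched through their Entrez IDs, which is why the "EntrezGene.ID" column matters.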

9.9 years ago

Did you read the vignette for the genefu package, paying particular attention to the intrinsic.cluster() function and its intrinsicg option?


Thank you so much for the information. I'm trying to run the intrinsic.cluster() function as you suggested, and I get the following error:

Error in intrinsic.cluster(data = counts, annot = annot, do.mapping = FALSE,  :
  no probe in common -> annot or mapping parameters are necessary for the mapping process!

where annot (a matrix of annotations with at least one column named "EntrezGene.ID", with dimnames properly defined) is:

                EntrezGene.ID
ACTR3B          57180
ANLN            54443
BAG1              573
BCL2              596          [...]

and intrinsicg:

        probe   EntrezGene.ID
ACTR3B  ACTR3B          57180
ANLN    ANLN            54443
BAG1    BAG1              573
BCL2    BCL2              596     [...]

Both were obtained from the PAM50 data.

I cannot find what is wrong.

Thanks for your help.

Maria
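
One possible cause of this error, which the thread does not confirm for this case, is the orientation of the matrix: RNA-seq count tables usually have genes in rows, while a later answer in this thread stresses that genefu expects samples in rows and genes in columns. A minimal base R sketch of a check (orient_genes_in_columns, counts and signature_genes are hypothetical names and toy values):

```r
# Hypothetical helper: check whether gene identifiers sit in the columns
# or in the rows, and transpose when they are in the rows.
orient_genes_in_columns <- function(x, genes) {
  in_cols <- sum(colnames(x) %in% genes)
  in_rows <- sum(rownames(x) %in% genes)
  if (in_cols == 0 && in_rows == 0) {
    stop("none of the gene identifiers match the row or column names")
  }
  if (in_rows > in_cols) t(x) else x
}

# Toy count matrix in the usual RNA-seq layout: genes in rows.
counts <- matrix(
  c(10, 0, 5, 7, 3, 8, 2, 9), nrow = 4,
  dimnames = list(c("ACTR3B", "ANLN", "BAG1", "BCL2"), c("s1", "s2"))
)
signature_genes <- c("ACTR3B", "ANLN", "BAG1", "BCL2")

fixed <- orient_genes_in_columns(counts, signature_genes)
print(dim(fixed))       # samples x genes after the fix
print(colnames(fixed))
```

If neither the row names nor the column names match, the helper stops, which points to an identifier mismatch rather than an orientation problem.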


I made this an answer because the OP asked for a "protocol" for PAM50, and a vignette is as close as it gets to a protocol.

8.2 years ago

I'm also using the genefu package to do the PAM50 prediction from RNA-seq data. My question is: should I use raw data (read counts), or should I transform my data prior to the prediction?
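
The thread leaves this question unanswered. As a hedged illustration only, not a recommendation: if the classifier compares each sample to the centroids by Spearman correlation (the method is an assumption here; the model object should be checked), then a monotone within-sample transform such as log2(CPM + 1) does not change that correlation, because it preserves the ranks of the genes within a sample. Other preprocessing choices, such as gene-length normalization or centering across samples, can change the result and are not covered. All values below are made up.

```r
# Toy raw counts: samples in rows, genes in columns (made-up values).
counts <- matrix(
  c(120, 30, 0, 560, 15,
     80, 45, 2, 300, 60),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("s1", "s2"), paste0("gene", 1:5))
)

# log2 counts-per-million with a pseudocount of 1.
# Dividing by rowSums() scales each sample (row) by its library size.
cpm    <- counts / rowSums(counts) * 1e6
logcpm <- log2(cpm + 1)

# Made-up centroid for one subtype over the same five genes.
centroid <- c(gene1 = 0.8, gene2 = -0.9, gene3 = -1.2,
              gene4 = 1.5, gene5 = -0.6)

# Rank correlation of sample s1 with the centroid, before and after.
rho_raw <- cor(counts["s1", ], centroid, method = "spearman")
rho_log <- cor(logcpm["s1", ], centroid, method = "spearman")

print(c(raw = rho_raw, log_cpm = rho_log))
```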


Please ask this as a new question.

6.9 years ago
ddjima2014 • 0

Can you help?

I have a set of signature genes, similar to PAM50, from another tumor type. How should I use the intrinsic.cluster() or intrinsic.cluster.predict() function in the genefu package to run a nearest-centroid classifier?

I ran intrinsic.cluster() with the 3 centroids and the signature genes, but I got cluster.1, cluster.2 and cluster.3. How do I associate these clusters with the centroid subtypes in my signature?

Data used for intrinsic.cluster():

  1. Expression matrix: 700 samples x 20000 genes (genes as columns)
  2. Annotation: 20000 genes x 3 columns
  3. Annotation of the 500 signature genes.

I don't know how to incorporate the predefined subtypes I have.

I would appreciate your input.
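
As discussed earlier in this thread, intrinsic.cluster() fits new clusters, while assigning samples to predefined centroids is a prediction step. The idea behind that step can be sketched in base R: correlate each sample with each centroid and take the name of the best match, so the labels come from the column names of the centroid matrix. This is a simplified stand-in, not genefu's implementation; nearest_centroid and all data are hypothetical. Building a custom model object for intrinsic.cluster.predict() is not shown, since the fields it requires are not confirmed in this thread.

```r
# Minimal nearest-centroid classifier in base R (not genefu's code).
# expr:      samples in rows, genes in columns
# centroids: genes in rows, one named column per predefined subtype
nearest_centroid <- function(expr, centroids, method = "spearman") {
  shared <- intersect(colnames(expr), rownames(centroids))
  if (length(shared) < 3) stop("too few genes shared with the centroids")
  # Correlate every sample (as a column) with every centroid.
  rho <- cor(t(expr[, shared, drop = FALSE]),
             centroids[shared, , drop = FALSE], method = method)
  # Label each sample with the name of its best-correlated centroid.
  subtype <- colnames(centroids)[apply(rho, 1, which.max)]
  names(subtype) <- rownames(expr)
  list(subtype = subtype, cor = rho)
}

# Hypothetical centroids for three subtypes over four genes.
centroids <- matrix(
  c( 2,  1, -1, -2,
    -2, -1,  1,  2,
     1, -2,  2, -1),
  nrow = 4,
  dimnames = list(paste0("g", 1:4), c("typeA", "typeB", "typeC"))
)

# Two toy samples that follow the typeA and typeC patterns.
expr <- rbind(
  s1 = c(g1 = 9, g2 = 6, g3 = 3, g4 = 1),
  s2 = c(g1 = 5, g2 = 1, g3 = 8, g4 = 2)
)

res <- nearest_centroid(expr, centroids)
print(res$subtype)
```

Because the subtype names are taken from colnames(centroids), naming the centroid columns is what ties each prediction back to a predefined subtype.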
