Question: How to get ENTREZ ID and SYMBOL to my gene notation in Saccharomyces cerevisiae
cesarihv7 wrote:

I have these genes from Saccharomyces cerevisiae:

genes <- c("YAL002W", "YAL003W","YAL004W", "YAL005C","YAL007C","YAL008W","YAL009W","YAL010C", "ETS1-1", "ETS1-2","ETS2-1","ETS2-2", "HRA1", "ICR1", "IRT1", "ITS1-1")

and I have tried this

my.simbols <- genes

sc <- org.Sc.sgd.db
select(sc, keys = my.simbols, columns = c("ENTREZID", "SYMBOL", "GENEID"), keytype = "SYMBOL")

and this is the error output:

Error in testForValidKeytype(x, keytype) : Invalid keytype: SYMBOL. Please use the keytypes method to see a listing of valid arguments.

rna-seq • 196 views
modified 4 weeks ago • written 4 weeks ago by cesarihv7

Many thanks, SMK and ricket.woo. I think I need to read the org.Sc.sgd.db manual, because as far as I can see this is a basic problem. However, in columns(org.Sc.sgd.db) I can't see a transcript length option. Can you help me get the transcript length?

written 4 weeks ago by cesarihv7
SMK (Ghent, Belgium) wrote:

Hi cesarihv7,

I think you can use:

select(
  sc,
  keys = my.simbols,
  columns = c("ENTREZID", "GENENAME", "SGD")
)

Which returns:

> select(
+   sc,
+   keys = my.simbols,
+   columns = c("ENTREZID", "GENENAME", "SGD")
+ )
'select()' returned 1:1 mapping between keys and columns
       ORF ENTREZID GENENAME        SGD
1  YAL002W   851261     VPS8 S000000002
2  YAL003W   851260     EFB1 S000000003
3  YAL004W     <NA>     <NA> S000002136
4  YAL005C   851259     SSA1 S000000004
5  YAL007C   851226     ERP2 S000000005
6  YAL008W   851225    FUN14 S000000006
7  YAL009W   851224     SPO7 S000000007
8  YAL010C   851223    MDM10 S000000008
9   ETS1-1  9164941   ETS1-1 S000029717
10  ETS1-2  9164933   ETS1-2 S000029707
11  ETS2-1  9164936   ETS2-1 S000029718
12  ETS2-2  9164942   ETS2-2 S000029713
13    HRA1  9164866     HRA1 S000119380
14    ICR1  9164906     ICR1 S000132612
15    IRT1 23547381     IRT1 S000178119
16  ITS1-1  9164938   ITS1-1 S000029715

And you can use columns(org.Sc.sgd.db) to check what fields are available.

written 4 weeks ago by SMK

Many thanks, SMK. I think I need to read the org.Sc.sgd.db manual, because as far as I can see this is a basic problem. However, in columns(org.Sc.sgd.db) I can't see a transcript length option. Can you help me get the transcript length?

written 4 weeks ago by cesarihv7

You can try biomaRt:

> library("biomaRt")
> ensembl <- useMart("ensembl", dataset = "scerevisiae_gene_ensembl")
> getBM(
+   attributes = c("ensembl_gene_id", "transcript_length", "external_gene_name", "entrezgene"),
+   filters = "ensembl_gene_id",
+   values = genes,
+   mart = ensembl
+ )
   ensembl_gene_id transcript_length external_gene_name entrezgene
1           ETS1-1               700                            NA
2           ETS1-2               700                            NA
3           ETS2-1               211                            NA
4           ETS2-2               211                            NA
5             HRA1               564                            NA
6             ICR1              3199                            NA
7             IRT1              1489                            NA
8           ITS1-1               361                            NA
9          YAL002W              3825               VPS8     851261
10         YAL003W               621               EFB1     851260
11         YAL004W               648                            NA
12         YAL005C              1929               SSA1     851259
13         YAL007C               648                        851226
14         YAL008W               597                        851225
15         YAL009W               780                        851224
16         YAL010C              1482              MDM10     851223

Available attributes can be shown using listAttributes(ensembl).

modified 4 weeks ago • written 4 weeks ago by SMK
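To get the IDs and the transcript length in one table, the two results can be joined on the ORF name. This is only a minimal base R sketch: the data frames `annot` and `tx_len` are hypothetical names, filled by hand with a few rows copied from the select() and getBM() outputs shown in this thread, so it runs without either package.

```r
# A few rows copied from the select() output earlier in the thread.
annot <- data.frame(
  ORF      = c("YAL002W", "YAL003W", "YAL005C"),
  ENTREZID = c("851261", "851260", "851259"),
  GENENAME = c("VPS8", "EFB1", "SSA1"),
  stringsAsFactors = FALSE
)

# The matching rows copied from the getBM() output above.
tx_len <- data.frame(
  ensembl_gene_id   = c("YAL002W", "YAL003W", "YAL005C"),
  transcript_length = c(3825L, 621L, 1929L),
  stringsAsFactors = FALSE
)

# Join on the ORF name; all.x = TRUE keeps genes from annot
# even when biomaRt returned no row for them.
combined <- merge(
  annot, tx_len,
  by.x = "ORF", by.y = "ensembl_gene_id",
  all.x = TRUE
)

print(combined)
```

With the real objects, `annot` would be the data frame returned by select() and `tx_len` the one returned by getBM().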

Hi again. After closing my R session, I ran this command:

select(sc, keys = my.simbols, columns = c("ENTREZID", "GENENAME", "GO", "PATH"))

and got the following message: 'select()' returned 1:many mapping between keys and columns

Instead of about 7000 genes, the result has more than 100,000 rows, which is incorrect.

Help, please.

modified 4 weeks ago • written 4 weeks ago by cesarihv7
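On the 1:many message: select() returns one row per combination of key and annotation value, and columns such as GO and PATH usually hold several values per gene, so the row count can grow far beyond the gene count. One possible way to get back to one row per gene is to collapse the multi-valued column. The sketch below is base R only and uses a toy data frame `res` shaped like the select() output; the GO values are placeholders invented for illustration, not real annotations.

```r
# Toy stand-in for a 1:many select() result.
# The GO values are placeholders, not real GO annotations.
res <- data.frame(
  ORF      = c("YAL002W", "YAL002W", "YAL003W", "YAL003W", "YAL003W"),
  ENTREZID = c("851261", "851261", "851260", "851260", "851260"),
  GENENAME = c("VPS8", "VPS8", "EFB1", "EFB1", "EFB1"),
  GO       = c("GO:toy1", "GO:toy2", "GO:toy3", "GO:toy3", "GO:toy4"),
  stringsAsFactors = FALSE
)

# One row per gene for the columns that are 1:1 with the key.
gene_level <- unique(res[, c("ORF", "ENTREZID", "GENENAME")])

# Collapse all distinct, non-missing GO values of each gene into one string.
go_by_orf <- tapply(
  res$GO, res$ORF,
  function(x) paste(unique(x[!is.na(x)]), collapse = ";")
)

# Attach the collapsed strings, matching by ORF name.
gene_level$GO <- as.vector(go_by_orf[gene_level$ORF])

print(gene_level)
```

Alternatively, requesting only the 1:1 columns (as in the first answer) and querying GO or PATH in a separate call keeps the gene-level table at one row per key.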
ricket.woo (China) wrote:

You can run keytypes(sc) and columns(sc) to see what kinds of information you can retrieve from this AnnotationDb object.

written 4 weeks ago by ricket.woo
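As a rough sketch of that check, the keytype and columns can be validated before calling select(), which avoids the "Invalid keytype" error from the question. It assumes the Bioconductor packages AnnotationDbi and org.Sc.sgd.db are installed, and that "ORF" is the keytype matching these identifiers, as the ORF column in SMK's output suggests. The names `wanted_keytype` and `wanted_columns` are only illustrative.

```r
# Assumes AnnotationDbi and org.Sc.sgd.db are installed from Bioconductor.
library(AnnotationDbi)
library(org.Sc.sgd.db)

sc <- org.Sc.sgd.db
genes <- c("YAL002W", "YAL003W", "HRA1")

# List what the package offers before choosing.
print(keytypes(sc))
print(columns(sc))

# Assumption: these identifiers are ORF-style keys, as in the answer above.
wanted_keytype <- "ORF"
wanted_columns <- c("ENTREZID", "GENENAME", "SGD")

# Fail early, with a clear message, if a name is not supported.
stopifnot(wanted_keytype %in% keytypes(sc))
stopifnot(all(wanted_columns %in% columns(sc)))

annot <- AnnotationDbi::select(
  sc,
  keys    = genes,
  columns = wanted_columns,
  keytype = wanted_keytype
)
print(annot)
```

Calling `AnnotationDbi::select()` with the namespace prefix is a defensive choice here, since other loaded packages may define their own `select()`.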