Question: Outputting gexpr() with gene_name (or gene symbol) instead of MSTRG.x gene_id
10 days ago, ever_wudi wrote:

Hi, I am trying to use Ballgown to output a gene-by-sample expression matrix. I ran geneexp = gexpr(bg) and then write.csv(geneexp, "output.csv", row.names = TRUE). However, the resulting matrix uses MSTRG.x gene IDs as its row identifiers. How can I output the matrix with gene_name (or gene symbol) as the identifier instead? The MSTRG.x IDs are of no use to me.

Thanks! Di
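
One possible way to relabel the rows, sketched in base R on toy data: build a gene_id to gene_name lookup from the transcript table and attach the names to the gene-level matrix. The objects `tx_table` and `gene_fpkm` below are hypothetical stand-ins for `texpr(bg, 'all')` and `gexpr(bg)`, with made-up values. The sketch assumes the transcript table carries `gene_id` and `gene_name` columns and that unannotated genes are marked with "." or NA.

```r
# Hypothetical stand-in for texpr(bg, 'all'): one row per transcript.
tx_table <- data.frame(
  gene_id   = c("MSTRG.1", "MSTRG.1", "MSTRG.2", "MSTRG.3"),
  gene_name = c("Btf3l4", "Btf3l4", "Actb", "."),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for gexpr(bg): rows are genes, columns are samples.
gene_fpkm <- matrix(
  c(8.2, 8.6, 120.5, 118.9, 3.1, 2.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MSTRG.1", "MSTRG.2", "MSTRG.3"),
                  c("FPKM.sample1", "FPKM.sample2"))
)

# One gene_name per gene_id; the first occurrence wins (an assumption,
# since one MSTRG locus could in principle carry more than one name).
lookup <- tx_table[!duplicated(tx_table$gene_id), c("gene_id", "gene_name")]
gene_name <- lookup$gene_name[match(rownames(gene_fpkm), lookup$gene_id)]

# Fall back to the MSTRG id when no usable name exists.
missing_name <- is.na(gene_name) | gene_name == "."
gene_name[missing_name] <- rownames(gene_fpkm)[missing_name]

# Keep both identifiers as ordinary columns, so that non-unique gene
# names cannot clash as row names.
named_fpkm <- data.frame(gene_id = rownames(gene_fpkm),
                         gene_name = gene_name,
                         gene_fpkm,
                         row.names = NULL, check.names = FALSE)

write.csv(named_fpkm, file.path(tempdir(), "gene_fpkm_named.csv"),
          row.names = FALSE)
```

Keeping gene_id next to gene_name makes it possible to trace each row back to the original MSTRG.x entry.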

7 days ago, ever_wudi wrote:

I figured out one way to do it. I used whole_tx_table = texpr(my.humandata, 'all') to extract everything into whole_tx_table, then did final_fpkm_table = whole_tx_table[c("gene_name","sample 1","sample 2", ..)] to slice out only the gene_name and FPKM values, and wrote final_fpkm_table to a .csv file. However, one problem with the resulting table is that the gene names are not unique: there can be many rows for a single gene, such as 'Btf3l4' below. What should I do with these values? Should I take the sum, average, or max of the duplicate rows to generate a gene_name-expression matrix with unique names? Also, can edgeR or RSEM be used to generate such a matrix?

Thanks for any advice.

         Sample 1    Sample 2    Sample 3    Sample 4
Btf3l4   7.267802    7.386622    9.815619    9.739746
Btf3l4   0.941536    1.256349    1.365669    1.3953
Btf3l4   0.897259    0.718018    0.025479    0.168297
Btf3l4   0.823937    0.744246    1.132339    1.020087
Btf3l4   0.42134     0.351375    0.236908    0.517893
Btf3l4   1.219011    1.331794    2.030579    1.207322
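
Since texpr() returns one row per transcript, the repeated rows are presumably different transcripts of the same gene. A minimal base R sketch of the three candidate summaries (sum, mean, max) follows, using the Btf3l4 values from the table above. The column names `sample1` to `sample4` are illustrative, and the code only shows the mechanics; which summary suits a given analysis is not settled in this thread.

```r
# The six Btf3l4 rows copied from the table in the post.
fpkm <- data.frame(
  gene_name = rep("Btf3l4", 6),
  sample1 = c(7.267802, 0.941536, 0.897259, 0.823937, 0.42134, 1.219011),
  sample2 = c(7.386622, 1.256349, 0.718018, 0.744246, 0.351375, 1.331794),
  sample3 = c(9.815619, 1.365669, 0.025479, 1.132339, 0.236908, 2.030579),
  sample4 = c(9.739746, 1.3953, 0.168297, 1.020087, 0.517893, 1.207322),
  stringsAsFactors = FALSE
)

sample_cols <- setdiff(names(fpkm), "gene_name")

# Sum per gene name: rowsum() returns one row per group,
# with the gene names as row names.
by_sum <- rowsum(fpkm[sample_cols], group = fpkm$gene_name)

# Mean and max per gene name: aggregate() returns the gene name
# as a regular column followed by the sample columns.
by_mean <- aggregate(fpkm[sample_cols],
                     by = list(gene_name = fpkm$gene_name), FUN = mean)
by_max  <- aggregate(fpkm[sample_cols],
                     by = list(gene_name = fpkm$gene_name), FUN = max)

# Each result now has exactly one row per gene name.
write.csv(by_sum, file.path(tempdir(), "gene_fpkm_sum.csv"), row.names = TRUE)
```

Any of the three results can be written out with write.csv() as before. Comparing the collapsed values against the gexpr() output for the same gene is one way to check which summary is closest to Ballgown's own gene-level estimate.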