Transcription Factor Matrix Id To Gene Symbols
10.1 years ago
Min ▴ 20

Hi,

I am having trouble mapping transcription factor matrix IDs (from TRANSFAC and JASPAR) to gene symbols.

For example:

V$MAFB_01 -> gene symbols?
V$CREBP1_Q2 -> gene symbols?
V$HNF1_02 -> gene symbols?

What I ultimately want to do is check the expression values of the transcription factors of interest in my expression data.

My expression data identifies genes by gene symbol.

Previously, I obtained the transcription factor matrices of interest, i.e. those that show a significant probability of binding to a given sequence.

But I have no idea how to map those matrices to a set of genes.

I found the site below, but it provides only cancer-related TF families: http://rulai.cshl.edu/TRED/TFlist.htm

Could you provide any relevant info about these issues?

Thanks a lot!

transcription matrix gene • 6.2k views
10.1 years ago
cnluzon ▴ 30

Hi,

If you are looking for the official gene symbols, you can take a look at the HUGO Gene Nomenclature Committee (HGNC) site: http://www.genenames.org/ The TRANSFAC field you mention is the ID. Although the part after the '$' matches the gene symbol most of the time, I think it is more appropriate to search by the 'NA' field, which holds the name of the protein and should usually match the gene name.

HGNC has a tool called "symbol-checker" (http://www.genenames.org/help/symbol-checker) that takes a list of symbols and returns the official ones, telling you how each one matched. For instance, I tried MAFB and got this:

MAFB Approved symbol MAFB v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog B HGNC:6408 20q11.1-q13.1

You can even upload a file with all the symbols you want to check, so that might solve your problem. You could do the same with JASPAR.
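To build the list for the symbol checker, a minimal sketch along these lines might help. It assumes TRANSFAC-style IDs like the ones in the question (a prefix ending in '$', then the factor name, then a suffix such as `_01` or `_Q2`). The function name `candidate_symbol` is made up for illustration. The results are only candidates and may be aliases rather than approved symbols, which is why they still need to go through the checker.

```python
import re

def candidate_symbol(matrix_id):
    """Guess a gene-symbol candidate from a TRANSFAC-style matrix ID.

    Assumption: the ID looks like 'V$NAME_suffix'. This is a heuristic,
    not an official mapping.
    """
    # Keep only the part after the '$' (or the whole ID if there is no '$').
    name = matrix_id.split("$", 1)[-1]
    # Drop the trailing '_suffix' (e.g. '_01', '_Q2').
    return re.sub(r"_[^_]+$", "", name)

matrix_ids = ["V$MAFB_01", "V$CREBP1_Q2", "V$HNF1_02"]
candidates = [candidate_symbol(m) for m in matrix_ids]

# One candidate per line, ready to paste or upload to the symbol checker.
print("\n".join(candidates))
```

For the three IDs above this gives MAFB, CREBP1 and HNF1, one per line.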

Hope this helped.


Thanks for your comment.

I understand your answer, except for what the 'NA' field means.

Is there an option like an 'NA' field when I submit a query to symbol-checker?

Thanks again.


Sorry, maybe I did not explain myself clearly. What I meant by the 'NA' field is that each matrix in TRANSFAC has several fields, each identified by a two-letter code (ID, NA, AC, ...). You can see all of them in the TRANSFAC specification: http://www.gene-regulation.com/pub/databases/transfac/doc/matrix1.html

So if you use the NA field of each TRANSFAC matrix instead of the ID, you will get a more accurate query in the symbol checker.
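If you have the matrices as a flat file, pulling out the NA field for each ID could look roughly like the sketch below. It assumes each line starts with the two-letter field code followed by its value, and that records end with a `//` line. The sample records and the function name `parse_transfac_names` are made up for illustration and are not copied from TRANSFAC, so check them against your own file.

```python
def parse_transfac_names(text):
    """Map each matrix ID to the value of its NA field.

    Assumption: lines look like 'ID  V$MAFB_01' / 'NA  MAFB' and
    records are terminated by a '//' line.
    """
    id_to_name = {}
    current_id = None
    current_name = None
    for line in text.splitlines():
        # Split the two-letter code from the rest of the line.
        code, _, value = line.partition(" ")
        value = value.strip()
        if code == "ID":
            current_id = value
        elif code == "NA":
            current_name = value
        elif code == "//":
            # End of record: store what was collected, then reset.
            if current_id is not None:
                id_to_name[current_id] = current_name
            current_id = None
            current_name = None
    # Store a final record that has no '//' terminator.
    if current_id is not None:
        id_to_name[current_id] = current_name
    return id_to_name

# Illustrative records only (placeholder accessions, invented NA values).
SAMPLE = """\
AC  M_PLACEHOLDER_1
XX
ID  V$MAFB_01
XX
NA  MAFB
XX
//
AC  M_PLACEHOLDER_2
XX
ID  V$HNF1_02
XX
NA  HNF1
XX
//
"""

mapping = parse_transfac_names(SAMPLE)
for matrix_id, name in mapping.items():
    print(matrix_id, name, sep="\t")
```

The values in `mapping` are what you would then submit to the symbol checker instead of the raw IDs.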


Oh, thanks, I got it. Thank you very much.
