Question: Mapping Transcription Factor Matrix IDs To Gene Symbols
5.8 years ago, Min wrote:


I am having trouble mapping transcription factor matrix IDs (from TRANSFAC and JASPAR) to gene symbols.


V$MAFB_01 -> gene symbols?

V$CREBP1_Q2 -> gene symbols?

V$HNF1_02 -> gene symbols?


What I ultimately want to do is check the expression values of the transcription factors of interest in my expression data.

My expression data identifies genes by gene symbol.

Previously, I obtained the transcription factor matrices of interest, which show a significant probability of binding a given sequence.

But I have no idea how to map those matrices to a set of genes.

I found the site below, but it provides only cancer-related TF families.

Could you provide any relevant information on this issue?

Thanks a lot!

5.8 years ago, cnluzon wrote:


If you are looking for the official gene symbols, you can take a look at the HUGO Gene Nomenclature Committee (HGNC). The TRANSFAC field you mention is the ID. Although most of the time the part after the '$' matches the gene symbol, I think it is more appropriate to search by the 'NA' field, which holds the name of the protein and should usually match the gene name.
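As a rough illustration of the "part after the '$'" shortcut, here is a minimal Python sketch that strips the prefix and the trailing suffix from a matrix ID. The function name `candidate_name` and the suffix pattern (an underscore, an optional capital letter, then digits, as in `_01` or `_Q2`) are assumptions based only on the three IDs in the question, so the result is a candidate to verify and not necessarily an official symbol.

```python
import re


def candidate_name(matrix_id):
    """Guess a factor name from a matrix ID such as 'V$MAFB_01'.

    Heuristic only: drops everything up to the '$' and any trailing
    '_01' / '_Q2' style suffixes. The result still needs checking.
    """
    # Keep the part after the first '$' (or the whole ID if there is none).
    name = matrix_id.split("$", 1)[-1]
    # Remove one or more trailing suffixes like '_01', '_Q2', '_Q6_01'.
    return re.sub(r"(?:_[A-Z]?\d+)+$", "", name)


for matrix_id in ["V$MAFB_01", "V$CREBP1_Q2", "V$HNF1_02"]:
    print(matrix_id, "->", candidate_name(matrix_id))
```

The names this produces are only starting points for a lookup; some of them may be aliases or family names and not approved symbols.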

The HGNC has a tool called "symbol-checker" that takes a list of symbols and returns the official ones, telling you how each one matched. For instance, I tried MAFB and got this:

Input: MAFB
Match type: Approved symbol
Approved symbol: MAFB
Approved name: v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog B
HGNC ID: HGNC:6408
Location: 20q11.1-q13.1

You can even upload a file with all the symbols you want to check, so that might solve your problem. You could do the same with JASPAR.
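To prepare such a file, something like the following Python sketch could be used. It assumes the upload accepts plain text with one name per line, which I have not verified; `format_symbol_list` is a hypothetical helper name.

```python
def format_symbol_list(names):
    """Return the unique, non-empty names, one per line, in first-seen order."""
    seen = []
    for name in names:
        name = name.strip()
        # Skip blanks and names that were already collected.
        if name and name not in seen:
            seen.append(name)
    return "\n".join(seen) + "\n"


print(format_symbol_list(["MAFB", "CREBP1", "HNF1", "MAFB"]), end="")
```

Writing the returned string to a text file (for example with `pathlib.Path.write_text`) gives a list that can be uploaded or pasted into the checker.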

Hope this helped.


Thanks for your comment.

I understand your answer, except for what the 'NA' field means.

Is there an option like an 'NA' field when I submit a query to symbol-checker?

Thanks again.

written 5.8 years ago by Min

Sorry, maybe I did not explain myself correctly. What I meant by the 'NA' field is that each matrix in TRANSFAC has several fields, each identified by a two-letter code (ID, NA, AC...). You can see all of them in the TRANSFAC specification.

So if you use the NA field of each TRANSFAC matrix instead of the ID, you will get a more accurate query in the symbol checker web tool.
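To pull the NA field for each matrix, a minimal Python sketch is shown here. It assumes a flat-file layout in which every line starts with a two-letter field code followed by its value, 'XX' lines are separators, and each record ends with '//'. The sample record is made up for illustration (including the accession 'M99999') and is not copied from TRANSFAC; `id_to_name` is a hypothetical helper name.

```python
def id_to_name(lines):
    """Map each record's ID field to its NA field."""
    mapping = {}
    record = {}

    def flush():
        # Store the finished record only if both fields were present.
        if "ID" in record and "NA" in record:
            mapping[record["ID"]] = record["NA"]

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):  # end of one matrix record
            flush()
            record = {}
            continue
        code, _, value = line.partition(" ")
        if len(code) == 2 and code != "XX":
            # Keep the first value seen for each two-letter code.
            record.setdefault(code, value.strip())
    flush()  # handle a final record that lacks a closing '//'
    return mapping


# Illustrative record only; real files contain many more fields.
SAMPLE = """\
AC  M99999
XX
ID  V$MAFB_01
XX
NA  MAFB
XX
//
""".splitlines()

print(id_to_name(SAMPLE))
```

The values of the resulting dictionary are the names to submit to the symbol checker, and the keys let you trace each approved symbol back to its matrix.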

written 5.8 years ago by cnluzon

Oh thanks, I got it. Thank you very much.

written 5.8 years ago by Min