Question: Obtaining list of specific TFs from Interpro tsv file.
6 months ago by
a.rex180 wrote:

I recently ran InterPro on predicted ORFs >100 aa. I then used the PFAM_DBD and SUPERFAMILY_DBD database IDs in the hope of collecting TFs. Of course, many genes have both a Homeobox hit (PF00046) and another hit such as PAX (PF00292).

My question is, how do people make a prediction for the number of TFs?

I simply took all the TF hits in the list and removed duplicates. Would this be valid for identifying the total number of TFs?

But how can I account for specific families?
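
For reference, the "take all TF hits and remove duplicates" step could be sketched roughly as below. This is only a minimal sketch: it assumes the standard InterProScan TSV layout (protein ID in column 1, signature accession in column 5), and `TF_DOMAINS` is a hypothetical two-entry stand-in for the full DBD accession list. The toy rows are made up for illustration.

```python
import io
from collections import Counter, defaultdict

# Hypothetical stand-in for the DBD list: Pfam accession -> domain name.
TF_DOMAINS = {"PF00046": "Homeobox", "PF00292": "PAX"}


def tf_hits_per_protein(tsv_handle, tf_domains):
    """Map each protein ID to the set of TF domain accessions found on it.

    Assumes InterProScan-style TSV: column 1 = protein ID,
    column 5 = signature accession. Using a set collapses repeated
    hits of the same domain on the same protein.
    """
    hits = defaultdict(set)
    for line in tsv_handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        protein_id, accession = fields[0], fields[4]
        if accession in tf_domains:
            hits[protein_id].add(accession)
    return dict(hits)


def count_per_domain(hits):
    """Count how many proteins carry each TF domain at least once."""
    return Counter(acc for accessions in hits.values() for acc in accessions)


# Toy input (invented rows, not real results).
rows = [
    ["geneA", "md5", "300", "Pfam", "PF00046", "Homeobox", "10", "60"],
    ["geneA", "md5", "300", "Pfam", "PF00292", "PAX", "100", "220"],
    ["geneB", "md5", "250", "Pfam", "PF00046", "Homeobox", "5", "55"],
    ["geneB", "md5", "250", "Pfam", "PF00046", "Homeobox", "90", "140"],
    ["geneC", "md5", "400", "Pfam", "PF00069", "Pkinase", "20", "280"],
]
example = "\n".join("\t".join(r) for r in rows)

hits = tf_hits_per_protein(io.StringIO(example), TF_DOMAINS)
per_domain = count_per_domain(hits)

print(len(hits))         # number of distinct proteins with any TF domain
print(dict(per_domain))  # proteins per domain
```

Note that the per-domain counts add up to more than the number of TF proteins whenever a protein carries two different DNA-binding domains, which is exactly the Homeobox/PAX situation described above.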

sequence interpro • 222 views
modified 6 months ago • written 6 months ago by a.rex180

What kind of TF count are you looking for: how many different types of TFs there are in the genome, or how many genes are potentially TFs?

written 6 months ago by lieven.sterck4.5k

How many different types of TFs. Thanks

written 6 months ago by a.rex180

The more difficult one, then ;)

Sounds like a reasonable approach. How do you deal with the case you described (one gene, multiple hits)? And what exactly do you mean by "how can I account for specific families"?

Perhaps you would be better off in the end by first creating gene families and then annotating them family-wise, based on the genes in each family.
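
A rough sketch of what that could look like, under some assumptions: the one-gene, multiple-hits case is handled by using the combination of domains as the label (so a Homeobox + PAX gene is counted once, in its own class), and each gene family is then labelled with the most common combination among its members. The `hits` and `gene_to_family` dictionaries are hypothetical inputs; in practice the first would come from the InterPro table and the second from whatever clustering tool was used to build the families.

```python
from collections import Counter, defaultdict

# Hypothetical lookup: Pfam accession -> domain name.
NAMES = {"PF00046": "Homeobox", "PF00292": "PAX"}


def architecture(accessions, names):
    """Label a gene by its sorted combination of TF domain names."""
    return "+".join(sorted(names[a] for a in accessions))


def count_architectures(hits, names):
    """Count genes per domain combination; each gene is counted once."""
    return Counter(architecture(accs, names) for accs in hits.values())


def annotate_families(gene_to_family, hits, names):
    """Label each family with the most common combination among its members.

    Families without any TF-domain member are left out of the result.
    """
    per_family = defaultdict(Counter)
    for gene, family in gene_to_family.items():
        if gene in hits:
            per_family[family][architecture(hits[gene], names)] += 1
    return {fam: c.most_common(1)[0][0] for fam, c in per_family.items()}


# Toy inputs (invented for illustration).
hits = {
    "geneA": {"PF00046", "PF00292"},
    "geneB": {"PF00046"},
    "geneD": {"PF00046", "PF00292"},
}
gene_to_family = {"geneA": "fam1", "geneD": "fam1", "geneB": "fam2", "geneC": "fam3"}

arch_counts = count_architectures(hits, NAMES)
family_labels = annotate_families(gene_to_family, hits, NAMES)

print(dict(arch_counts))
print(family_labels)
```

The number of distinct keys in `arch_counts` is then one possible answer to "how many different types of TFs", although whether a domain combination should count as its own type is a judgment call.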

Looking into the literature might help as well. There are quite a few TF database resources, and the papers describing them might give you some ideas. Examples: plantTFDB and plnTFDB (check the citation section at the bottom of the page).

written 6 months ago by lieven.sterck4.5k