Question: Obtaining a list of specific TFs from an InterPro TSV file
16 months ago, a.rex wrote:

I recently ran InterPro on predicted ORFs longer than 100 aa. I then used the PFAM_DBD and SUPERFAMILY_DBD database IDs in the hope of collecting TFs. Of course, many genes have both a Homeobox hit (PF00046) and another hit such as PAX (PF00292).

My question is, how do people make a prediction for the number of TFs?

I simply took all the TF hits in the list and removed duplicates. Would this be valid for identifying the total number of TFs?
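For readers following along, the "take all TF hits and remove duplicates" step could be sketched as below. This is only a minimal illustration, not the poster's actual script: it assumes the usual InterProScan TSV layout (protein ID in column 1, signature accession in column 5), and the function name `tf_genes`, the example rows, and the two-ID DBD set are hypothetical stand-ins for the real PFAM_DBD/SUPERFAMILY_DBD lists.

```python
import csv
import io


def tf_genes(tsv_handle, dbd_ids):
    """Return {protein_id: set of DBD signature accessions it hit}.

    Assumes column 1 is the protein ID and column 5 the signature
    accession (e.g. PF00046), as in standard InterProScan TSV output.
    """
    hits = {}
    for row in csv.reader(tsv_handle, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or truncated lines
        protein_id, accession = row[0], row[4]
        if accession in dbd_ids:
            hits.setdefault(protein_id, set()).add(accession)
    return hits


# Hypothetical example rows; a real file would be opened with open(path).
example = (
    "orf1\tmd5a\t300\tPfam\tPF00046\tHomeobox\t10\t60\t1e-20\tT\t01-01-2020\n"
    "orf1\tmd5a\t300\tPfam\tPF00292\tPAX\t80\t200\t1e-30\tT\t01-01-2020\n"
    "orf2\tmd5b\t250\tPfam\tPF00046\tHomeobox\t15\t70\t1e-18\tT\t01-01-2020\n"
    "orf3\tmd5c\t400\tPfam\tPF00001\t7tm_1\t20\t300\t1e-40\tT\t01-01-2020\n"
)

# Illustrative subset only; substitute the full DBD ID lists.
dbd_ids = {"PF00046", "PF00292"}

hits = tf_genes(io.StringIO(example), dbd_ids)
n_tf_genes = len(hits)  # each gene is counted once, however many DBD hits it has
```

Keying on the protein ID means a gene with both a Homeobox and a PAX hit is counted once, and the retained set of accessions per gene is what a later family assignment would need.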

But how can I account for specific families?


What kind of TF count are you looking for: how many different types of TFs there are in the genome, or how many genes are potentially TFs?

written 16 months ago by lieven.sterck

How many different types of TFs. Thanks

written 16 months ago by a.rex

The more difficult one, then ;)

That sounds like a reasonable approach. How do you deal with a case like the one you described (one gene, multiple hits)? And what exactly do you mean by "how can I account for specific families"?

Perhaps you would be better off, in the end, first creating gene families and then annotating them family-wise, based on the genes in each family.

Looking into the literature might help as well. There are quite a few TF database resources, and the papers describing them might give you some ideas. Examples: PlantTFDB, PlnTFDB (check the citation section at the bottom of the page).
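One way to handle the "one gene, multiple hits" case when counting TF types is to assign each gene to a single family using ordered domain rules, with the more specific rule checked first. The sketch below is purely illustrative: the `RULES` list, the family names, and the `assign_family` helper are hypothetical and not taken from any of the databases mentioned; a real rule set would have to come from the literature.

```python
from collections import Counter

# Hypothetical rules: (family name, domains that must all be present).
# Order matters: a gene with both a paired box and a homeobox is called PAX.
RULES = [
    ("PAX", {"PF00292"}),
    ("Homeobox", {"PF00046"}),
]


def assign_family(domains, rules=RULES):
    """Return the first family whose required domains are all present, else None."""
    for family, required in rules:
        if required <= domains:  # subset test
            return family
    return None


# Hypothetical input: gene ID -> set of DBD accessions found for that gene.
gene_domains = {
    "orf1": {"PF00046", "PF00292"},
    "orf2": {"PF00046"},
    "orf3": {"PF00046"},
}

# Count genes per family, ignoring genes that match no rule.
family_counts = Counter(
    family
    for family in (assign_family(d) for d in gene_domains.values())
    if family is not None
)
n_families = len(family_counts)  # number of distinct TF types observed
```

With this layout, `family_counts` gives the number of genes per family and `n_families` the number of different TF types, so both readings of "number of TFs" come out of the same table.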

written 16 months ago by lieven.sterck