Pscan was previously recommended for TF enrichment analysis. RefSeq (for human, mouse, and Drosophila, e.g. NM_000546) is one of the identifier types it accepts. I have a list of genes from a non-model organism and want to find their RefSeq homologues in Drosophila using BLAST. The problem is that I need to map my genes only to Drosophila genes, not to other species in RefSeq. It seems that the genomic sequences from different species are packed together in NCBI RefSeq. Is there a separate RefSeq database for Drosophila? Or can I run BLAST against only the Drosophila sequences in the RefSeq database? Thanks!
I may be misunderstanding your question, but I think you are about to make a fatal mistake. Regulatory regions such as transcription factor (TF) binding sites change rapidly during evolution. If you are working on an organism different from Drosophila melanogaster, running Pscan on the orthologous genes from D. melanogaster will most likely give you completely misleading results.
I assume from what you write that the evolutionarily closest model organism to your organism of interest is D. melanogaster. I also assume that your organism of interest is not in RefSeq (it would be nice if you told us which organism you work on so that we don't have to make blind assumptions). In that case, what you should do is take the PSSMs that represent the sequence specificities of insect (D. melanogaster) TFs and search these against the actual promoter sequences of your organism of interest. You can do that for the PSSMs from Jaspar via this web form by inputting a FASTA file with the upstream sequences of the genes from your organism of interest.
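To make the idea concrete, here is a minimal Python sketch of what such a PSSM scan does under the hood. It is an illustration, not the method used by the Jaspar web form: the count matrix, the sequence, the pseudocount, the uniform background and the score threshold are all made-up assumptions, and the function names are hypothetical. Real matrices would come from a database such as Jaspar, and real thresholds need proper calibration.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix (one dict per motif position) for a toy
# 5-bp motif with consensus ACGTG. The numbers are invented for this
# example and do not describe any real transcription factor.
TOY_COUNTS = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]


def counts_to_pssm(counts, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds scores against a uniform background."""
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (N is left as N)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, pssm, threshold):
    """Return (start, strand, site, score) for windows scoring >= threshold.

    start is always a 0-based coordinate on the forward strand of seq;
    site is the matching sequence as read on the strand that was hit.
    """
    seq = seq.upper()
    width = len(pssm)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(ch not in BASES for ch in window):
                continue  # skip windows containing N or other symbols
            score = sum(pssm[j][ch] for j, ch in enumerate(window))
            if score >= threshold:
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, window, score))
    return hits


if __name__ == "__main__":
    pssm = counts_to_pssm(TOY_COUNTS)
    # Made-up upstream sequence standing in for one promoter of your organism.
    upstream = {"gene1_upstream": "TTTACGTGAAA"}
    for name, seq in upstream.items():
        for start, strand, site, score in scan(seq, pssm, threshold=8.0):
            print(name, start, strand, site, round(score, 2))
```

The point of the sketch is that the scan runs on the upstream sequences of your own organism; only the matrices are borrowed from D. melanogaster.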
A few vaguely remembered ideas. . .
60 million years is about the right evolutionary distance for comparing transcription factor binding sites (e.g. human vs. mouse, or D. melanogaster vs. D. pseudoobscura; as I recall, those two Drosophila species are as divergent as human and mouse). A larger evolutionary distance is unlikely to produce anything useful. ET Dermitzakis has a nice series of papers on comparative genomics applied to transcription factor binding sites, starting with Mol Biol Evol. 2003 May;20(5):703-14: Tracing the evolutionary history of Drosophila regulatory regions with models that identify transcription factor binding sites.
I don't know where locusts are in the evolutionary scheme of things, but a quick PubMed search brought up: Insect Mol Biol. 2000 Dec;9(6):559-63: A locust type 1 ADP-ribosylation factor (lARF1) is 100% identical in amino acid sequence to Drosophila ARF1 despite obvious DNA sequence divergence. Maybe a place to start.
Thank you, Lars. The insect I am working on is the locust. I'd like to explain my thinking on this issue. I found a list of differentially expressed genes (DEGs) in my microarray data when locusts were exposed to hypoxia, and I want to know whether they share the same TFs. My assumption was that locusts and fruit flies respond to hypoxia in a similar way, i.e. that they activate homologous genes. This assumption is based on the fact that responses to hypoxia are highly conserved in metazoans. Hypoxia-inducible factor 1, a major transcription factor regulating many genes in hypoxia, exists in all metazoan species that have been analyzed, from C. elegans to H. sapiens (Semenza 2004). Based on this conservation, I want to see which homologous genes share TFs in the fruit fly and infer that something similar happens in locusts. I am not sure whether the rapid evolution of TFBSs will affect the final results. What I want to find out is which TFs take part in orchestrating the transcriptional response in locusts. When a TFBS evolves, the TF binding domain will change accordingly, but the identity of the TF does not change. The TF acting on a gene in the fruit fly will act on the homologous gene in the locust. That's why I am trying to predict TFs this way. Any argument is welcome!
A couple ideas in short form...
I agree with Lars' comments - spot on.
First, you could look to see what databases are offered at FlyBase, a portal to the Drosophila genomes (yes, several Drosophila species have been sequenced).
Second, if you're using locust, why not compare to Drosophila, honeybee and mosquito? In this manner you are likely to find the few (perhaps quite important) conserved motifs. There is always the caveat, as Lars indicates, that the regulation is not by the same TF.
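As a rough illustration of the multi-species idea, the toy Python sketch below reports the k-mers shared by the upstream sequences of one orthologous gene across all species. Everything in it is an assumption made for the example: the sequences are invented, k=6 is arbitrary, only the forward strand is considered, and the function names are hypothetical. Real comparative analyses would rely on alignments or dedicated motif-discovery tools rather than exact k-mer matching.

```python
def kmers(seq, k):
    """Return the set of unambiguous (A/C/G/T only) k-mers on the forward strand."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    }


def conserved_kmers(upstream_by_species, k=6):
    """Return k-mers present in the upstream sequence of every species."""
    kmer_sets = [kmers(seq, k) for seq in upstream_by_species.values()]
    if not kmer_sets:
        return set()
    return set.intersection(*kmer_sets)


if __name__ == "__main__":
    # Invented upstream sequences for one hypothetical orthologous gene.
    upstream = {
        "locust": "AAAATACGTGCCCC",
        "fruit_fly": "GGTACGTGTT",
        "honeybee": "CTACGTGA",
    }
    print(sorted(conserved_kmers(upstream, k=6)))
```

A k-mer that survives this filter is only a candidate; as noted above, it need not be bound by the same TF in each species.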