Several years ago, I was involved with a project to detect putative archaeal sequences in human sequence data. Unfortunately, as is often the case in academia, the database described in the publication was not maintained when I moved to a new job.
This type of study has perhaps been superseded by the Human Microbiome Project. However, I'm still interested in methods to detect potentially interesting "contaminant" sequences in public databases of sequences that are (supposedly) from one organism.
Originally, we used BLAT to search the human EST database, with archaeal genes (from complete genome sequences) as the queries. My questions are:
- Was using these two databases the best approach, or are there better sources?
- Would you use BLAT today for this type of task, or a different tool?
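
For context, here is a minimal Python sketch of the kind of post-filtering such a search needs, assuming BLAT's default PSL output (21 tab-separated columns). The function name `filter_psl_hits` and both thresholds are illustrative assumptions, not the settings from the original study, and the identity is a simple ratio that ignores gaps.

```python
def filter_psl_hits(lines, min_identity=0.9, min_query_cov=0.5):
    """Yield (query, target, identity, coverage) for PSL rows passing both thresholds.

    `lines` is any iterable of text lines, e.g. an open PSL file.
    Thresholds are hypothetical defaults chosen only for illustration.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header and malformed rows: data rows have 21 columns
        # and begin with an integer match count.
        if len(fields) != 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        q_name, q_size, t_name = fields[9], int(fields[10]), fields[13]
        aligned = matches + mismatches + rep_matches
        if aligned == 0 or q_size == 0:
            continue
        # Gap-ignoring identity over aligned bases, and the fraction of the
        # query (the archaeal gene) covered by the alignment.
        identity = (matches + rep_matches) / aligned
        coverage = aligned / q_size
        if identity >= min_identity and coverage >= min_query_cov:
            yield q_name, t_name, identity, coverage
```

It could be used as `hits = list(filter_psl_hits(open("archaea_vs_est.psl")))`, where the file name is a placeholder. Any candidate hits would still need checking against non-archaeal sequences before being called contaminants.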