Is there a blacklist of implausible or bogus GenBank entries that can be excluded from e.g. BLAST searches or metagenomics analyses? There could be many questionable or hoax entries in there, as well as entries where the taxonomic classification of the sequence is incorrect.
I'll start with this one:
Used to establish the existence of "Stealth virus 1" as a new viral species and "Viteria". Run BLAST on the sequence and check the background of the single author W. John Martin, M.D., Ph.D., and his dubious "Center for Complex Infectious Diseases". Edit: IMHO these sequences should be removed from GenBank and the patents based on them nullified.
Edit: See also this: John Martin stripped of his license.
I also recommend this article: http://www.ncbi.nlm.nih.gov/pubmed/15924874 xD
From the second publication cited:
With this possible exception, the demonstration of a viral sequence followed by a bacterial sequence clone has yet to be documented. (http://www.ncbi.nlm.nih.gov/pubmed/10331959)
In other words: the sequencing data provide no hard evidence that viral and bacterial sequences occur together in a single contig. On the other hand, contamination of cell cultures with bacteria or yeast is one of the most common accidents in the wet lab, in particular if infected material is used as the starting material.
And yet the existence of "Stealth viruses" and "Viteria" is based on this non-observation, and complete sequences of obvious bacterial rDNA origin are annotated as being of viral origin.
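Until an official blacklist exists, one workaround is to keep a local list of suspect accessions and drop matching hits from BLAST results. Below is a minimal sketch of that idea, not an established tool. It assumes tabular BLAST output (`-outfmt 6`) with the subject id in the second column as a plain `accession.version` string; the function names and the accessions `XX000001` and `YY000002` are hypothetical placeholders, not real GenBank records.

```python
import csv


def load_blacklist(lines):
    """Return a set of accessions, ignoring blank lines and '#' comments."""
    accessions = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            accessions.add(entry)
    return accessions


def strip_version(accession):
    """Drop the version suffix so that every version of an entry matches."""
    return accession.split(".", 1)[0]


def filter_hits(blast_lines, blacklist):
    """Yield rows of tabular BLAST output whose subject is not blacklisted."""
    bare = {strip_version(acc) for acc in blacklist}
    reader = csv.reader(blast_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip empty lines and comment lines (as written by -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        # Assumption: column 2 holds the subject id as accession.version.
        if strip_version(row[1]) not in bare:
            yield row


# Hypothetical blacklist and hits, for illustration only.
blacklist = load_blacklist([
    "# suspect entries, one accession per line",
    "XX000001.1  # placeholder, not a real record",
])
hits = [
    "query1\tXX000001.2\t99.1\t500\t4\t0\t1\t500\t10\t509\t0.0\t900",
    "query1\tYY000002.1\t97.3\t480\t13\t0\t1\t480\t22\t501\t0.0\t810",
]
kept = list(filter_hits(hits, blacklist))
```

With the placeholder data above, only the hit against `YY000002.1` is kept. Matching without the version suffix is a deliberate choice here, so that an updated version of a suspect record is still filtered. Misclassified (rather than bogus) entries would need a different approach, since dropping them also discards a possibly valid sequence.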