My source is the FANTOM consortium, which lists some 2,000 human TFs. See Table S1, a list of human TFs, in Ravasi et al. (2010, Cell 140: 744-752). From the website: FANTOM has developed and expanded over time to encompass the fields of transcriptome analysis. The object of the project is moving steadily up the layers in the system of life, progressing thus from an understanding of the ‘elements’ - the transcripts - to an understanding of the ‘system’ - the transcriptional regulatory network.
DNABP is a database/manuscript from late 2016 whose authors built a machine learning method (Random Forest) to identify de novo DNA-binding proteins using only sequence information: 1) the conservation of physicochemical properties of the amino acids, and 2) the binding propensity of DNA-binding residues.
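To give a feel for what "sequence-only" features can look like, here is a minimal sketch that turns a protein sequence into the fraction of residues in a few rough physicochemical classes. The class groupings and the name `class_fractions` are my own assumptions for illustration; this is not the DNABP feature set, and it does not model conservation or binding propensity.

```python
from collections import Counter

# Rough, illustrative grouping of the 20 standard amino acids.
# This grouping is an assumption, not the one used by DNABP.
CLASSES = {
    "positive": set("KRH"),
    "negative": set("DE"),
    "polar": set("STNQCY"),
    "hydrophobic": set("AVLIMFWPG"),
}


def class_fractions(sequence):
    """Return the fraction of standard residues falling in each class.

    Non-standard characters (e.g. 'X', '*') are ignored.
    """
    counts = Counter(sequence.upper())
    per_class = {
        name: sum(counts[aa] for aa in group)
        for name, group in CLASSES.items()
    }
    total = sum(per_class.values())
    if total == 0:
        # Empty sequence or no standard residues: return all zeros.
        return {name: 0.0 for name in CLASSES}
    return {name: n / total for name, n in per_class.items()}
```

A vector like this, computed per protein, is the kind of input a Random Forest classifier could be trained on, although a real predictor would need much richer features.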
They took 14,262 proteins from UniProt that they could confidently label as either DNA-binding or non-DNA-binding and used these as their training data set; you can download this information from supplement S1. The UniProt accessions of the DNA-binding and non-binding proteins in the test set for their model are also available in the supplements. Although the method achieved high accuracy (~83-90%), the web server accepts only a single sequence at a time, so it is not really suited to classifying a large number of de novo DNA-binding proteins.
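Since the web server takes one sequence at a time, one workaround is to skip prediction for proteins already in the supplementary lists and simply look them up. Below is a minimal sketch, assuming you have exported the DNA-binding and non-binding accessions to plain text with one accession per line. That layout, the function names, and the placeholder accessions are all assumptions; the actual supplement format may differ.

```python
def read_accessions(lines):
    """Collect accessions from an iterable of text lines.

    Blank lines and lines starting with '#' are skipped; only the first
    whitespace-separated token on each line is kept.
    """
    accessions = set()
    for line in lines:
        token = line.strip()
        if token and not token.startswith("#"):
            accessions.add(token.split()[0])
    return accessions


def label_accessions(queries, binding, non_binding):
    """Label each query accession by membership in the two reference sets."""
    labels = {}
    for acc in queries:
        if acc in binding:
            labels[acc] = "DNA-binding"
        elif acc in non_binding:
            labels[acc] = "non-DNA-binding"
        else:
            labels[acc] = "unknown"  # not in either list; would need prediction
    return labels


# Example with made-up placeholder accessions (not real UniProt IDs):
binding = read_accessions(["ACC_A\n", "# a comment\n", "\n", "ACC_B extra\n"])
non_binding = read_accessions(["ACC_C\n"])
labels = label_accessions(["ACC_A", "ACC_C", "ACC_Z"], binding, non_binding)
```

With real files you would pass an open file handle to `read_accessions`, for example `read_accessions(open("binding.txt"))`, where `binding.txt` is a hypothetical file name. Anything labelled "unknown" would still have to go through the predictor or another resource.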
If anyone knows of a better or more comprehensive resource available today, I'd be happy if they could share it.