I looked at the BLAST FTP site. There are 18 nt files, each smaller than 800 MB, while refseq_genome has 83 files, most of them larger than 800 MB, so refseq_genome appears to be much larger than the nt database. However, the definition of nt at http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml says that nt includes: All GenBank + RefSeq Nucleotides + EMBL + DDBJ + PDB sequences (excluding HTGS0,1,2, EST, GSS, STS, PAT, WGS). No longer "non-redundant".
My questions are:
- As I understand it, RefSeq Nucleotides should include refseq_genome and refseq_rna, so refseq_genome should be much smaller than the nt database. Why is refseq_genome alone much larger than the whole nt database?
- I took one accession number, NZ_AARG01000001.1, from a RefSeq bacterial genome and ran blastn against the nt and refseq_genome databases. Against nt, the search took a few seconds and returned fewer than 10 hits. Against refseq_genome, it took more than 10 minutes and returned more than 100 hits (all of the accession numbers began with NZ). I then looked up NZ and found that it denotes projects that are not completed. So is the difference between nt and refseq_genome that nt doesn't include NZ records?
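One way to check the observation that every refseq_genome hit began with NZ is to tally the hit accessions by prefix. The following is only a minimal sketch: the helper names `accession_prefix` and `tally_prefixes` are made up for illustration, and apart from NZ_AARG01000001.1 the accessions in the example list are hypothetical placeholders, not real hits.

```python
from collections import Counter


def accession_prefix(accession: str) -> str:
    """Return the text before the first underscore (e.g. 'NZ'), or '' if there is none."""
    head, sep, _ = accession.partition("_")
    return head if sep else ""


def tally_prefixes(accessions):
    """Count how many accessions carry each RefSeq-style prefix."""
    return Counter(accession_prefix(acc) for acc in accessions)


if __name__ == "__main__":
    # Hypothetical hit list: only the first accession comes from the question above.
    hits = ["NZ_AARG01000001.1", "NZ_XXXX00000000.0", "NC_000000.0", "AB000000.0"]
    for prefix, count in sorted(tally_prefixes(hits).items()):
        print(prefix or "(no prefix)", count)
```

Accessions without an underscore (the usual shape of GenBank, EMBL, and DDBJ accessions) are grouped under an empty prefix, so the tally separates RefSeq-style records from the rest.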
Hi, I just wonder how you get the information of
And also the number of bases? Thanks.
The summary information for the databases comes from NCBI's BLAST service: the database help (the '?' icon next to the database selection) shows the details of each database. The number of bases in a database comes from the summary information included in the BLAST search results for that database. Its location varies with the output format; on NCBI's BLAST service it is in the "Search Summary" section of the default HTML result.
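For results saved as a plain-text report rather than viewed as HTML, the same numbers can be pulled out with a few lines of Python. This is a rough sketch that assumes the report contains a summary line of the form "N sequences; M total letters"; the function name `parse_db_summary` is made up here, and the numbers in the sample text are placeholders, not the real size of any database.

```python
import re

# Assumed shape of the summary line in a plain-text BLAST report:
#   "   1,234 sequences; 5,678,901 total letters"
SUMMARY_RE = re.compile(r"([\d,]+)\s+sequences;\s+([\d,]+)\s+total letters")


def parse_db_summary(report_text):
    """Return (sequence_count, total_letters) from a text report, or None if absent."""
    match = SUMMARY_RE.search(report_text)
    if match is None:
        return None
    seqs, letters = (int(group.replace(",", "")) for group in match.groups())
    return seqs, letters


if __name__ == "__main__":
    # Placeholder report fragment for illustration only.
    sample = "Database: example\n           1,234 sequences; 5,678,901 total letters\n"
    print(parse_db_summary(sample))
```

Running this on the reports from two searches gives two "total letters" figures that can be compared directly, which is the number of bases referred to above.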
Thank you. This helps me a lot.
I have a follow-up question. What would a Venn diagram of the relationship among the nt database, RefSeq genome sequences, and RefSeq representative sequences look like? Thanks.