Question: How to retrieve Blast nr database dimensions?
Macspider (Vienna - BOKU), 21 months ago, wrote:

Hi guys, sorry if I look dumb (I do feel that way now).

I have been going through the info pages, citations and database metrics for BLAST since this morning, because I need the following information:

  • Number of sequences in the non-redundant protein database (nr)
  • Number of sequences in the non-redundant nucleotide database (nt)

I am a little desperate. I don't think the number of sequences in NCBI GenBank and in NCBI Protein is the same as in nt and nr, because those collections probably contain redundant entries. Am I wrong? Do you know where I could get these numbers?

EDIT: A workaround was suggested to me elsewhere, so I am posting it for future readers. The database folder contains the information in an alias file:

  • nt.nal for the nucleotide database (nt)
  • nr.pal for the protein database (nr)
Sej Modha (Glasgow, UK), 21 months ago, wrote:

The nt.nal and nr.pal files contain the number of sequences (NSEQ). Here are the contents of both files (nt.nal first, then nr.pal) for our local BLAST database version:

#
# Alias file created 06/24/2017 23:11:41
#
TITLE Nucleotide collection (nt)
DBLIST "nt.00" "nt.01" "nt.02" "nt.03" "nt.04" "nt.05" "nt.06" "nt.07" "nt.08" "nt.09" "nt.10" "nt.11" "nt.12" "nt.13" "nt.14" "nt.15" "nt.16" "nt.17" "nt.18" "nt.19" "nt.20" "nt.21" "nt.22" "nt.23" "nt.24" "nt.25" "nt.26" "nt.27" "nt.28" "nt.29" "nt.30" "nt.31" "nt.32" "nt.33" "nt.34" "nt.35" "nt.36" "nt.37" "nt.38" "nt.39" "nt.40" "nt.41" "nt.42" "nt.43" "nt.44" "nt.45" "nt.46" "nt.47"
NSEQ 43107468
LENGTH 148603982942
#
# Alias file created 06/09/2017 09:55:12
#
TITLE All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
DBLIST "nr.00" "nr.01" "nr.02" "nr.03" "nr.04" "nr.05" "nr.06" "nr.07" "nr.08" "nr.09" "nr.10" "nr.11" "nr.12" "nr.13" "nr.14" "nr.15" "nr.16" "nr.17" "nr.18" "nr.19" "nr.20" "nr.21" "nr.22" "nr.23" "nr.24" "nr.25" "nr.26" "nr.27" "nr.28" "nr.29" "nr.30" "nr.31" "nr.32" "nr.33" "nr.34" "nr.35" "nr.36" "nr.37" "nr.38" "nr.39" "nr.40" "nr.41" "nr.42" "nr.43" "nr.44" "nr.45" "nr.46" "nr.47" "nr.48" "nr.49" "nr.50" "nr.51" "nr.52" "nr.53" "nr.54" "nr.55" "nr.56" "nr.57" "nr.58" "nr.59" "nr.60" "nr.61" "nr.62" "nr.63" "nr.64" "nr.65" "nr.66" "nr.67" "nr.68"
NSEQ 124946261
LENGTH 45844202229
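
If you need these numbers in a script, the alias files are easy to parse. Below is a minimal Python sketch, assuming the file has the layout shown above (comment lines starting with #, then one keyword and its value per line, including NSEQ and LENGTH). The function names parse_alias_text and database_dimensions are made up for this example, and the DBLIST line in the sample text is shortened.

```python
from pathlib import Path


def parse_alias_text(text):
    """Return the keyword/value pairs found in the text of a .nal or .pal file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#" comment header.
        if not line or line.startswith("#"):
            continue
        # The first word is the keyword, the rest of the line is its value.
        parts = line.split(None, 1)
        fields[parts[0]] = parts[1] if len(parts) > 1 else ""
    return fields


def database_dimensions(alias_path):
    """Read an alias file and return (NSEQ, LENGTH) as integers.

    Raises KeyError if the file has no NSEQ or LENGTH line.
    """
    fields = parse_alias_text(Path(alias_path).read_text())
    return int(fields["NSEQ"]), int(fields["LENGTH"])


# Sample text copied from the nt.nal listing above (DBLIST shortened).
example = """#
# Alias file created 06/24/2017 23:11:41
#
TITLE Nucleotide collection (nt)
DBLIST "nt.00" "nt.01"
NSEQ 43107468
LENGTH 148603982942
"""

fields = parse_alias_text(example)
print(int(fields["NSEQ"]), int(fields["LENGTH"]))  # prints: 43107468 148603982942
```

With a local database you would call database_dimensions on the path to nt.nal or nr.pal in your database folder; the exact path depends on your installation.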