What is the difference between nr and RefSeq? Based on NCBI's own definition, "RefSeq database is a non-redundant set of reference standards derived from the INSDC databases that includes chromosomes, complete genomic molecules (organelle genomes, viruses, plasmids), intermediate assembled genomic contigs, curated genomic regions, mRNAs, RNAs, and proteins.", RefSeq is also non-redundant. But when you perform BLAST searches, you can select either nr/nt or RefSeq, so I assume there is a difference.
Question: Difference between NCBI non-redundant and refseq database
4.3 years ago by
hdy • 100
4.3 years ago by
a.zielezinski ♦ 9.0k
The nr database encompasses sequences from both non-curated and curated databases:
Non-curated databases (low quality):
- GenBank/GenPept - unreviewed sequences submitted by individual laboratories and large-scale sequencing projects. Since these sequence records are owned by the original submitters and cannot be altered by others, GenBank might contain many low-quality sequences.
- TrEMBL - the unreviewed section of UniProt. This section is a computer-annotated supplement to SwissProt that contains all the translations of EMBL nucleotide sequence entries not yet integrated into SwissProt.
Curated databases (high quality):
- RefSeq - sequences derived from GenBank that are curated by NCBI, either manually reviewed by NCBI staff or produced by NCBI's annotation pipelines. RefSeq records are owned by NCBI and can be updated as needed to maintain current annotation or to incorporate additional information.
- SwissProt - manually annotated and reviewed protein sequences
- PIR - non-redundant annotated protein sequence database
- PDB - experimentally-determined structures of proteins, nucleic acids, and complex assemblies
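One practical consequence is that a list of nr hits mixes RefSeq records with records from the other sources, and they can usually be told apart by accession format: RefSeq accessions start with a two-letter prefix followed by an underscore (for example NP_ or XP_ for proteins). Below is a minimal Python sketch of that check. The function names `is_refseq_accession` and `split_hits` are made up for illustration, the accessions in the usage note are only example strings, and the prefix list is an assumption based on the commonly documented RefSeq prefixes, so check it against the current NCBI documentation before relying on it.

```python
import re

# Commonly documented RefSeq accession prefixes (assumed list, verify with NCBI).
REFSEQ_PREFIXES = {
    "AC", "NC", "NG", "NT", "NW", "NZ",  # genomic records
    "NM", "NR", "XM", "XR",              # transcripts (NR_ here is non-coding RNA, not the nr database)
    "AP", "NP", "YP", "XP", "WP",        # proteins
}

# Two uppercase letters, an underscore, an alphanumeric body, optional ".version".
_ACCESSION = re.compile(r"^([A-Z]{2})_[A-Z0-9]+(?:\.\d+)?$")


def is_refseq_accession(accession: str) -> bool:
    """Return True if the accession has the RefSeq 'XX_' prefix format."""
    match = _ACCESSION.match(accession.strip())
    return bool(match) and match.group(1) in REFSEQ_PREFIXES


def split_hits(accessions):
    """Split accessions into (RefSeq-style, everything else), preserving order."""
    refseq = [a for a in accessions if is_refseq_accession(a)]
    other = [a for a in accessions if not is_refseq_accession(a)]
    return refseq, other
```

For example, `split_hits(["NP_000537.3", "AAA59989.1", "P04637"])` puts the first accession in the RefSeq list and the other two (GenBank-style and UniProt-style identifiers) in the second list. This only inspects the identifier format; it does not query NCBI or confirm that a record exists.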