PacBio sequencing database
6.9 years ago
jerald • 0

Hi Y'all,

I'm new to bioinformatics, and I was wondering if there is a quicker way to access whole-genome (bacterial) files that were sequenced with PacBio.

Or is there a quicker way to search whole-genome sequences by the platform that was used to sequence them?

Can anyone help me?

Thanks in advance~^^

next-gen sequencing pacbio database whole genome • 2.5k views
6.5 years ago
tjduncan ▴ 280

The NCBI and EMBL-ENA databases are great for all types of publicly available data.

To answer your question specifically, the best and most comprehensive database for PacBio microbial data is the NCTC 3000 collection.

Per their website: "NCTC 3000 is a collaborative Whole Genome Sequencing (WGS) project that was established in 2013 between Public Health England (PHE), the Wellcome Trust Sanger Institute (WTSI) and Pacific Biosciences (PacBio). The project aims to generate 3000 high quality, closed reference genomes from strains within the NCTC collection.

Pacific Biosciences' Single Molecule, Real-Time (SMRT) Sequencing technology achieves very long reads and high consensus accuracy, greatly improving the ability to finish bacterial genomes. And, because the technology can directly detect base modifications, the epigenomes for bacteria can also be obtained with no additional data acquisition, and this data will also be provided."

As of September 2017 they are well on their way and have already provided 1700 reference-quality genomes and assemblies for bacterial strains relevant to public health.

You can browse their sequenced strains to obtain either the raw PacBio data or the complete assemblies, depending on your use case. The collection's data is available via the Public Health England website:

https://www.phe-culturecollections.org.uk/products/bacteria/nctc-3000-project-a-comprehensive-resource-of-bacterial-type-and-reference-genomes.aspx

or Wellcome Trust Sanger Institute's website:

http://www.sanger.ac.uk/resources/downloads/bacteria/nctc/

6.9 years ago

Here is a dataset of 54x coverage PacBio human genome sequencing.


Hi~

Thank you very much~ I need a database for bacteria, though. I should've specified that >_<"

6.9 years ago
GenoMax 141k

See this PacBio example datasets site. You can probably find more via searching EBI-ENA/SRA.


Thank you for the link~ But what did you mean by "EBI-ENA/SRA"?


EBI-ENA is a database for all kinds of sequencing data. See this example for PacBio data (click on the assembly link on the left). SRA is NCBI's high-throughput sequencing database. Data in these two databases (along with DDBJ) is automatically synced, so you can download data from whichever location is most convenient for you.
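
If you would rather search by sequencing platform from a script than through the web forms, here is a minimal sketch that only builds the search URLs. The function names are made up for illustration. The endpoints and field names (`[Platform]` and `[Organism]` for SRA, `tax_tree` and `instrument_platform` for ENA) are written as I understand them from the NCBI E-utilities and ENA Portal API documentation, so please check them against the current docs before relying on this.

```python
from urllib.parse import urlencode

# Assumed endpoints; verify against the current NCBI and ENA documentation.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
ENA_PORTAL_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def sra_search_url(organism="Bacteria", platform="pacbio smrt", retmax=20):
    """Build an E-utilities esearch URL for SRA records from one platform.

    The [Platform] and [Organism] field tags are assumptions based on the
    SRA advanced search form.
    """
    term = f'"{platform}"[Platform] AND {organism}[Organism]'
    params = {"db": "sra", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def ena_search_url(taxon_id=2, platform="PACBIO_SMRT", limit=20):
    """Build an ENA Portal API URL listing runs under a taxon for one platform.

    taxon_id=2 is the NCBI Taxonomy ID for Bacteria; tax_tree() is assumed
    to include all descendant taxa.
    """
    query = f'tax_tree({taxon_id}) AND instrument_platform="{platform}"'
    params = {
        "result": "read_run",
        "query": query,
        "fields": "run_accession,scientific_name,instrument_model,fastq_ftp",
        "format": "tsv",
        "limit": limit,
    }
    return f"{ENA_PORTAL_SEARCH}?{urlencode(params)}"
```

Printing `sra_search_url()` or `ena_search_url()` gives a URL you can paste into a browser or fetch with `urllib.request.urlopen`. Keeping the URL building separate from the download makes it easy to inspect the query before sending any requests.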


Wow! Thank you so much~ I didn't know that NCBI had that kind of sequencing database. Searching will be a lot easier now. :)
