Question: Get the 16S rRNA gene from NCBI
kelvinfrog75 wrote (3.1 years ago):

Hey, I have a list of NCBI IDs and GI numbers, and they are the IDs/numbers for whole bacterial genomes. I want to build a FASTA file with only the 16S rRNA gene from the bacteria associated with these IDs and GI numbers. Does anyone know of a quick way to do it? Is there a search where you can enter the whole bacterial genome ID and then specify the 16S rRNA gene? Alternatively, does anyone know primer sequences that can extract the 16S gene? If so, I guess I could run a PCR simulation to extract the 16S region from the whole genome.

natasha.sernova wrote (3.1 years ago):

You have a list of IDs and GIs, am I right?

Save the list to a text file, 16s-sRNAs.txt, with each ID on its own line.

Go to Batch Entrez and read the text on the page; it's important:

http://www.ncbi.nlm.nih.gov/sites/batchentrez

Then:

1) Select the correct nucleotide database in the upper left corner of the page. The text on the page explains the difference between them.

2) Select your file, 16s-sRNAs.txt, from your computer using the file chooser in the middle of the menu.

3) Click Retrieve on the right side of the menu.

After a short while you will see which nucleotide sequences NCBI has for this list of IDs.
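If you would rather script the Batch Entrez steps above, here is a minimal sketch that builds the equivalent request against NCBI's E-utilities efetch endpoint. The function names (read_ids, build_efetch_url, fetch_fasta) are made up for illustration, and the nuccore database is an assumption; pick whichever nucleotide database matches your IDs. Keep in mind that a whole-genome ID will return the whole genome sequence, not just the 16S gene.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_ids(path):
    """Return the non-empty lines of an ID list file, one ID per line."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_efetch_url(ids, db="nuccore"):
    """Build an efetch URL asking for the given IDs as FASTA text.

    db="nuccore" is an assumption; change it to match your IDs.
    """
    params = {
        "db": db,
        "id": ",".join(ids),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(ids, db="nuccore"):
    """Download the records as one FASTA string (needs network access)."""
    with urlopen(build_efetch_url(ids, db)) as response:
        return response.read().decode()
```

Calling fetch_fasta(read_ids("16s-sRNAs.txt")) would then give you the FASTA text to write to a file. For long ID lists it is safer to send the IDs in smaller batches and pause between requests.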

You can make a database out of the sequences with makeblastdb.
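A rough sketch of that makeblastdb step, wrapped in Python so it can be chained after a download script. The file and database names are placeholders, and -parse_seqids is added on the assumption that you later want to look sequences up by ID with blastdbcmd.

```python
import subprocess


def makeblastdb_command(fasta_path, db_name):
    """Return the argument list for building a nucleotide BLAST database."""
    return [
        "makeblastdb",
        "-in", fasta_path,      # FASTA file with the retrieved sequences
        "-dbtype", "nucl",      # nucleotide database
        "-parse_seqids",        # keep sequence IDs so they can be looked up later
        "-out", db_name,        # base name for the database files
    ]


def build_database(fasta_path, db_name):
    """Run makeblastdb (must be on PATH); raises CalledProcessError on failure."""
    subprocess.run(makeblastdb_command(fasta_path, db_name), check=True)
```

For example, build_database("sequences.fasta", "my_sequences") would create the database files next to the FASTA file, assuming BLAST+ is installed.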

See the following papers if you need additional information:

The variability of the 16S rRNA gene in bacterial genomes and its consequences for bacterial community analyses (2013)
http://www.ncbi.nlm.nih.gov/pubmed/23460914

Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories (2008)
http://www.ncbi.nlm.nih.gov/pubmed/18828852

16S rRNA Gene Sequencing for Bacterial Identification in the Diagnostic Laboratory: Pluses, Perils, and Pitfalls (2007)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2045242/


I have the GI number for the whole genome but not for the 16S gene. So I am not sure the method you describe would work, unless there is a way to specify the 16S gene as part of the query.

written 3.1 years ago by kelvinfrog75
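On the PCR simulation idea from the question: below is a minimal in silico PCR sketch, not something proposed in this thread. It assumes the widely used universal bacterial primers 27F and 1492R (several variants of both exist, so check them against your organisms) and exact matching apart from IUPAC ambiguity codes, so a genome with mismatches at the primer sites will yield nothing. All function names are illustrative.

```python
import re

# Assumed primer pair: common universal 16S primers, written 5'->3'.
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "GGTTACCTTGTTACGACTT"

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a DNA string, including IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a primer with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def in_silico_pcr(template, forward, reverse, max_len=2000):
    """Return amplicons (primers included) found on the given strand.

    For each forward-primer site, the shortest product ending at a
    reverse-primer site within max_len bases is reported.
    """
    fwd = primer_regex(forward)
    rev = primer_regex(reverse_complement(reverse))
    # The lookahead lets overlapping candidate sites be reported too.
    pattern = re.compile("(?=(%s[ACGTN]{0,%d}?%s))" % (fwd, max_len, rev))
    return [m.group(1) for m in pattern.finditer(template.upper())]


def amplify_both_strands(genome, forward, reverse, max_len=2000):
    """Search the given strand and its reverse complement."""
    return (in_silico_pcr(genome, forward, reverse, max_len)
            + in_silico_pcr(reverse_complement(genome), forward, reverse, max_len))
```

Calling amplify_both_strands(genome_sequence, PRIMER_27F, PRIMER_1492R) on each genome gives a list of candidate 16S regions. Genomes often carry several 16S copies, so expect more than one amplicon per genome.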

That would be one reason to use the method I outlined below.

written 3.1 years ago by genomax

Hey, I tried your method, but I keep getting "Segmentation fault: 11" when I run blastdbcmd. Do you know what can cause that? Thanks.

written 3.1 years ago by kelvinfrog75

Did you get the correct executable for your OS? How much memory do you have?

written 3.1 years ago by genomax
genomax wrote (3.1 years ago):

NCBI has a 16S microbial BLAST database. It is available here. Download that file, then use blastdbcmd from the BLAST+ package to retrieve the sequences you need. Look into the -entry_batch option, which lets you supply a file of GI numbers (this works for now, but GIs are going away in September 2016).
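A sketch of the blastdbcmd call described above, wrapped in Python. The database name and file names are placeholders (use whatever the downloaded 16S database is called once unpacked), and the ID file is assumed to hold one identifier per line.

```python
import subprocess


def blastdbcmd_command(db_name, id_file, out_fasta):
    """Return the argument list for a batch lookup in a BLAST database."""
    return [
        "blastdbcmd",
        "-db", db_name,           # placeholder: name of the unpacked database
        "-entry_batch", id_file,  # text file with one ID per line
        "-out", out_fasta,        # FASTA output file
    ]


def retrieve_sequences(db_name, id_file, out_fasta):
    """Run blastdbcmd (must be on PATH); raises CalledProcessError on failure."""
    subprocess.run(blastdbcmd_command(db_name, id_file, out_fasta), check=True)
```

Only IDs that are actually present in the database will be returned, so it is worth comparing the number of sequences in the output with the number of IDs you supplied.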
