proteome download, human gut microbes
7.5 years ago
Nitha ▴ 20

Hi All,

I have a list of more than 250 human gut microbes, each with its name and Taxonomy ID, e.g. "Bacteroides stercoris ATCC 43183". Downloading the whole-proteome FASTA file for each one individually takes a lot of time. I used Ensembl Bacteria to download the protein sequences of individual bacteria, but it is very slow. Can anyone suggest a grep or wget command line, or a Perl program, to download the proteins directly?

For example, if we have the accession numbers of genes, we can use Batch Entrez to download the FASTA sequences for the whole list of IDs.

Thanks!


Are you interested in specific genes or in the entire gene complement of the genomes? If you know how to download the genome files (search for threads here), then there is not much you can do about the time it takes. Depending on where you are in the world, that may be the best connection speed you are going to get.

If you have accession numbers and access to the NCBI BLAST indexes, you can use the blastdbcmd utility from the BLAST+ package to extract those sequences quickly, like so: blastdbcmd -db /path_to/nr -entry_batch accession_number_file -outfmt '%f' -out sequence_you_need
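If the extraction needs to be scripted, the call can be wrapped along these lines. This is only a sketch: it assumes blastdbcmd from BLAST+ is on the PATH and that a local protein database exists at the given path. The helper names (`write_accessions`, `blastdbcmd_args`, `extract_sequences`) and the default batch file name are made up for illustration.

```python
import subprocess
from pathlib import Path


def write_accessions(accessions, path):
    """Write one accession per line, the layout -entry_batch expects."""
    Path(path).write_text("\n".join(accessions) + "\n")
    return path


def blastdbcmd_args(db, entry_batch, out_fasta):
    """Build the argument list for a FASTA extraction with blastdbcmd."""
    return [
        "blastdbcmd",
        "-db", str(db),                    # path to the local BLAST database
        "-entry_batch", str(entry_batch),  # file with one accession per line
        "-outfmt", "%f",                   # %f writes sequences in FASTA format
        "-out", str(out_fasta),            # output FASTA file
    ]


def extract_sequences(db, accessions, out_fasta, batch_file="accessions.txt"):
    """Write the accession list, then run blastdbcmd (needs BLAST+ installed)."""
    write_accessions(accessions, batch_file)
    subprocess.run(blastdbcmd_args(db, batch_file, out_fasta), check=True)
```

Calling `extract_sequences("/path_to/nr", my_accessions, "sequence_you_need")` would then reproduce the command above, where `my_accessions` is whatever list of accession strings you already have.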


No, my question is whether I can download the whole set of protein sequences of a particular bacterium in FASTA format using its name, in the same way that Batch Entrez can be used with gene IDs or accession numbers to download any number of gene or protein sequences.


You can use the E-utilities from NCBI. Take a look at the E-utilities help documentation.
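As a rough illustration of the E-utilities route, the sketch below searches the NCBI protein database by Taxonomy ID with esearch, keeps the result on the history server, and then pulls the records as FASTA in batches with efetch. The function names, the batch size, and the pause between requests are choices made for this example, and the taxid in the usage note is a placeholder rather than a real lookup. Please check the E-utilities documentation for current rate limits and API key usage before running this over 250 organisms.

```python
import json
import time
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(taxid):
    """URL that finds all protein records under a Taxonomy ID."""
    params = {
        "db": "protein",
        "term": f"txid{taxid}[Organism:exp]",
        "usehistory": "y",   # keep the hit list on the NCBI history server
        "retmode": "json",
    }
    return f"{EUTILS}/esearch.fcgi?" + urllib.parse.urlencode(params)


def efetch_url(webenv, query_key, retstart=0, retmax=500):
    """URL that fetches one batch of the stored hits as FASTA text."""
    params = {
        "db": "protein",
        "WebEnv": webenv,
        "query_key": query_key,
        "rettype": "fasta",
        "retmode": "text",
        "retstart": retstart,
        "retmax": retmax,
    }
    return f"{EUTILS}/efetch.fcgi?" + urllib.parse.urlencode(params)


def download_proteins(taxid, out_path, batch=500, pause=0.4):
    """Download every protein for one taxid into a FASTA file (uses the network)."""
    with urllib.request.urlopen(esearch_url(taxid)) as resp:
        result = json.load(resp)["esearchresult"]
    count = int(result["count"])
    with open(out_path, "w") as out:
        for start in range(0, count, batch):
            url = efetch_url(result["webenv"], result["querykey"], start, batch)
            with urllib.request.urlopen(url) as resp:
                out.write(resp.read().decode())
            time.sleep(pause)  # stay below the unauthenticated request limit
    return count
```

With a two-column list of names and Taxonomy IDs, one could loop over the rows and call something like `download_proteins(taxid, f"{taxid}.fasta")` for each. Note that this returns every protein record filed under the taxon, which can include redundant entries, so it is not necessarily identical to a curated reference proteome.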
