Getting plant full length cDNA sequences from NCBI
7.3 years ago
abhijit.synl ▴ 60

Hello, I would like to know if there is a way to download plant full-length cDNA sequences from NCBI on a periodic basis, let's say once every month. What I do now is type FLI_CDNA OR full-length OR "full length" in the search box, go to the Nucleotide section, and select plant and mRNA as filters. After that I do some post-processing to select unique sequences. This works well, but I want an automated Unix method to do it periodically. I do not know whether such lists can be found on the NCBI FTP site.

Looking for a solution.

Thanks, Abhijit

cDNA BLAST NCBI Update • 1.8k views
7.3 years ago
apa@stowers ▴ 600

I do this regularly, using the NCBI E-utilities (eutils). I run them via Perl (manual here), but there is also a command-line toolset (manual here).

This is a minimal example that downloads complete nucleotide FASTA records for Mnemiopsis; you will have to modify the $esearch URL for your needs:

#!/usr/bin/env perl
use strict;
use warnings;
use LWP::Simple;  # fetching https URLs also needs LWP::Protocol::https installed

my $retmax = 500;  # records per batch
## Entrez query string: Mnemiopsis[ORGN] AND complete[Title] NOT partial[Title] NOT genome[Title]
my $esearch = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
    . "db=nuccore&usehistory=y&term=Mnemiopsis%5BORGN%5D+AND+"
    . "complete%5BTitle%5D+NOT+partial%5BTitle%5D+NOT+genome%5BTitle%5D";
my $eresult = get($esearch);
die "esearch request failed\n" unless defined $eresult;
# Extract the hit count and the history-server handles (QueryKey, WebEnv) from the XML reply
my ($N, $key, $web) = ($eresult =~ m|<Count>(\d+)</Count>.*<QueryKey>(\d+)</QueryKey>.*<WebEnv>(\S+)</WebEnv>|s);
die "Could not parse esearch result\n" unless defined $web;
my $efetch = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"
    . "db=nuccore&WebEnv=$web&query_key=$key&rettype=fasta&retmode=text";
my $nbatch = int(($N + $retmax - 1) / $retmax);  # number of batches, rounded up
foreach my $i (1..$nbatch) {
    sleep 1;  # slow down server hit rate
    my $retstart = ($i - 1) * $retmax;  # zero-based offset of this batch
    print STDERR "Batch $i/$nbatch\n";
    my $efetch1 = "$efetch&retstart=$retstart&retmax=$retmax";
    my $efetch1_result = get($efetch1);
    die "efetch request failed for batch $i\n" unless defined $efetch1_result;
    print $efetch1_result;
}
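
To adapt the script to the plant full-length cDNA search from the question, only the esearch URL should need to change. The sketch below is not part of the original answer. `encode_term` and `build_esearch_url` are hypothetical helper names, and the `plants[filter]` and `biomol_mrna[PROP]` terms are an assumption about how the web interface's plant and mRNA filters translate into a query, so compare the resulting count against a manual search before relying on it. The optional `datetype=mdat` and `reldate` parameters are standard esearch options that restrict hits to records modified in the last N days, which may suit a monthly update.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
# Intended for plain ASCII Entrez terms.
sub encode_term {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Build an esearch URL against nuccore with the history server enabled.
# $reldate is optional: when given, only records modified within the
# last $reldate days are returned.
sub build_esearch_url {
    my ($term, $reldate) = @_;
    my $url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
        . "db=nuccore&usehistory=y&term=" . encode_term($term);
    $url .= "&datetype=mdat&reldate=$reldate" if defined $reldate;
    return $url;
}

# ASSUMPTION: this term approximates the search described in the question;
# the two filter terms are guesses at what the web filters correspond to.
my $term = '(FLI_CDNA OR full-length OR "full length")'
    . ' AND plants[filter] AND biomol_mrna[PROP]';

# Print the URL that would replace $esearch in the script above.
print build_esearch_url($term, 31), "\n";
```

For the monthly schedule, a crontab entry along the lines of `0 3 1 * * perl fetch_cdna.pl > cdna.fasta` (hypothetical file names) would run the download at 03:00 on the first day of each month.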