Getting plant full length cDNA sequences from NCBI
6.1 years ago
abhijit.synl ▴ 60

Hello, is there a way to download plant full-length cDNA sequences from NCBI on a periodic basis, say once a month? Currently I type FLI_CDNA OR full-length OR "full length" into the search box, go to the Nucleotide section, and select plant and mRNA as filters. I then do some post-processing to select unique sequences. This works well, but I want an automated Unix method that runs periodically. I do not know whether such lists are available on the NCBI FTP site.
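The "select unique sequences" post-processing step could be scripted along these lines. This is only a sketch: the function name dedup_fasta is hypothetical, and it assumes "unique" means identical sequence strings (compared case-insensitively), keeping the first record seen for each one.

```shell
# dedup_fasta (hypothetical helper): read FASTA on stdin, write FASTA on stdout,
# keeping only the first record for each distinct sequence.
# Multi-line sequences are joined onto a single line in the output.
dedup_fasta() {
    awk '
        # Print the pending record unless its sequence was already seen.
        function emit(   key) {
            key = toupper(seq)
            if (header != "" && !(key in seen)) {
                seen[key] = 1
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # new record starts
        { gsub(/[ \t\r]/, ""); seq = seq $0 }          # accumulate sequence lines
        END { emit() }                                 # flush the last record
    '
}
```

Usage would be something like `dedup_fasta < downloaded.fasta > unique.fasta`. Holding every sequence in memory is simple but may be too heavy for very large downloads; hashing each sequence first is one alternative.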

Looking for a solution.

Thanks Abhijit

cDNA BLAST NCBI Update • 1.6k views
6.1 years ago
apa@stowers ▴ 580

I do this regularly using NCBI E-utilities (eutils). I call them from Perl (manual here), but there is also a command-line toolset (manual here).
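For the command-line route, a rough sketch follows. It assumes the toolset meant here is NCBI Entrez Direct (EDirect), whose esearch and efetch programs must already be installed and on PATH. The function name fetch_fasta is hypothetical.

```shell
# fetch_fasta (hypothetical helper): run an Entrez query against the nuccore
# database and print all matching records as FASTA on stdout.
# Requires the EDirect programs esearch and efetch on PATH.
fetch_fasta() {
    esearch -db nuccore -query "$1" | efetch -format fasta
}

# Example call, using the search text from the question:
#   fetch_fasta 'FLI_CDNA OR full-length OR "full length"' > full_length.fasta
# The plant and mRNA filters would have to be added to the query as Entrez
# field tags; the "Search details" box on the website shows the exact terms.
```

The pipeline lets efetch handle batching itself, so no explicit retstart/retmax loop is needed in the shell version.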

The following Perl script is a minimal example that downloads complete nucleotide FASTA records for Mnemiopsis; you will have to modify the $esearch URL for your needs:

#!/usr/bin/env perl
use strict;
use warnings;
use LWP::Simple;  # HTTPS URLs also need LWP::Protocol::https installed

my $base   = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";  # E-utilities base URL
my $retmax = 500;  # records per batch

## Entrez query string: Mnemiopsis[ORGN] AND complete[Title] NOT partial[Title] NOT genome[Title]
my $esearch = "$base/esearch.fcgi?"
    . "db=nuccore&usehistory=y&term=Mnemiopsis%5BORGN%5D+AND+"
    . "complete%5BTitle%5D+NOT+partial%5BTitle%5D+NOT+genome%5BTitle%5D";
my $eresult = get($esearch) or die "esearch request failed\n";

# Extract the hit count and the history-server handles from the XML reply
my ($N, $key, $web) = ($eresult =~ m|<Count>(\d+)</Count>.*<QueryKey>(\d+)</QueryKey>.*<WebEnv>(\S+)</WebEnv>|s)
    or die "Could not parse esearch result\n";

my $efetch = "$base/efetch.fcgi?"
    . "db=nuccore&WebEnv=$web&query_key=$key&rettype=fasta&retmode=text";
my $nbatch = int($N/$retmax);
$nbatch++ if $N/$retmax > $nbatch;  # round up for a final partial batch
my $retstart = 0-$retmax;
foreach my $i (1..$nbatch) {
    sleep 1;  # slow down server hit rate
    $retstart += $retmax;
    print STDERR "Batch $i/$nbatch\n";
    my $efetch1 = "$efetch&retstart=$retstart&retmax=$retmax";
    my $efetch1_result = get($efetch1);
    die "efetch failed for batch $i\n" unless defined $efetch1_result;
    print $efetch1_result;
}

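To cover the "once every month" part of the question, one option is a small wrapper script scheduled with cron. The sketch below is an illustration under assumptions: FETCH_CMD, OUT_DIR, fetch_fasta.pl and monthly_fetch.sh are hypothetical names, and the wrapper simply runs whatever command prints FASTA to stdout (for instance the Perl script above saved as fetch_fasta.pl).

```shell
#!/bin/sh
# monthly_fetch.sh (hypothetical name): store one dated FASTA file per run.
# FETCH_CMD and OUT_DIR are assumptions; override them in the environment.
FETCH_CMD=${FETCH_CMD:-"perl fetch_fasta.pl"}
OUT_DIR=${OUT_DIR:-${HOME:-.}/plant_cdna}

# Build the output path for a YYYY-MM stamp.
outfile_for() {
    printf '%s/plant_cdna_%s.fasta\n' "$OUT_DIR" "$1"
}

# Run the fetch command and keep its output only if the command succeeds.
run_monthly() {
    stamp=$(date +%Y-%m)
    out=$(outfile_for "$stamp")
    mkdir -p "$OUT_DIR" || return 1
    # Write to a temporary file first so a failed download
    # never replaces an earlier good file.
    if $FETCH_CMD > "$out.tmp"; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}

# Only do the work when called as: monthly_fetch.sh run
if [ "${1:-}" = "run" ]; then
    run_monthly
fi

# Example crontab entry (03:00 on the 1st of every month):
# 0 3 1 * * /path/to/monthly_fetch.sh run
```

Keeping one file per month makes it possible to compare successive downloads later, for example to find records that are new since the previous run.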