I'm looking to pull information from the data made available on the NCBI site. So far I've used the gene_info and gene2accession datasets from ftp://ftp.ncbi.nih.gov/gene, so I have GeneIDs, along with the accession versions and GIs for the nucleotide, mRNA and protein sequences associated with each GeneID. I could get the actual sequences from gene2refseq, but is there any way to get just the lengths of the various transcripts?
I can't use Entrez; I need a copy of the raw data.
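
For reference, the fallback I'm hoping to avoid is downloading the sequences and counting the lengths myself. A minimal sketch of that, assuming the transcript sequences are already on disk as a plain FASTA file (the file name `transcripts.fasta` and the function name `fasta_lengths` are just placeholders, not anything NCBI provides):

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record ID (first token of the header line) to its sequence length."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the ID is everything up to the first whitespace.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence line: add its residues to the current record.
            lengths[current] += len(line)
    return lengths


# Hypothetical usage, assuming a local FASTA file:
# with open("transcripts.fasta") as handle:
#     for record_id, length in fasta_lengths(handle).items():
#         print(record_id, length, sep="\t")
```

This works, but it means fetching every sequence just to count its characters, which is why I'd prefer a precomputed table of lengths if one exists.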