I'm looking for a way to extract the nucleotide sequence from an NCBI GenBank record that corresponds to a specific annotated region in the associated NCBI GenPept record. I can do this manually if necessary, but ideally I'd do it programmatically with the R package rentrez, with the output in FASTA format.
For example, this spike protein sequence has two annotated regions, corresponding to the S1 and S2 glycoproteins, that can easily be highlighted or isolated. The corresponding GenBank nucleotide entry, however, doesn't carry that region annotation; it gives only the nucleotide sequence encoding the whole protein. Is there a way to cross-reference the two records so that I can isolate only the nucleotide sequence for a given region?
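
To make concrete what I mean by cross-referencing, here is a minimal sketch of the coordinate arithmetic I have in mind. It is written in Python only to show the logic, and it should port directly to R. It assumes the CDS is a single contiguous interval on the plus strand with no frame offset (codon_start=1). The function names and the toy sequence are hypothetical and are not taken from the actual records.

```python
def protein_region_to_nucleotide(cds_start, aa_start, aa_end):
    """Map a 1-based, inclusive protein region onto 1-based, inclusive
    nucleotide coordinates of the record that contains the CDS.

    Assumption: one contiguous plus-strand CDS starting at cds_start,
    with codon_start=1 (no joins, no complement, no frame offset).
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    # Residue k occupies nucleotides cds_start + 3*(k-1) .. cds_start + 3*k - 1
    nt_start = cds_start + 3 * (aa_start - 1)
    nt_end = cds_start + 3 * aa_end - 1
    return nt_start, nt_end


def extract_region(nucleotide_seq, cds_start, aa_start, aa_end):
    """Slice the nucleotides encoding residues aa_start..aa_end."""
    nt_start, nt_end = protein_region_to_nucleotide(cds_start, aa_start, aa_end)
    # Convert 1-based inclusive coordinates to a 0-based Python slice
    return nucleotide_seq[nt_start - 1:nt_end]


# Toy example (made-up sequence): residues 2-3 of a CDS starting at position 1
print(extract_region("ATGGCCTTTTAA", cds_start=1, aa_start=2, aa_end=3))
```

If this mapping is right, the resulting start and end positions could presumably be supplied as the `seq_start` and `seq_stop` parameters that NCBI's EFetch accepts, but I don't know whether there is a more direct route that avoids computing the coordinates by hand.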