Fetching Genbank records and parsing them to be DNAStringset objects
1
0
Entering edit mode
5.9 years ago
peter.durr • 0

Hi everyone

I am trying to develop a workflow in R that lets me fetch Genbank sequences and then undertakes multiple sequence alignments.

Where I am at, is that I can now use efetch{reutils} to successfully fetch a sequence as a character, "efetch" object.

RABV.fa <- efetch("EU293116", db="nucleotide", retmode="text", rettype="fasta")

the next step is to convert this into a DNAStringSet from the Biostrings library

RABV2.fa <- readDNAStringSet(RABV.fa)

However, I get the error:

> Error in .normarg_input_filepath(filepath) :    'filepath' must be a character vector with no NAs

I have played around with the options for readDNAStringSet - but without success.

Thanks

PS. This question revisits one asked in BioStars a few years ago - R: Parsing Fasta From Strings In R? - but the solution offered by Michael Dondrup no longer works (for me at least. An alternative was provided by Neilfws, but this produces a SeqFastadna file.

R Genbank XStringSet Biostrings • 1.9k views
ADD COMMENT
1
Entering edit mode
5.9 years ago
Gungor Budak ▴ 270

It seems readDNAStringSet requires a path to the FASTA file rather than directly working with the object returned by efetch. A workaround I can think of is to write the content of the object to a file then read it using readDNAStringSet as follows;

RABV.fa <- efetch("EU293116", db="nucleotide", retmode="text", rettype="fasta")
# here write the content to a file
write(RABV.fa$content, 'rabv.fa')
RABV2.fa <- readDNAStringSet('rabv.fa')

Or a temporary file can be used as also suggested in the linked post (note that destfile argument is now outfile);

tmp = tempfile()
RABV.fa <- efetch("EU293116", db="nucleotide", retmode="text", rettype="fasta", outfile=tmp)
RABV2.fa <- readDNAStringSet(tmp)
ADD COMMENT
0
Entering edit mode

Thanks Gungor

Both solutions worked - well done!

Regards

ADD REPLY

Login before adding your answer.

Traffic: 2350 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6