Retrieving PubMed abstracts
7.3 years ago

Hey, everybody. I have a text mining project and have to write a program in R. I'm using the RISmed package to retrieve article abstracts, but I've got a problem: I want to use RISmed, or any other R package, to download all of PubMed's abstracts, not just the abstracts related to a particular keyword. Can anyone help me with this? Thanks a lot.

R • 4.7k views

Can you provide the PMIDs to the RISmed package? Why do you need it to be in R? E-utilities would make short work of this if you had a list of PMIDs.
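For what it's worth, here is a minimal sketch of the E-utilities route in base R, assuming you already have a vector of PMIDs. The helper names `efetch_url` and `fetch_abstracts` are made up for illustration (they are not part of RISmed), and the batch size and pause are guesses chosen to stay under NCBI's limit of three requests per second without an API key.

```r
# Sketch: fetch plain-text abstracts for a list of PMIDs via NCBI E-utilities.
# efetch_url() and fetch_abstracts() are hypothetical helpers, not RISmed functions.

efetch_base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Build one efetch URL for a batch of PMIDs (pure string handling, no network).
efetch_url <- function(pmids) {
  # Numeric PMIDs such as 28000000 would otherwise print as "2.8e+07".
  ids <- if (is.numeric(pmids)) {
    format(pmids, scientific = FALSE, trim = TRUE)
  } else {
    as.character(pmids)
  }
  paste0(efetch_base,
         "?db=pubmed",
         "&id=", paste(ids, collapse = ","),
         "&rettype=abstract&retmode=text")
}

# Download the abstracts in batches; returns one character string per batch.
fetch_abstracts <- function(pmids, batch_size = 200, pause = 0.5) {
  batches <- split(pmids, ceiling(seq_along(pmids) / batch_size))
  vapply(batches, function(batch) {
    Sys.sleep(pause)  # wait between requests to respect NCBI's rate limit
    paste(readLines(efetch_url(batch), warn = FALSE), collapse = "\n")
  }, character(1), USE.NAMES = FALSE)
}
```

Calling `fetch_abstracts(c("17284678", "9997"))` would then return the text of both records in a single string. This only helps if the PMID list is manageable; it is not a way to pull all of PubMed.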


Yes, I can provide a PMID to the RISmed package and download the respective abstract. But I want to retrieve all of PubMed's abstracts, not a particular one. I also want to retrieve the data with R because I have to write a program for my text mining project, and the first step is to retrieve all the abstracts from the PubMed database.


You have to pull every PubMed abstract that's available?


Yes. There should be a way, shouldn't there?


There appear to be about a million citations per year (for the last 4 years), so you are going to be looking at a huge download.


@genomax2 that's why I was concerned. @savermohm94, you should probably use Pierre's suggestion and go with the baseline, and only fetch abstracts within a given time period, especially if all you need is a proof of concept (POC) for your project.
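If a time window is enough for the POC, something like the following might work with RISmed. This is only a sketch: `date_window_query` and `fetch_window` are hypothetical helpers, the `[dp]` (date of publication) range syntax is standard PubMed search syntax, and I am assuming `AbstractText()` returns one string per record, which may vary between RISmed versions. The `retmax` default is an arbitrary cap, so only the first `retmax` hits in the window come back.

```r
# Sketch: fetch abstracts published within a date window using RISmed.
# date_window_query() and fetch_window() are hypothetical helper names.

# Build a PubMed query that matches every record published between two dates.
date_window_query <- function(from, to) {
  sprintf("%s:%s[dp]",
          format(as.Date(from), "%Y/%m/%d"),
          format(as.Date(to), "%Y/%m/%d"))
}

# Search the window and pull PMIDs plus abstracts (needs RISmed and network access).
fetch_window <- function(from, to, retmax = 1000) {
  hits <- RISmed::EUtilsSummary(date_window_query(from, to),
                                type = "esearch", db = "pubmed",
                                retmax = retmax)
  records <- RISmed::EUtilsGet(hits)
  data.frame(pmid = RISmed::PMID(records),
             abstract = RISmed::AbstractText(records),
             stringsAsFactors = FALSE)
}
```

For example, `fetch_window("2017-01-01", "2017-01-07")` would request up to 1000 records from the first week of 2017. At roughly a million citations per year, even one week holds far more than that, so the window and `retmax` need tuning.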


Hi, saber mohammadi. I have some questions about text mining to ask you, because my project works the same way as yours. Can I contact you via e-mail?

Thank you so much.


Yes, of course. I would be happy to help.

7.3 years ago

ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline/

Baseline Data
-------------
NLM produces a baseline set of MEDLINE/PubMed citation records in XML format for download on an annual basis. The annual baseline is released in December of each year. The complete baseline consists of files medline17n0001 through medline17n0892. ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline
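
A rough sketch of how the baseline files could be pulled into R, in case it helps. The function names are hypothetical, the `.xml.gz` suffix on the file names is an assumption (the README text above only gives `medline17n0001` through `medline17n0892`), and the XPath expressions assume the usual `PubmedArticle` / `MedlineCitation` / `Abstract` layout of the PubMed XML. Parsing relies on the xml2 package, which can read `.gz` files directly. File names change with each annual baseline, so check the directory listing first.

```r
# Sketch: download MEDLINE/PubMed baseline files and extract PMID + abstract.
# baseline_files(), download_baseline() and read_abstracts() are hypothetical helpers.

baseline_dir <- "ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline/"

# File names medline17n0001 ... medline17n0892; the ".xml.gz" suffix is an assumption.
baseline_files <- function(first = 1, last = 892) {
  sprintf("medline17n%04d.xml.gz", seq(first, last))
}

# Download each file once, skipping files that already exist locally.
download_baseline <- function(files, dest_dir = "baseline") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  for (f in files) {
    dest <- file.path(dest_dir, f)
    if (!file.exists(dest)) {
      download.file(paste0(baseline_dir, f), dest, mode = "wb")
    }
  }
  invisible(file.path(dest_dir, files))
}

# Parse one baseline file into a data frame of PMIDs and abstracts (needs xml2).
read_abstracts <- function(path) {
  doc <- xml2::read_xml(path)
  articles <- xml2::xml_find_all(doc, "//PubmedArticle")
  pmid <- xml2::xml_text(
    xml2::xml_find_first(articles, ".//MedlineCitation/PMID"))
  abstract <- vapply(articles, function(a) {
    # Structured abstracts have several AbstractText nodes; join them.
    parts <- xml2::xml_text(xml2::xml_find_all(a, ".//Abstract/AbstractText"))
    paste(parts, collapse = " ")
  }, character(1))
  data.frame(pmid = pmid, abstract = abstract, stringsAsFactors = FALSE)
}
```

To try it on a small slice first, `paths <- download_baseline(baseline_files(1, 2))` followed by `read_abstracts(paths[1])` would fetch two files and parse the first. Records without an abstract come back with an empty string.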