Accessing COSMIC database via Python script
24 months ago

Hi everyone!

I have a list of somatic variants from cancer data, annotated with VEP, and I would like to query the COSMIC database by HGVSc name to programmatically retrieve the FATHMM prediction and tumour origin from a Python script. I have several questions and I'm open to any suggestion:

1) What is the best way to access the database: by downloading the COSMIC data, or is there an API?
2) If downloading, which file would best suit my purpose? (I've already downloaded the "CosmicMutantExport.tsv" file, but I'm not sure it is the most appropriate.)
3) How should I access this huge file? Is there a Python library for it?

Thank you in advance.

cosmic VEP python annotation variant • 1.1k views

Do you need to do it via Python? Is something like this enough?


That's good! Thank you for posting it, iraun! Yes, it could be an option!

24 months ago
LauferVA 4.2k

1a) Which is the best way to access the database, by downloading the COSMIC data

In programming, datasets and databases are engineered to optimize for different things. One solution might minimize run time; another might minimize peak memory use. I cannot really answer this question without knowing what you wish to optimize for and what resources you have. For example, do you have free access to a huge compute cluster, or a laptop from 2006? Questions 2) and 3) are difficult to answer for the same reason.
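If memory is the thing to minimize, one option is a single streaming pass over the download file. Below is a minimal sketch, assuming the export is tab-separated with a header row and that the columns are named `HGVSC`, `FATHMM prediction`, `Primary site` and `Tumour origin`. Those column names are assumptions to check against the header of your release, and the helper `lookup_hgvsc` and the demo identifiers are made up for illustration.

```python
import csv
import io


def lookup_hgvsc(handle, wanted, key_col="HGVSC",
                 fields=("FATHMM prediction", "Primary site", "Tumour origin")):
    """Stream a tab-separated export once and collect `fields` for each wanted name.

    Only the matching rows are kept in memory, so the file size does not matter.
    Column names are assumptions; adjust them to the header of the actual file.
    """
    wanted = set(wanted)
    hits = {name: [] for name in wanted}
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        name = row.get(key_col)
        if name in wanted:
            hits[name].append({field: row.get(field, "") for field in fields})
    return hits


# Tiny in-memory stand-in for the real file (identifiers are invented).
demo_file = io.StringIO(
    "HGVSC\tFATHMM prediction\tPrimary site\tTumour origin\n"
    "ENST_DEMO_1:c.100A>T\tPATHOGENIC\tlung\tprimary\n"
    "ENST_DEMO_2:c.200G>C\tNEUTRAL\tskin\tmetastasis\n"
)
demo_hits = lookup_hgvsc(demo_file, ["ENST_DEMO_1:c.100A>T"])
print(demo_hits)
```

For the real file, the same function could be called inside `with open("CosmicMutantExport.tsv", newline="") as fh:`. This trades speed for memory: every batch of queries rescans the whole file.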

1b) or is there an API?

This, on the other hand, can definitely be answered. Yes, there is an API. Try navigating here first, though.


Thank you for your reply Vincent!

For example, do you have free access to a huge compute cluster, or a laptop from 2006?

To answer your question, I'm working on a server and I don't have space limitations at the moment.
I have implemented a pipeline that annotates cancer variants by retrieving information from different databases, such as ClinVar and dbSNP. For most of these, I used APIs that let me retrieve the information I was interested in quickly.

I would like to be able to do the same thing for COSMIC as well.

This, on the other hand, can definitely be answered. Yes, there is an API. Try navigating here first, though.

Thanks for your suggestion. However, I don't think this API works for me, because it doesn't provide the prediction information or the tissue of origin of the tumor.
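With no space constraints on a server, one way to get API-like lookup speed from the download file might be to index it once in a local SQLite database and query that instead. Below is a rough sketch using only the standard library. The column names (`HGVSC`, `FATHMM prediction`, `Primary site`, `Tumour origin`), the table name `cosmic`, the helpers `build_index` and `query`, and the demo identifiers are all assumptions for illustration, not something confirmed by COSMIC documentation.

```python
import csv
import io
import sqlite3

# Assumed column names; check them against the header of the actual export.
KEY_COL = "HGVSC"
VALUE_COLS = ("FATHMM prediction", "Primary site", "Tumour origin")


def build_index(handle, db_path=":memory:"):
    """Load the needed columns into SQLite once and index them by HGVSc name."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cosmic "
        "(hgvsc TEXT, fathmm TEXT, site TEXT, origin TEXT)"
    )
    reader = csv.DictReader(handle, delimiter="\t")
    # Generator: rows are inserted as they are read, not held in memory.
    rows = (
        (row.get(KEY_COL, ""), *(row.get(col, "") for col in VALUE_COLS))
        for row in reader
    )
    conn.executemany("INSERT INTO cosmic VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_hgvsc ON cosmic (hgvsc)")
    conn.commit()
    return conn


def query(conn, hgvsc):
    """Return distinct (prediction, primary site, tumour origin) tuples for one name."""
    cur = conn.execute(
        "SELECT DISTINCT fathmm, site, origin FROM cosmic "
        "WHERE hgvsc = ? ORDER BY fathmm, site, origin",
        (hgvsc,),
    )
    return cur.fetchall()


# Tiny in-memory stand-in for the real file (identifiers are invented).
demo_file = io.StringIO(
    "HGVSC\tFATHMM prediction\tPrimary site\tTumour origin\n"
    "ENST_DEMO_1:c.100A>T\tPATHOGENIC\tlung\tprimary\n"
    "ENST_DEMO_1:c.100A>T\tPATHOGENIC\tlung\tprimary\n"
    "ENST_DEMO_1:c.100A>T\tPATHOGENIC\tskin\tmetastasis\n"
    "ENST_DEMO_2:c.200G>C\tNEUTRAL\tskin\tprimary\n"
)
demo_conn = build_index(demo_file)
demo_rows = query(demo_conn, "ENST_DEMO_1:c.100A>T")
print(demo_rows)
```

Passing a file path as `db_path` would keep the index on disk, so the slow load happens once and later runs of the pipeline only pay for indexed lookups.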
