Question: Triple Negative Breast Cancer: Undergraduate project help? Finding publicly accessible data.
I am enrolled in an undergraduate computational biology course and am struggling to locate publicly accessible data for a course project, ideally sequences from human patients/participants with both control (unaffected) and experimental (affected) tissue. I have been exploring The Cancer Genome Atlas, but I find the interface confusing: I do not know where, or indeed whether, sequence data in a format I understand (e.g., FASTA) can be found. I am required to determine the secondary RNA structures, e.g., alpha pleated sheets, beta barrels, etc., and PSIPRED requires that all submissions be amino acids. I am familiar with both Python and R, so I could quickly learn a specific package if doing so would help me complete a noteworthy project. Please note: although I mentioned a specific form of breast cancer in the title (triple negative), I am perfectly willing to change the focus to another form of cancer as long as sequence data can be readily obtained. I would also like to obtain microarray data to construct a heat map illustrating differing levels of gene expression.

R microarray cancer python
modified 2.0 years ago • written 2.1 years ago by Caitlin

Have a look at OncoTrack and/or the eTRIKS portal. It may be worth having a look at the Open Targets Platform too. These are the targets associated with triple-negative breast cancer based on our latest release; try searching for other cancers as well. The association is made based on differential expression (from microarray and RNA-seq experiments) and other data sources. We also show whether the expression is up- or downregulated in patients versus controls (look for 'increased' under the 'Activity' column in this table showing the evidence for the BRIP1-breast carcinoma association). Just be aware that the Platform does not store patient data. This data comes from Expression Atlas, and we link the study back to the original database.

written 2.1 years ago by Denise - Open Targets
Perhaps easier than wading through the sequencing archives would be to find a recent study, download its data, and then perform your analyses. For example, putting "triple negative breast cancer sequencing" into PubMed turned up this:

You can find their data in the "data availability" box. This is just one example but I'm sure you can find a study and dataset that is close to what you're looking for!
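If you would rather script the PubMed search than use the website, something along these lines might work. It is only a rough sketch that assumes NCBI's public E-utilities ESearch endpoint and its JSON output; the function names `build_pubmed_query` and `search_pubmed` are made up for illustration, and the search term is just the example query from above.

```python
import json
import urllib.parse
import urllib.request

# Public NCBI E-utilities search endpoint (assumed reachable from your machine).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term, max_results=20):
    """Return the ESearch URL for a PubMed keyword search (hypothetical helper)."""
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": term,            # free-text query, same as typing it into PubMed
        "retmode": "json",       # ask for JSON rather than the default XML
        "retmax": max_results,   # cap the number of IDs returned
    }
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def search_pubmed(term, max_results=20):
    """Run the search and return a list of PubMed IDs (needs network access)."""
    url = build_pubmed_query(term, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Example (not run here because it contacts NCBI):
# pmids = search_pubmed("triple negative breast cancer sequencing", max_results=5)
# for pmid in pmids:
#     print("https://pubmed.ncbi.nlm.nih.gov/" + pmid + "/")
```

This only returns article IDs. You would still need to open each paper and read its data availability statement to find out where the actual sequence or expression data was deposited.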

written 2.1 years ago by Jake Warner
