Forum: Where to start? Newbie trying to enter the world of computational biology and getting lost
3
2
8 months ago
qmarulfiz ▴ 60

Honestly, the past few months have been overwhelming for me.

My professor at this first-world-country university gave me a simple question to answer in order to complete my PhD (due in 5 months).

"Look online for the gene 'A' pathway network and come back with the expression patterns of its associated genes in different types of diseased cells, to identify important (hub?) proteins so we can compare them with our lab's mutated-cell results and treatment."

And now I am totally lost about how to start and how to proceed.

I only graduated from a third-world-country university with minimal computational biology training. I'm taking an online R programming course now (I read that you need Bioconductor to analyse NGS data), but it seems to be a steep learning curve for me (no worries, I'm still climbing that hill and not going to give up yet). I read that you need Cytoscape so you can use NGS data to build pathways. But how do I download, filter, and compare the data? Most tutorials on GSEA (this is for filtering the NGS data, right?) seem to assume readers know where to get the initial input data. And finally, what do I "click" to download from NCBI GEO, and what comes next after that?

If somebody out there is willing to talk to me or guide me, I would really appreciate it. Sorry for rambling.

Sincerely, Lost Soul

RNA-Seq R cytoscape Forum • 780 views
7
8 months ago
Mensur Dlakic ★ 13k

It is not a simple question at all, especially if you have no background in this type of research. If I were you, I would say as much to my advisor. The last 6-12 months before a PhD defense are stressful enough even without this kind of request.

I took the following keywords from your description and plugged them into PubMed search field:

pathway network expression pattern associated genes disease hub proteins

The results are here. This should get you going in case you decide to stay with this project, and these search results may be helpful as well because Nature Methods has a detailed description of all the steps.

That said, it isn't something that can be properly explained in this type of communication. I think you would be best served by finding local help with this project.

4

+1 for

finding local help with this project

1

In the end I scoured the internet for fellow countrymen who have some knowledge of this. I found another PhD student doing NGS work. Although he does not know how to answer my complicated questions, he did teach me how to download an RNA-seq dataset, use Galaxy to process it, and save the output (I have done a lot of short DNA sequencing for genotyping, and that knowledge helps me understand RNA-seq data processing).

He explained things to me in my own language, which was very helpful. "Finding local help" is very useful indeed.

With my limited R knowledge, I was able to extract the fold change and p-value from the output and load them into a software tool called PathVisio (following an easy-to-understand YouTube tutorial), where I can at least visualise the DEGs on a user-made pathway. This is a start for me, and it's getting interesting, with just small hiccups now.
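
For readers following along, the extraction step described above can be sketched in a few lines. This is only an illustration in Python (the post itself used R), and it assumes a tab-separated results table with columns named `gene`, `log2FoldChange`, and `padj`; those column names, the cutoffs, and the example rows are placeholders, not the output of any particular Galaxy tool.

```python
import csv
import io

# Hypothetical differential-expression table. Column names and values are
# placeholders chosen for illustration, not real results.
EXAMPLE_TSV = """gene\tlog2FoldChange\tpadj
GENE1\t2.5\t0.001
GENE2\t-0.3\t0.40
GENE3\t-1.8\t0.02
GENE4\t1.2\tNA
"""


def select_degs(handle, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return (gene, log2FC, padj) tuples for rows passing both cutoffs."""
    selected = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            lfc = float(row["log2FoldChange"])
            padj = float(row["padj"])
        except ValueError:
            continue  # skip rows with non-numeric entries such as "NA"
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff:
            selected.append((row["gene"], lfc, padj))
    return selected


# In practice the handle would be an open file exported from the pipeline.
degs = select_degs(io.StringIO(EXAMPLE_TSV))
for gene, lfc, padj in degs:
    print(gene, lfc, padj)
```

The filtered list (gene identifier plus fold change and p-value) is the kind of table that pathway visualisation tools typically expect as input; the exact import format should be checked against the tool's own documentation.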

The example dataset given to me by my professor for comparison is from a microarray.

So the struggle continues....

4
8 months ago

You will need to learn to program; I suggest R or Python.

Start with simple code to get familiar with the language; a few weeks later, start branching out into modules and libraries that deal with gene ontologies, pathway analysis, and network science.

As you make your way through the problems various ideas and solutions will pop into your head.

Instead of looking for a solution for a very complex question, you need to embark on a journey that has the destination that you seek.

Write code every day.

0

I now use R to access the metadata from proteomic and RNA-seq studies and visualise it in plots.

Most of the proteomic and RNA-seq studies do not look at my gene/protein, but I can pull the results for my gene out of their metadata using R.

Accessing and extracting information from other studies' metadata really does need coding knowledge (hence googling R code = writing code every day).
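
As a rough illustration of that kind of lookup, the sketch below pulls one gene's value out of several result tables. It is written in Python rather than R, and everything in it is assumed for the example: the study names, the `gene` and `log2FC` column names, and the numbers are placeholders, and real supplementary tables will need their own parsing.

```python
import csv
import io

# Hypothetical result tables from two studies, inlined as CSV text.
# Study names, column names, and values are placeholders.
STUDIES = {
    "rnaseq_study": "gene,log2FC\nGENE_A,1.4\nGENE_B,-0.2\n",
    "proteomics_study": "gene,log2FC\nGENE_C,0.9\nGENE_A,-0.6\n",
}


def lookup_gene(studies, gene):
    """Map each study name to the gene's log2FC, or None if it is absent."""
    found = {}
    for name, text in studies.items():
        found[name] = None
        for row in csv.DictReader(io.StringIO(text)):
            if row["gene"] == gene:
                found[name] = float(row["log2FC"])
                break  # take the first matching row in this study
    return found


hits = lookup_gene(STUDIES, "GENE_A")
print(hits)
```

Returning `None` for a study where the gene is missing keeps "not measured" distinct from "measured but unchanged", which matters when most studies were not designed around the gene of interest.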

4
8 months ago

The good news is that likely little or no programming will be necessary (good, I guess, because you have to learn all this in such a short time, though programming remains a valuable skill).

For your specific question, I think GeneFriends would be a good place to start, but the site appears to be down for maintenance. There is a recent preprint. An alternative way to look for other genes that are relevant to gene A would be to use protein interaction databases, like STRING. If you want to figure out the expression of some gene in various cell types, have a look at GTEx.
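
If some scripting does turn out to be useful, one simple way to shortlist candidate "hub" proteins from an exported interaction list is to count how many distinct partners each protein has (its degree). The sketch below is in Python and assumes the interactions have already been reduced to pairs of names; the pairs shown are placeholders, not real interactions, and degree is only one of several possible hub definitions.

```python
from collections import defaultdict

# Hypothetical interaction pairs; the names are placeholders, not real data.
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("D", "E"),
    ("B", "A"),  # same interaction as A-B, listed in reverse order
]


def rank_by_degree(edges):
    """Return (protein, degree) pairs, highest degree first."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        # Sets make the graph undirected and drop duplicate pairs.
        neighbours[u].add(v)
        neighbours[v].add(u)
    # Sort by degree (descending), then by name for a stable order.
    return sorted(
        ((node, len(partners)) for node, partners in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


ranking = rank_by_degree(EDGES)
for protein, degree in ranking:
    print(protein, degree)
```

Proteins at the top of this ranking are only candidates; whether they matter biologically still has to be checked against the expression data and the lab results.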

1

I looked at GTEx, and the lists of co-expressed genes and TFs are very useful for me!