Clusters in Differential Gene Expression Data
5 months ago
Scofield • 0

Hi All,

I am new to bioinformatics and RNA-seq. I downloaded differential gene expression data for my study cohort from the GDC database. The data has a column named "cluster", and the same gene IDs occur multiple times under different cluster numbers, each time with different p-values and log2FC values. I am having a hard time understanding what this "cluster" column represents in my dataset, and how I should take it into account before filtering the data with thresholds on adjusted p-value and log2FC. I would appreciate any guidance from experts in the field.

Thanks in advance.

gdc gene-expression seurat • 600 views

Can you show the name of the file and its first 10 lines?
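If it helps, here is a minimal Python sketch for grabbing the file name and the first 10 lines to paste into a reply. The function names are made up for illustration, and it assumes the download is an uncompressed text file (TSV/CSV); a gzipped file would need `gzip.open` instead.

```python
from itertools import islice
from pathlib import Path


def first_lines(lines, n=10):
    """Return the first n items of an iterable of lines, without trailing newlines."""
    return [line.rstrip("\n") for line in islice(lines, n)]


def show_head(path, n=10):
    """Print the file name followed by its first n lines."""
    path = Path(path)
    # errors="replace" keeps the preview going even if the encoding is unexpected
    with path.open(encoding="utf-8", errors="replace") as handle:
        print(path.name)
        for line in first_lines(handle, n):
            print(line)
```

Calling `show_head("your_file.tsv")` (with your actual file name) prints what is being asked for above.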


Biostars would not let me post an answer; it rejected it as "not supported language".

5 months ago
rfran010 ★ 1.3k

I'm not familiar with the database, but maybe you figured it out already.

Generally, though, it depends on what the database defined as clusters. They should have some guidance in their documentation or literature. Might take some digging.

It sounds like groups of subjects may have been clustered and gene expression is then reported for each group, which would explain why gene IDs are repeated with different stats. E.g., if cluster 1 is skin cancer and cluster 2 is blood cancer, you can see Myc expression in each condition. But really, I don't know.
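If that reading is right, each (gene, cluster) row is its own comparison, so one way to handle it is to apply the thresholds within each cluster rather than collapsing a gene across clusters. Below is a rough sketch using only the Python standard library. The column names (`gene`, `cluster`, `avg_log2FC`, `p_val_adj`), the example rows, and the cutoffs are all assumptions for illustration, not taken from the actual GDC file, so adjust them to whatever your header contains.

```python
import csv
import io
from collections import defaultdict

# Made-up example rows; column names are hypothetical, not from the real file.
EXAMPLE_TSV = """gene\tcluster\tavg_log2FC\tp_val_adj
MYC\t1\t2.10\t0.001
MYC\t2\t-0.30\t0.600
TP53\t1\t0.20\t0.040
TP53\t2\t-1.80\t0.0005
"""


def significant_by_cluster(rows, padj_max=0.05, abs_lfc_min=1.0):
    """Group rows by cluster and keep those passing both thresholds."""
    hits = defaultdict(list)
    for row in rows:
        padj = float(row["p_val_adj"])
        lfc = float(row["avg_log2FC"])
        # A gene can pass in one cluster and fail in another.
        if padj <= padj_max and abs(lfc) >= abs_lfc_min:
            hits[row["cluster"]].append((row["gene"], lfc, padj))
    return dict(hits)


rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
result = significant_by_cluster(rows)
for cluster, genes in sorted(result.items()):
    print(cluster, genes)
```

In this toy table MYC passes only in cluster 1 and TP53 only in cluster 2, which is the kind of per-cluster result you would lose by filtering on gene ID alone. Whether the clusters are patient groups, cell clusters, or something else still has to come from the database documentation.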
