How to verify the authenticity of a data set downloaded from a database?
2.9 years ago
465336766 • 0

I have recently been analyzing a set of RNA-seq data, and I was asked how to verify the authenticity of a data set downloaded from a database. Are the data you get genuine? Do the samples match their labels?

data analysis • 770 views

Can you elaborate? What do you mean by authenticity? Something like md5sums? What is "true"?


I mean, for example, when you download data from a database that includes a treatment group and a control group, how do you know that the samples really are the treatment group and the control group, and that the identities have not been switched or something similar? Thank you.


how do you know that it's really the treatment group and the control group and not just switching identities

You don't. If there are known biological markers, e.g. in a cancer vs. normal comparison where the cancer is known to express certain genes that normal tissue does not, then you can check for those. Otherwise the metadata are simply whatever the authors provide (or don't). A proper analysis always includes some QC, e.g. a PCA for RNA-seq to see whether the clustering hints at switched labels, such as some normals clustering with the cancers and vice versa. Combined with checks of individual gene expression, this might tell you that something is odd, and you could then contact the authors (in case they respond). But this is all very custom; I doubt there is a simple automated procedure for things like that.
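To make the marker-gene idea concrete, here is a minimal sketch, not an established tool. It assumes you already have normalized, log-scale expression values per sample and a list of genes expected to differ between the groups. The gene names `MARKER_A` and `MARKER_B`, the sample names, and all numbers are made-up placeholders. It flags samples whose marker score sits closer to the other group than to their own labeled group. This is a simpler stand-in for eyeballing a PCA plot, and it can only raise a suspicion, not prove a label swap.

```python
from statistics import mean


def marker_score(sample_expr, marker_genes):
    """Average expression of the chosen marker genes in one sample."""
    return mean(sample_expr[g] for g in marker_genes)


def flag_suspect_labels(expression, labels, marker_genes):
    """Return samples whose marker score is nearer to another group's mean.

    expression:   dict sample -> dict gene -> normalized log expression
    labels:       dict sample -> group label taken from the metadata
    marker_genes: genes assumed to separate the groups (your own choice)
    """
    scores = {s: marker_score(expr, marker_genes) for s, expr in expression.items()}
    groups = sorted(set(labels.values()))
    suspects = []
    for sample, score in scores.items():
        distance = {}
        for group in groups:
            # Leave the sample itself out so it cannot pull its own group mean.
            others = [scores[t] for t in scores if t != sample and labels[t] == group]
            if others:
                distance[group] = abs(score - mean(others))
        nearest = min(distance, key=distance.get)
        if nearest != labels[sample]:
            suspects.append(sample)
    return sorted(suspects)


# Toy data: "normal3" is labeled normal but looks like the tumor samples.
expression = {
    "tumor1": {"MARKER_A": 8.0, "MARKER_B": 7.5},
    "tumor2": {"MARKER_A": 7.8, "MARKER_B": 7.9},
    "tumor3": {"MARKER_A": 8.2, "MARKER_B": 7.7},
    "normal1": {"MARKER_A": 2.1, "MARKER_B": 1.9},
    "normal2": {"MARKER_A": 2.4, "MARKER_B": 2.2},
    "normal3": {"MARKER_A": 7.9, "MARKER_B": 8.1},
}
labels = {s: ("tumor" if s.startswith("tumor") else "normal") for s in expression}

suspects = flag_suspect_labels(expression, labels, ["MARKER_A", "MARKER_B"])
print(suspects)  # ['normal3']
```

A flagged sample is only a reason to look closer, for example at the PCA and at the original publication, before contacting the authors.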


Thank you so much, Sir. Now I know what I probably need to do.
