Is it a good idea to use two different quantification methods from TCGA at the same time?
9.9 years ago
jack ▴ 960

I want to get expression data from TCGA for my cancer of interest. About half of the data are RNASeqV2 and the rest are RNASeq.

This is from TCGA:

RNASeq Version 2 is similar to RNASeq in that it uses sequencing data to determine gene expression levels. RNASeq Version 2 uses a different set of algorithms to determine the expression levels and the results are presented in a slightly different set of files.

There are two analysis pipelines used to create Level 3 expression data from RNA Sequence data. The first approach used at TCGA relies on the RPKM method, while the second method uses MapSplice to do the alignment and RSEM to perform the quantitation.

I want to use this data to build a regulatory network. Should I use only RNASeq or only RNASeqV2, or can I mix them and use all samples in my model? What are the problems or disadvantages of using both? (Some samples come from RNASeqV2 and others from RNASeq.)

tcga RNA-Seq next-gen • 2.2k views
9.9 years ago

I would use the dataset that maximizes the sample size (which I would guess to be V2).
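As a rough sketch of that selection step, assuming you have a manifest that lists which pipeline each sample was quantified with (the sample IDs, labels, and the `pick_largest_pipeline` helper below are made up for illustration, not TCGA file names or an existing API):

```python
from collections import Counter

# Hypothetical manifest: (sample ID, pipeline label). Values are invented.
manifest = [
    ("sample_01", "RNASeqV2"),
    ("sample_02", "RNASeqV2"),
    ("sample_03", "RNASeq"),
    ("sample_04", "RNASeqV2"),
    ("sample_05", "RNASeq"),
]

def pick_largest_pipeline(manifest):
    """Return the pipeline with the most samples and the samples it covers."""
    counts = Counter(pipeline for _, pipeline in manifest)
    best, _ = counts.most_common(1)[0]
    kept = [sample for sample, pipeline in manifest if pipeline == best]
    return best, kept

best, kept = pick_largest_pipeline(manifest)
print(best, len(kept))
```

Everything downstream then uses only the samples in `kept`, so all values come from one pipeline.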

Isoform expression levels will vary if you use a different tool for mRNA quantification. Gene-level quantification should be more similar between tools (and is what I would recommend using anyway), but it is best to avoid potential sources of bias if you can.
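To illustrate one source of that bias, here is a toy sketch that computes two common expression units from the same read counts. It uses the textbook RPKM and TPM formulas as stand-ins for "two pipelines reporting different units"; it is not a reimplementation of either TCGA pipeline, and the counts and gene lengths are invented.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts)
    return [c * 1e9 / (l * total) for c, l in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    denom = sum(rates)
    return [r / denom * 1e6 for r in rates]

# Invented toy sample: three genes with their read counts and lengths.
counts = [100, 400, 500]
lengths_bp = [1000, 2000, 5000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for r, t in zip(rpkm_values, tpm_values):
    print(f"RPKM={r:.1f}  TPM={t:.1f}")
```

The two units rank the genes identically within this sample, but the absolute values differ by a factor that depends on the sample. If half of your samples are in one unit and half in the other, a network inference method can pick up the pipeline difference as if it were biology.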

I would expect that all of the older samples have been re-run with the latest pipeline. You can check the publication data site to see which data are listed. For example, I only see V2 quantification for the latest publication.

