DESeq2 input from GDAC firehose
4 months ago
JB lee • 0

Hi guys, I hope you are fine.

English is not my first language, so if anything in my question is unclear, please feel free to ask. I'm a beginner in bioinformatics, and I want to practice differentially expressed gene (DEG) analysis in R.

The RNA-seq data I used were downloaded from Broad GDAC Firehose. There are two types of non-normalized data: one is "illuminahiseq_rnaseq-gene_expression (MD5)" and the other is "illuminahiseq_rnaseqv2-RSEM_genes (MD5)". (I downloaded both because, as far as I know, DESeq2 expects non-normalized data.)

Both files have a raw count column, but the values in them differ. I suspect that A's raw_counts are true raw counts and that B's raw_count values are expected counts estimated by RSEM.

Below are a few rows from each file.

A) illuminahiseq_rnaseq-gene_expression (MD5)

gene                        raw_counts   median_length_normalized   RPKM
AADACL3|126767_calculated   36           0.6686                     0.0539

B) illuminahiseq_rnaseqv2-RSEM_genes (MD5)

gene_id          raw_count   scaled_estimate   transcript_id
A1BG|1           247.2       2.27E-06          uc002qsd.3,uc002qsf.1
A1CF|29974       0           0                 uc001jjh.2,uc001jji.2,uc001jjj.2,uc001jjk.1,uc009xov.2,uc010qhn.1,uc010qho.1
AADACL3|126767   16          7.38E-08          uc001aug.1,uc009vnn.1
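One quick way to see the difference between the two columns is to test whether the values are whole numbers. The following is only a minimal sketch in base R using the example rows above. The helper name `is_integer_counts` and the tolerance are my own assumptions, not part of DESeq2 or Firehose. Fractional values such as 247.2 would be consistent with RSEM expected counts, although this check alone does not prove how either file was produced.

```r
# Values copied from the example rows above (A: gene_expression, B: RSEM_genes).
counts_a <- c("AADACL3|126767_calculated" = 36)
counts_b <- c("A1BG|1" = 247.2, "A1CF|29974" = 0, "AADACL3|126767" = 16)

# Hypothetical helper: TRUE only if every value is a non-negative whole number.
is_integer_counts <- function(x, tol = 1e-8) {
  all(!is.na(x)) && all(x >= 0) && all(abs(x - round(x)) < tol)
}

is_integer_counts(counts_a)  # TRUE: looks like plain read counts
is_integer_counts(counts_b)  # FALSE: 247.2 is fractional

# One possible workaround (an assumption, not a recommendation from the source):
# round the fractional estimates and store them as integers, since a count
# matrix for DESeq2 has to contain integer values.
rounded_b <- round(counts_b)
storage.mode(rounded_b) <- "integer"
rounded_b
```

Rounding throws away the fractional part of the estimate, so whether it is acceptable compared with a tximport-style import is exactly the question asked below.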

Which of these two data types would you recommend using?

My guess is that I should use DESeqDataSetFromTximport() with the RSEM raw counts and DESeqDataSet() with the other data. Is that right?

Thank you.

DESeq2 counts raw • 135 views