Hello,

I have a data in the form of a dataframe that I downloaded from GTEx Portal. It contains RNASeq gene read counts used in their study.

```
> dim(expr.df)
[1] 55993 2921
> expr.df[1:10,1:2]
GTEX-N7MS-0007-SM-2D7W1 GTEX-N7MS-0008-SM-4E3JI
ENSG00000223972 0 0
ENSG00000227232 158 166
ENSG00000243485 0 0
ENSG00000237613 0 0
ENSG00000268020 0 0
ENSG00000240361 0 0
ENSG00000186092 0 0
ENSG00000238009 17 2
ENSG00000233750 35 0
ENSG00000237683 8489 34
```

I checked the sample information file and there is no information about the conditions. I want to normalize the raw counts. For that, I want to use DESeq's `getVarianceStabilizedData()`

function. However this function takes as an input a `CountDataSet`

object. So when I try to make a `CountDataSet`

object using this:

```
> cds <- newCountDataSet(countData = as.matrix(expr.df))
Error in is(conditions, "matrix") :
argument "conditions" is missing, with no default
```

It spits out an error asking me to specify the conditions. However, there are no conditions in this dataset. How can I normalize these values?

I think you're getting into variance there. You just want to normalize for the number of reads sequenced, right?

I believe DESeq still uses median normalization.

I don't know the commands in DESeq but if you want to do it by hand here is the basic process:

Do a scatter plot of your two condition, i.e. GTEX-N7MS-0007-SM-2D7W1 on the X axis and GTEX-N7MS-0008-SM-4E3JI on the Y axis.

a=get the median count value in GTEX-N7MS-0007-SM-2D7W1

b=get the median count value in GTEX-N7MS-0008-SM-4E3JI

your slope, and median normalization factor is b/a

plot the line through your data (y intercept =0) and see if it fits.

Sometimes it works, sometimes you need to use something different.