row number instead of gene name in DESeq2 result file
6.4 years ago
Bioinfonext ▴ 470

After a DESeq2 run, the results table shows row numbers instead of gene names. Could you please suggest what the mistake is?

> countMatrix = read.table("RPR_count.txt",header=T,sep='\t',check.names=F)

> dim(countMatrix)

[1] 32960    53

> colData <- read.csv("Metadata_RPR.csv", check.names=F)

> dim(colData)

[1] 52  5

> head (countMatrix)
                  HRPR_0D_V8_R1 LRPR_0D_V8_R1 HRPR_0D_V8_R2 LRPR_0D_V8_R2
1 ENSRNA049455608            27            15            13            20
2 ENSRNA049455955            50            31            66            20
3 ENSRNA049457359           149           325           146           203
4 ENSRNA049457570            52            25            30            22
5 ENSRNA049458758            16            19            44            32
6 ENSRNA049458986             6            14            17            20

> head(colData)
                Replication   Tissue Stage Genotype
1 HRPR_0D_V8_R1          R1 Seedling    0D     HRPR
2 LRPR_0D_V8_R1          R1 Seedling    0D     LRPR
3 HRPR_0D_V8_R2          R2 Seedling    0D     HRPR
4 LRPR_0D_V8_R2          R2 Seedling    0D     LRPR
5 HRPR_3D_V8_R1          R1 Seedling    3D     HRPR

# Use the first column of colData as row names:

SampleInfo <- colData[,-1] 
rownames(SampleInfo) <- colData[,1]
dds <- DESeqDataSetFromMatrix(countData = countMatrix, colData = SampleInfo, design = ~ geno_stage)

In the result file I am getting row numbers instead of gene names:

"","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"1",2133.39832753799,-1.48791540218224,0.133582135187263,-11.1385807697743,8.14062484586051e-29,2.12348199104271e-24
"2",518.774465513072,22.415307417241,2.21105823446886,10.1378186552493,3.75388225192356e-24,4.8960009270713e-20
R rna-seq • 4.1k views

When you read the data in you should have indicated that your rows have names in column 1 (e.g. row.names=1). Then instead of row numbers you should get gene names.
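
A minimal sketch of this suggestion, using a small temporary file in place of RPR_count.txt. The file contents, the counts, and the "GeneID" header label are made up for illustration; the real file may be laid out differently.

```r
# Toy stand-in for RPR_count.txt: gene IDs in column 1, counts after it.
# The "GeneID" header label and the values are illustrative assumptions.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "GeneID\tHRPR_0D_V8_R1\tLRPR_0D_V8_R1",
  "ENSRNA049455608\t27\t15",
  "ENSRNA049455955\t50\t31"
), tmp)

# row.names = 1 tells read.table to use column 1 as the row names
countMatrix <- read.table(tmp, header = TRUE, sep = "\t",
                          check.names = FALSE, row.names = 1)

print(rownames(countMatrix))  # gene IDs rather than 1, 2, ...
print(dim(countMatrix))       # only the sample columns remain
```

With the gene IDs held as row names, the table contains only count columns, which is the layout DESeqDataSetFromMatrix expects for countData.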

Also, ensure the number of fields in the first line is 1 less than the number of values in subsequent lines. That's how read.csv knows to use the first column as row names automatically.

> ?read.csv
..
..
  header: a logical value indicating whether the file contains the
           names of the variables as its first line.  If missing, the
           value is determined from the file format: ‘header’ is set to
           ‘TRUE’ if and only if the first row contains one fewer field
           than the number of columns.
..
..
..
row.names: a vector of row names.  This can be a vector giving the
           actual row names, or a single number giving the column of the
           table which contains the row names, or character string
           giving the name of the table column containing the row names.

           If there is a header and the first row contains one fewer
           field than the number of columns, the first column in the
           input is used for the row names.  Otherwise if ‘row.names’ is
           missing, the rows are numbered.
..
..
..
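
The rule quoted above can be checked with two small files. This is only a sketch: the file contents are invented to resemble Metadata_RPR.csv, and the variable names `auto` and `numbered` are hypothetical.

```r
# Case 1: header has 4 fields, data rows have 5 fields.
# read.csv then uses the first column as row names automatically.
f1 <- tempfile(fileext = ".csv")
writeLines(c(
  "Replication,Tissue,Stage,Genotype",
  "HRPR_0D_V8_R1,R1,Seedling,0D,HRPR",
  "LRPR_0D_V8_R1,R1,Seedling,0D,LRPR"
), f1)
auto <- read.csv(f1, check.names = FALSE)

# Case 2: header starts with an empty "" field, so it has 5 fields too.
# The first column stays an ordinary column and the rows are numbered.
f2 <- tempfile(fileext = ".csv")
writeLines(c(
  '"",Replication,Tissue,Stage,Genotype',
  "HRPR_0D_V8_R1,R1,Seedling,0D,HRPR",
  "LRPR_0D_V8_R1,R1,Seedling,0D,LRPR"
), f2)
numbered <- read.csv(f2, check.names = FALSE)

print(rownames(auto))      # sample names
print(rownames(numbered))  # row numbers
```

Passing row.names = 1 explicitly avoids relying on the header layout at all.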

I did not understand it.

My CSV file has 5 columns and R shows all five; it did not take the first column as rownames.

colData <- read.csv("Metadata_RPR.csv", check.names=F)

> dim(colData)

[1] 52  5

> head (colData)
                Replication   Tissue Stage Genotype
1 HRPR_0D_V8_R1          R1 Seedling    0D     HRPR
2 LRPR_0D_V8_R1          R1 Seedling    0D     LRPR
3 HRPR_0D_V8_R2          R2 Seedling    0D     HRPR
4 LRPR_0D_V8_R2          R2 Seedling    0D     LRPR
5 HRPR_3D_V8_R1          R1 Seedling    3D     HRPR
6 LRPR_3D_V8_R1          R1 Seedling    3D     LRPR

Thanks

From the command line, run head -n 1 Metadata_RPR.csv and check whether the first line starts with a "", (or some sort of invisible character). That counts as a column, so the header has as many fields as the data rows and R does not treat the first column as row names.
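
The same check can also be done from within R. A minimal sketch, assuming the metadata file starts with an empty quoted field as in the results file shown in the question; the temporary file here is an invented stand-in for Metadata_RPR.csv.

```r
# Invented stand-in for Metadata_RPR.csv with a leading "" in the header
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  '"","Replication","Tissue","Stage","Genotype"',
  '"HRPR_0D_V8_R1","R1","Seedling","0D","HRPR"'
), tmp)

# Look at the raw header line, equivalent to head -n 1 on the command line
header_line <- readLines(tmp, n = 1)
print(header_line)

# Count the comma-separated fields on each line of the file
n_fields <- count.fields(tmp, sep = ",")
print(n_fields)

# Equal counts mean the header is not one field short, so read.csv
# will number the rows instead of using column 1 as row names
header_is_short <- (n_fields[2] - n_fields[1]) == 1
print(header_is_short)
```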

Could you (Ram) please explain further? I only understood genomax's point.

If your gene names are set as the rownames of your countMatrix, they should also appear in the results tables. Unfortunately, you have done none of the following, which means that we are just hypothesising about what the problem could be:

  • you have not shown all code that you used for data processing
  • you have not shown a sample of the DESeq2 results table (within R)
  • you have not shown the command that you are using to write your data to disk
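
To illustrate why the write step matters, here is a sketch of how write.csv handles row names. The data frame and its values are made up; in a real run it would presumably come from something like as.data.frame(results(dds)), which is an assumption about the poster's workflow.

```r
# Made-up stand-in for a results table converted to a data frame
res_df <- data.frame(baseMean = c(2133.4, 518.8),
                     log2FoldChange = c(-1.49, 22.42))

# With default row names, write.csv emits "1", "2", ... in the first
# column, under an empty "" header
out_numbered <- tempfile(fileext = ".csv")
write.csv(res_df, out_numbered)
numbered_lines <- readLines(out_numbered)
print(numbered_lines)

# With gene IDs as row names, the same call writes the IDs instead
rownames(res_df) <- c("ENSRNA049455608", "ENSRNA049455955")
out_named <- tempfile(fileext = ".csv")
write.csv(res_df, out_named)
named_lines <- readLines(out_named)
print(named_lines)
```

So a first column of "1", "2", ... in the output file suggests that the row names were already plain numbers before the table was written.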

Thanks a lot. It is resolved.

Good work

What was the problem?

I just had to set the first column of countMatrix as its rownames.
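
For a table that is already loaded, that fix can be sketched as follows, mirroring what the question does for colData. The toy values and the "GeneID" column name are assumptions for illustration.

```r
# Toy stand-in for countMatrix as read in the question:
# gene IDs sit in the first column and the rows are numbered
countMatrix <- data.frame(
  GeneID = c("ENSRNA049455608", "ENSRNA049455955"),  # hypothetical name
  HRPR_0D_V8_R1 = c(27, 50),
  LRPR_0D_V8_R1 = c(15, 31),
  check.names = FALSE
)

# Move the gene IDs into the row names, then drop that column
rownames(countMatrix) <- countMatrix[, 1]
countMatrix <- countMatrix[, -1]

# Convert to a numeric matrix; the row names are kept
countMatrix <- as.matrix(countMatrix)
print(countMatrix)
```

Assigning row names this way fails if the gene IDs contain duplicates, so it is worth checking with anyDuplicated() on that column first.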

No, what was the problem in your initial approach? What did you have to correct to get it working? Right now, this post has suggestions on what could have been the problem but not a concrete pointer on what the actual problem was.
