Boxplot each row in a dataset in R?

5.7 years ago

bio94 ▴ 60

How do I boxplot each row in a dataset in R?

In the dataset below, I want to plot RF.CMS1.posteriorProb, RF.CMS2.posteriorProb, RF.CMS3.posteriorProb and RF.CMS4.posteriorProb for each GSM sample in column X. That is, I want a separate boxplot for each row (sample), summarising that row's four posterior probabilities.

I would appreciate any help with this. Many thanks.

    head(GSE14333_pheno_new)
          X Location DukesStage Age Gender DFSTime DFS_group DFSCens AdjXRT AdjCTX
1 GSM358387   Rectum          B  54      M    9.96      poor       0      Y      Y
2 GSM358392    Right          B  38      F   17.95      poor       1      N      Y
3 GSM358395    Right          B  78      F   22.02      poor       1      N      Y
4 GSM358396     Left          B  65      F   22.38      poor       0      Y      Y
5 GSM358397     Left          B  65      F   22.38      poor       0      Y      Y
6 GSM358399     Left          B  56      F   25.21      poor       0      Y      Y
  RF.CMS1.posteriorProb RF.CMS2.posteriorProb RF.CMS3.posteriorProb RF.CMS4.posteriorProb
1                  0.20                  0.34                  0.40                  0.06
2                  0.46                  0.06                  0.03                  0.45
3                  0.76                  0.02                  0.03                  0.19
4                  0.10                  0.78                  0.00                  0.12
5                  0.01                  0.95                  0.04                  0.00
6                  0.35                  0.42                  0.22                  0.01
  RF.nearestCMS RF.predictedCMS predict.label2 dist.to.template dist.to.cls1.rank  nominal.p
1          CMS3            <NA>         CRIS-B        0.7331209                68 0.00019996
2          CMS1            <NA>         CRIS-A        0.8965833                52 0.00739852
3          CMS1            CMS1         CRIS-B        0.8559375                80 0.00019996
4          CMS2            CMS2         CRIS-C        0.7944693               111 0.00019996
5          CMS2            CMS2         CRIS-C        0.8465627               120 0.00179964
6          CMS2            <NA>         CRIS-D        0.9366855               148 0.00719856
        BH.FDR Bonferroni.p
1 0.0006725928    0.0369926
2 0.0102143750    1.0000000
3 0.0006725928    0.0369926
4 0.0006725928    0.0369926
5 0.0026849469    0.3329334
6 0.0100130350    1.0000000

boxplot plot dataset R cancer • 7.9k views

ADD COMMENT • link updated 5.7 years ago by cpad0112 21k • written 5.7 years ago by bio94 ▴ 60

bio94 : If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.
Upvote|Bookmark|Accept

ADD REPLY • link 5.7 years ago by GenoMax 141k

5.7 years ago

Benn 8.3k

Whether this fits in one plot depends on how many samples you have. But let's say you have only 6 samples, as in your example; then you could get a boxplot like this:

boxplot(t(GSE14333_pheno_new[,11:14]))

or

boxplot(t(GSE14333_pheno_new[1:6,11:14]))
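As a self-contained sketch of the transpose approach, the snippet below rebuilds only the sample IDs and the four posterior-probability columns from the six rows printed in the question. The names `pheno`, `prob_cols` and `m`, and the choice to pick columns by name pattern rather than by position 11:14, are illustrative assumptions and not part of the original answer.

```r
# Toy data frame rebuilt from the six rows shown in the question
# (sample ID plus the four posterior-probability columns only).
pheno <- data.frame(
  X = c("GSM358387", "GSM358392", "GSM358395",
        "GSM358396", "GSM358397", "GSM358399"),
  RF.CMS1.posteriorProb = c(0.20, 0.46, 0.76, 0.10, 0.01, 0.35),
  RF.CMS2.posteriorProb = c(0.34, 0.06, 0.02, 0.78, 0.95, 0.42),
  RF.CMS3.posteriorProb = c(0.40, 0.03, 0.03, 0.00, 0.04, 0.22),
  RF.CMS4.posteriorProb = c(0.06, 0.45, 0.19, 0.12, 0.00, 0.01),
  stringsAsFactors = FALSE
)

# Pick the probability columns by name instead of by position
prob_cols <- grep("posteriorProb$", names(pheno), value = TRUE)

# boxplot() draws one box per matrix column, so transpose:
# after t(), each sample (row of pheno) is a column
m <- t(as.matrix(pheno[, prob_cols]))
colnames(m) <- pheno$X

# las = 2 rotates the sample labels so they do not overlap
bp <- boxplot(m, las = 2, ylab = "Posterior probability")
```

The list returned by `boxplot()` reports how many values went into each box (`bp$n`), which is a quick check that every sample contributes exactly four values.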

ADD COMMENT • link 5.7 years ago by Benn 8.3k

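Adding to the answer above, you can label the boxes with the sample names (as.character() guards against X being a factor, in which case c() could return integer codes in older R versions):

boxplot(t(GSE14333_pheno_new[,11:14]), names=as.character(GSE14333_pheno_new$X))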

ADD REPLY • link 5.7 years ago by cpad0112 21k

5.7 years ago

zx8754 11k

With ggplot2, first convert the data from wide to long format, then plot. For example:

library(tidyverse)

# reproducible example data
set.seed(1); dat <- data.frame(X = paste0("sample", 1:6),
                               c1 = runif(6),
                               c2 = runif(6),
                               c3 = runif(6))

# convert wide-to-long format
plotDat <- gather(dat, key = "key", value = "value", -X)

# plot
ggplot(plotDat, aes(X, value)) +
  geom_boxplot()
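In tidyr 1.0.0 and later, `gather()` is superseded by `pivot_longer()`. The following is a minimal sketch of the same wide-to-long step applied to the column names from the question, assuming a toy data frame rebuilt from the six printed rows. The names `pheno`, `long`, `subtype` and `prob` are made up for illustration.

```r
library(tidyr)
library(ggplot2)

# Toy data frame rebuilt from the six rows shown in the question
pheno <- data.frame(
  X = c("GSM358387", "GSM358392", "GSM358395",
        "GSM358396", "GSM358397", "GSM358399"),
  RF.CMS1.posteriorProb = c(0.20, 0.46, 0.76, 0.10, 0.01, 0.35),
  RF.CMS2.posteriorProb = c(0.34, 0.06, 0.02, 0.78, 0.95, 0.42),
  RF.CMS3.posteriorProb = c(0.40, 0.03, 0.03, 0.00, 0.04, 0.22),
  RF.CMS4.posteriorProb = c(0.06, 0.45, 0.19, 0.12, 0.00, 0.01),
  stringsAsFactors = FALSE
)

# Wide to long: one row per (sample, subtype) pair
long <- pivot_longer(
  pheno,
  cols      = ends_with("posteriorProb"),
  names_to  = "subtype",
  values_to = "prob"
)

# One box per sample; the four underlying values are overlaid as
# points because a box built from four numbers hides a lot
p <- ggplot(long, aes(x = X, y = prob)) +
  geom_boxplot() +
  geom_point(aes(colour = subtype)) +
  labs(x = "Sample", y = "Posterior probability")
print(p)
```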

ADD COMMENT • link 5.7 years ago by zx8754 11k

Base plotting with the long-format data above:

boxplot(value ~ X, data = plotDat) # plain boxplot
boxplot(value ~ X, data = plotDat, col = rainbow(length(unique(plotDat$X)))) # add some colors; unique() also works when X is character, where levels() returns NULL
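If the tidyverse is not available, the long format can also be built in base R. This is a rough sketch, assuming a toy data frame rebuilt from the six rows in the question; the names `pheno`, `prob_cols` and `long` are illustrative.

```r
# Toy data frame rebuilt from the six rows shown in the question
pheno <- data.frame(
  X = c("GSM358387", "GSM358392", "GSM358395",
        "GSM358396", "GSM358397", "GSM358399"),
  RF.CMS1.posteriorProb = c(0.20, 0.46, 0.76, 0.10, 0.01, 0.35),
  RF.CMS2.posteriorProb = c(0.34, 0.06, 0.02, 0.78, 0.95, 0.42),
  RF.CMS3.posteriorProb = c(0.40, 0.03, 0.03, 0.00, 0.04, 0.22),
  RF.CMS4.posteriorProb = c(0.06, 0.45, 0.19, 0.12, 0.00, 0.01),
  stringsAsFactors = FALSE
)
prob_cols <- grep("posteriorProb$", names(pheno), value = TRUE)

# unlist() concatenates the columns one after another, so the sample
# IDs repeat as a block (times) and each column name repeats per row (each)
long <- data.frame(
  X     = rep(pheno$X, times = length(prob_cols)),
  key   = rep(prob_cols, each = nrow(pheno)),
  value = unlist(pheno[prob_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# One colour per sample; the formula interface groups values by X
n_samples <- length(unique(long$X))
bp <- boxplot(value ~ X, data = long, col = rainbow(n_samples))
```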

ADD REPLY • link 5.7 years ago by cpad0112 21k

5.7 years ago

marongiu.luigi ▴ 710

I usually work with lattice for this kind of multivariate analysis. I have rearranged the data as follows:

sample <- c(rep("GSM358387",6), rep("GSM358392",6), 
            rep("GSM358395",6), rep("GSM358396",6))
type <- c(rep(c("RF.CMS1.posteriorProb", "RF.CMS2.posteriorProb",
                "RF.CMS3.posteriorProb", "RF.CMS4.posteriorProb"),6))
response <- c(0.2, 0.46, 0.76, 0.1, 0.01, 0.35,
              0.34, 0.06, 0.02, 0.78, 0.95, 0.42,
              0.40, 0.03, 0.03, 0.00, 0.04, 0.22,
              0.06, 0.45, 0.19, 0.12, 0.00, 0.01)
X <- data.frame(sample, type, response)

> X
      sample                  type response
1  GSM358387 RF.CMS1.posteriorProb     0.20
2  GSM358387 RF.CMS2.posteriorProb     0.46
3  GSM358387 RF.CMS3.posteriorProb     0.76
4  GSM358387 RF.CMS4.posteriorProb     0.10
5  GSM358387 RF.CMS1.posteriorProb     0.01
6  GSM358387 RF.CMS2.posteriorProb     0.35
7  GSM358392 RF.CMS3.posteriorProb     0.34
8  GSM358392 RF.CMS4.posteriorProb     0.06
9  GSM358392 RF.CMS1.posteriorProb     0.02
10 GSM358392 RF.CMS2.posteriorProb     0.78
11 GSM358392 RF.CMS3.posteriorProb     0.95
12 GSM358392 RF.CMS4.posteriorProb     0.42
13 GSM358395 RF.CMS1.posteriorProb     0.40
14 GSM358395 RF.CMS2.posteriorProb     0.03
15 GSM358395 RF.CMS3.posteriorProb     0.03
16 GSM358395 RF.CMS4.posteriorProb     0.00
17 GSM358395 RF.CMS1.posteriorProb     0.04
18 GSM358395 RF.CMS2.posteriorProb     0.22
19 GSM358396 RF.CMS3.posteriorProb     0.06
20 GSM358396 RF.CMS4.posteriorProb     0.45
21 GSM358396 RF.CMS1.posteriorProb     0.19
22 GSM358396 RF.CMS2.posteriorProb     0.12
23 GSM358396 RF.CMS3.posteriorProb     0.00
24 GSM358396 RF.CMS4.posteriorProb     0.01

Then I used the bwplot function from lattice:

library(lattice)
bwplot(
    sample ~ response|type,
    X,
    groups = type
)

and I got this: [plot]. I guess you can rearrange the values and groups as you like by playing around with the parameters, but I think this should do.

ADD COMMENT • link updated 5.7 years ago by GenoMax 141k • written 5.7 years ago by marongiu.luigi ▴ 710

Are you sure about this? It seems you divide data of 6 samples over 4 samples now...

You plot every "RF.CMSX.posteriorProb" separately, but each sample has only one value for each, so 4 boxplots wouldn't make sense. I think OP wants one boxplot per sample covering all 4 values, RF.CMS1.posteriorProb to RF.CMS4.posteriorProb.

ADD REPLY • link 5.7 years ago by Benn 8.3k

It might be how I wrote down the data frame: each sample has 6 entries, but there are only 4 types of response. With this configuration:

sample <- c(rep(c("GSM358387",  "GSM358392",    
            "GSM358395",    "GSM358396"),6))
type <- c(rep(c("RF.CMS1.posteriorProb", "RF.CMS2.posteriorProb",
          "RF.CMS3.posteriorProb", "RF.CMS4.posteriorProb"),6))
response <- c(0.2,  0.46,   0.76,   0.1,    0.01,   0.35,
              0.34, 0.06,   0.02,   0.78,   0.95,   0.42,
              0.40, 0.03,   0.03,   0.00,   0.04,   0.22,
              0.06, 0.45,   0.19,   0.12,   0.00,   0.01)
X <- data.frame(sample, type, response)

library(lattice)
bwplot(
    sample ~ response|type,
    X,
    groups = type
)

there is a boxplot per sample: [plot]. Lattice makes it easy to group data; changing the parameters lets you group the data as needed.

ADD REPLY • link 5.7 years ago by marongiu.luigi ▴ 710

I agree that you make nice plots, but they are not correct. In OP's example we have 6 samples, each with 4 entries. But in your first example you have 4 samples, and some samples have more entries than others; they are mixed up. In your second example you have 4 samples, each of which seems to have 6 entries of just one type. For example, GSM358395 has data only for RF.CMS3.posteriorProb. I hope you understand what I am talking about...

ADD REPLY • link 5.7 years ago by Benn 8.3k

Sorry, I had posted the data frame to show how I built it, since it was difficult to parse in R. The data frame I have built now is:

> X
      sample                  type response
1  GSM358387 RF.CMS1.posteriorProb     0.20
2  GSM358392 RF.CMS1.posteriorProb     0.46
3  GSM358395 RF.CMS1.posteriorProb     0.76
4  GSM358396 RF.CMS1.posteriorProb     0.10
5  GSM358397 RF.CMS1.posteriorProb     0.01
6  GSM358399 RF.CMS1.posteriorProb     0.35
7  GSM358387 RF.CMS2.posteriorProb     0.34
8  GSM358392 RF.CMS2.posteriorProb     0.06
9  GSM358395 RF.CMS2.posteriorProb     0.02
10 GSM358396 RF.CMS2.posteriorProb     0.78
11 GSM358397 RF.CMS2.posteriorProb     0.95
12 GSM358399 RF.CMS2.posteriorProb     0.42
13 GSM358387 RF.CMS3.posteriorProb     0.40
14 GSM358392 RF.CMS3.posteriorProb     0.03
15 GSM358395 RF.CMS3.posteriorProb     0.03
16 GSM358396 RF.CMS3.posteriorProb     0.00
17 GSM358397 RF.CMS3.posteriorProb     0.04
18 GSM358399 RF.CMS3.posteriorProb     0.22
19 GSM358387 RF.CMS4.posteriorProb     0.06
20 GSM358392 RF.CMS4.posteriorProb     0.45
21 GSM358395 RF.CMS4.posteriorProb     0.19
22 GSM358396 RF.CMS4.posteriorProb     0.12
23 GSM358397 RF.CMS4.posteriorProb     0.00
24 GSM358399 RF.CMS4.posteriorProb     0.01

In this figure, there are 6 samples with one entry for each of the 4 groups RF.CMSX.posteriorProb: [plot]

ADD REPLY • link 5.7 years ago by marongiu.luigi ▴ 710

This looks more like it, but as you can see, there is only 1 data point per entry per sample, so no boxes can be drawn (only a point, with a blue bar at its mean).

ADD REPLY • link 5.7 years ago by Benn 8.3k

That's because there is only one entry per sample per group. For instance, I read that GSM358387 has a single value of 0.20 for RF.CMS1.posteriorProb. With multiple entries per sample the boxes will grow correspondingly, as illustrated in the previous figures.
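The collapse to a single point can be checked numerically with `boxplot.stats()` from base R (package grDevices), which returns the five numbers a box is drawn from. This is a small sketch using the values of GSM358387 from the question; the variable names are illustrative.

```r
# One value per group: whiskers, hinges and median all coincide,
# so the "box" has zero height
one_value <- boxplot.stats(0.20)$stats

# All four posterior probabilities of the same sample pooled:
# the five numbers now differ, so a real box can be drawn
four_values <- boxplot.stats(c(0.20, 0.34, 0.40, 0.06))$stats

print(one_value)
print(four_values)
```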

ADD REPLY • link 5.7 years ago by marongiu.luigi ▴ 710

I know, OP wanted all 4 in one box for each sample...

ADD REPLY • link 5.7 years ago by Benn 8.3k

In that case

bwplot(
    sample ~ response,
    X
)

will do that: [plot]
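For completeness, here is a sketch of that final lattice call run end to end on a toy data frame rebuilt from the six rows in the question, so that each sample has exactly one value per posterior-probability column. The names `pheno`, `prob_cols` and `long` are illustrative, and `sample` is made a factor explicitly so that it is treated as the grouping variable.

```r
library(lattice)

# Toy data frame rebuilt from the six rows shown in the question
pheno <- data.frame(
  X = c("GSM358387", "GSM358392", "GSM358395",
        "GSM358396", "GSM358397", "GSM358399"),
  RF.CMS1.posteriorProb = c(0.20, 0.46, 0.76, 0.10, 0.01, 0.35),
  RF.CMS2.posteriorProb = c(0.34, 0.06, 0.02, 0.78, 0.95, 0.42),
  RF.CMS3.posteriorProb = c(0.40, 0.03, 0.03, 0.00, 0.04, 0.22),
  RF.CMS4.posteriorProb = c(0.06, 0.45, 0.19, 0.12, 0.00, 0.01),
  stringsAsFactors = FALSE
)
prob_cols <- grep("posteriorProb$", names(pheno), value = TRUE)

# Long format: six samples times four types gives 24 rows
long <- data.frame(
  sample   = factor(rep(pheno$X, times = length(prob_cols))),
  type     = rep(prob_cols, each = nrow(pheno)),
  response = unlist(pheno[prob_cols], use.names = FALSE)
)

# No conditioning on type: all four values of a sample go into one box
p <- bwplot(sample ~ response, data = long, xlab = "Posterior probability")
print(p)  # lattice objects are drawn when printed
```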

ADD REPLY • link 5.7 years ago by marongiu.luigi ▴ 710

5.7 years ago

cpad0112 21k

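@OP, if x-axis titles are not necessary, you can use the apply function:

par(mfrow=c(1,nrow(GSE14333_pheno_new)))  # one panel per row (sample), side by side
invisible(apply(GSE14333_pheno_new[,c(11:14)],1,boxplot))  # invisible() hides the returned stats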
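If the panels should carry the sample IDs after all, a plain loop makes it easy to pass a title to each call. The sketch below assumes a toy data frame rebuilt from the six rows in the question; the names `pheno`, `prob_cols` and `stats`, the margin settings and the fixed y-axis range are illustrative choices. One panel per row is only practical for a handful of samples.

```r
# Toy data frame rebuilt from the six rows shown in the question
pheno <- data.frame(
  X = c("GSM358387", "GSM358392", "GSM358395",
        "GSM358396", "GSM358397", "GSM358399"),
  RF.CMS1.posteriorProb = c(0.20, 0.46, 0.76, 0.10, 0.01, 0.35),
  RF.CMS2.posteriorProb = c(0.34, 0.06, 0.02, 0.78, 0.95, 0.42),
  RF.CMS3.posteriorProb = c(0.40, 0.03, 0.03, 0.00, 0.04, 0.22),
  RF.CMS4.posteriorProb = c(0.06, 0.45, 0.19, 0.12, 0.00, 0.01),
  stringsAsFactors = FALSE
)
prob_cols <- grep("posteriorProb$", names(pheno), value = TRUE)

# One narrow panel per sample; save the old settings to restore them
op <- par(mfrow = c(1, nrow(pheno)), mar = c(2, 2, 2, 0.5))

stats <- vector("list", nrow(pheno))
for (i in seq_len(nrow(pheno))) {
  # unlist() turns the one-row data frame into a numeric vector;
  # a shared ylim keeps the panels comparable
  stats[[i]] <- boxplot(unlist(pheno[i, prob_cols]),
                        main = pheno$X[i], ylim = c(0, 1))
}

par(op)  # restore the previous graphics settings
```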

ADD COMMENT • link 5.7 years ago by cpad0112 21k
