Question: Replace symbols in R table
20 months ago, Inquisitive8995 wrote:

Hello,

I have a table in R with 70k rows and 37 columns. Many cells contain "./.", which I want to replace with "ab". I tried gsub(), but it does not give me the required output.

I used :

file <- gsub("./.","ab",file)

I want the change to happen throughout the file. Is there any other way with which I can modify it? Thanks in advance.

Example input:

S.no chr pos  gene_name S1  S2  S3
1    1   1290 X         ./. 1/1 ./.
2    1   5822 Y         0/1 ./. ./.

Expected output:

S.no chr pos  gene_name S1  S2  S3
1    1   1290 X         ab  1/1 ab
2    1   5822 Y         0/1 ab  ab

The replacement can be either "ab" or NA.

Tags: gsub, R, vcf
20 months ago, zx8754 (London) wrote:

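Use a fixed (literal) match, because in a regular expression "." matches any character, so "./." also matches "1/1" and "0/1". Also apply gsub() column by column: called on a whole data frame, it collapses each column into a single string. Note that this converts every column to character:

file[] <- lapply(file, function(x) gsub("./.", "ab", x, fixed = TRUE))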

Or

file[ file == "./." ] <- "ab"
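
To see why the call in the question misbehaves, here is a minimal sketch on made-up data (the names `gt` and `df_demo` are hypothetical, chosen only for illustration). It compares a regex match with a literal match and shows what gsub() returns when it is handed a whole data frame.

```r
# Hypothetical genotype vector, for illustration only
gt <- c("./.", "1/1", "0/1")

# Regex match: "." is a wildcard, so every element is replaced
regex_result <- gsub("./.", "ab", gt)

# Literal match: only the exact string "./." is replaced
fixed_result <- gsub("./.", "ab", gt, fixed = TRUE)

print(regex_result)
print(fixed_result)

# Passing a data frame: gsub() coerces it with as.character(),
# which yields one deparsed string per column, not a data frame
df_demo <- data.frame(S1 = c("./.", "0/1"), S2 = c("1/1", "./."),
                      stringsAsFactors = FALSE)
flattened <- gsub("./.", "ab", df_demo, fixed = TRUE)
print(flattened)
```

The last result has one element per column, which is why the table structure is lost when gsub() is applied directly to the data frame.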

Edit: Using example data provided by OP.

# example input data
df1 <- read.table(text = "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   ./.", header = TRUE, stringsAsFactors = FALSE)

df1
#   S.no chr  pos gene_name  S1  S2  S3
# 1    1   1 1290         X ./. 1/1 ./.
# 2    2   1 5822         Y 0/1 ./. ./.

df1[, c("S1", "S2", "S3")][ df1[, c("S1", "S2", "S3")] == "./." ] <- "ab"
df1
#   S.no chr  pos gene_name  S1  S2 S3
# 1    1   1 1290         X  ab 1/1 ab
# 2    2   1 5822         Y 0/1  ab ab
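
Since the question says NA is also acceptable, one possible alternative is to treat "./." as missing. The sketch below shows two ways, with the example data inlined: the `na.strings` argument of read.table() at read time, and logical indexing on a table that is already loaded. The names `txt`, `df_na` and `df_na2` are arbitrary.

```r
txt <- "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   ./."

# Option 1: mark "./." as NA while reading
# (a column that is entirely "./.", like S3 here, is read as logical NA)
df_na <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE,
                    na.strings = "./.")
print(df_na)

# Option 2: replace after reading
df_na2 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
df_na2[df_na2 == "./."] <- NA
print(df_na2)
```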

Thanks for your answer. The second one worked, but it does not show "ab"; instead the cells are blank or contain NA.

Reply written 20 months ago by Inquisitive8995
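
NA cells like the ones described in this comment are what R produces when the columns are factors (the default for read.table() and read.csv() before R 4.0.0) and the replacement value is not an existing factor level. This is only a guess at the cause, since the original data is not shown. A minimal sketch with a made-up data frame `df_f`, converting factors to character before replacing:

```r
# Hypothetical data with factor columns, mimicking stringsAsFactors = TRUE
df_f <- data.frame(S1 = c("./.", "0/1"), S2 = c("1/1", "./."),
                   stringsAsFactors = TRUE)

broken <- df_f
# "ab" is not a level of these factors, so R warns and inserts NA
suppressWarnings(broken[broken == "./."] <- "ab")
print(broken)

repaired <- df_f
# Convert factor columns to character first, then replace
repaired[] <- lapply(repaired, function(x) if (is.factor(x)) as.character(x) else x)
repaired[repaired == "./."] <- "ab"
print(repaired)
```

Reading the file with stringsAsFactors = FALSE avoids the conversion step.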

Please provide reproducible example input data and the expected output.

Reply written 20 months ago by zx8754

I have edited my question with an example input and output. Thanks

Reply written 20 months ago by Inquisitive8995

Edited my answer, see if it works.

Reply written 20 months ago by zx8754

20 months ago, cpad0112 (India) wrote:

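library(stringr)
library(dplyr)  # across() requires dplyr >= 1.0.0

df1 <- read.csv("test.txt", header = TRUE, strip.white = TRUE, stringsAsFactors = FALSE, sep = "\t")

# replace ./. and .|. in every column (all columns become character)
df1 %>% mutate(across(everything(), ~ str_replace_all(.x, "\\.[/|]\\.", "ab")))
#   S.no chr  pos gene_name  S1  S2 S3
# 1    1   1 1290         X  ab 1/1 ab
# 2    2   1 5822         Y 0/1  ab ab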

You can also use the apply function, which returns a character matrix:

library(stringr)
apply(df1, 2, function(x) str_replace_all(x, "\\.[/|]\\.", "ab"))
#      S.no chr pos    gene_name S1    S2    S3  
# [1,] "1"  "1" "1290" "X"       "ab"  "1/1" "ab"
# [2,] "2"  "1" "5822" "Y"       "0/1" "ab"  "ab"

This replaces both ./. and .|. with "ab". test.txt is the OP's input text.
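
For comparison, a base R sketch of the same idea that touches only the genotype columns, so S.no, chr and pos keep their numeric type. Selecting columns by the pattern "S" followed by digits is an assumption based on the example header. The data is inlined rather than read from test.txt, and one cell was changed to .|. to exercise both forms.

```r
# Example data inlined; the last cell is changed to ".|." for illustration
df1 <- read.table(text = "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   .|.",
  header = TRUE, stringsAsFactors = FALSE)

# Assumed naming scheme: sample columns are S1, S2, S3, ...
sample_cols <- grep("^S[0-9]+$", names(df1), value = TRUE)

# Anchored pattern: the whole cell must be "./." or ".|."
df1[sample_cols] <- lapply(df1[sample_cols],
                           function(x) sub("^\\.[/|]\\.$", "ab", x))
print(df1)
```

Anchoring with ^ and $ keeps the replacement from touching longer strings that merely contain ./. somewhere inside them.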


I want only the ./. to be modified, not the 1/1.

Reply written 20 months ago by Inquisitive8995

Oops, that was a typo. The pattern now replaces only ./. and .|. and leaves every other value untouched. I have edited my post. @Inquisitive8995

Reply written 20 months ago by cpad0112

Moved your post to an answer. You might want to clean up your comments above, or edit them into this post.

Reply written 20 months ago by zx8754

Thanks zx8754

Reply written 20 months ago by cpad0112