Question: Replace symbols in R table
Inquisitive8995 wrote (8 months ago):

Hello,

I have a table in R with 70k rows and 37 columns. Many cells contain "./.", which I want to replace with "ab". I tried gsub(), but it does not give the required output.

I used:

file <- gsub("./.","ab",file)

I want the change to apply throughout the table. Is there another way to do this? Thanks in advance.

Example input:

S.no  chr  pos   gene_name  S1   S2   S3
1     1    1290  X          ./.  1/1  ./.
2     1    5822  Y          0/1  ./.  ./.

Expected output:

S.no  chr  pos   gene_name  S1   S2   S3
1     1    1290  X          ab   1/1  ab
2     1    5822  Y          0/1  ab   ab

The replacement can be either "ab" or NA.

Tags: gsub, R, vcf
zx8754 wrote (8 months ago):

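Try a fixed (literal) match. In a regular expression "." matches any single character, so the pattern "./." also matches values such as "1/1". gsub() works on character vectors, so apply it column by column to keep the data frame structure (every column comes back as character):

file[] <- lapply(file, gsub, pattern = "./.", replacement = "ab", fixed = TRUE)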

Or

file[ file == "./." ] <- "ab"
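
To see the difference between the two kinds of match, here is a minimal sketch on a small character vector. The vector `x` is made up for illustration and is not the OP's data.

```r
# hypothetical genotype values, for illustration only
x <- c("./.", "1/1", "0/1")

# regex match: "." matches any single character, so every value is replaced
regex_result <- gsub("./.", "ab", x)

# literal match: only the exact string "./." is replaced
fixed_result <- gsub("./.", "ab", x, fixed = TRUE)

print(regex_result)  # "ab" "ab" "ab"
print(fixed_result)  # "ab" "1/1" "0/1"
```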

Edit: Using example data provided by OP.

# example input data
df1 <- read.table(text = "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   ./.", header = TRUE, stringsAsFactors = FALSE)

# before replacement
df1
#   S.no chr  pos gene_name  S1  S2  S3
# 1    1   1 1290         X ./. 1/1 ./.
# 2    2   1 5822         Y 0/1 ./. ./.

# replace exact "./." values in the sample columns only
df1[, c("S1", "S2", "S3")][ df1[, c("S1", "S2", "S3")] == "./." ] <- "ab"
df1
#   S.no chr  pos gene_name  S1  S2 S3
# 1    1   1 1290         X  ab 1/1 ab
# 2    2   1 5822         Y 0/1  ab ab
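
Since the question also allows NA instead of "ab", a possible variant is sketched below. It reuses the example text from the question; the names `txt`, `df2` and `df3` are only illustrative. Either assign NA to the matching cells after reading, or let read.table() do the conversion at import time through its na.strings argument.

```r
txt <- "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   ./."

# option 1: read as-is, then set cells that are exactly "./." to NA
df2 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
df2[df2 == "./."] <- NA

# option 2: treat "./." as missing while reading
# (a column that is entirely "./.", like S3 here, is then read as logical NA)
df3 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE,
                  na.strings = "./.")
```

Option 2 avoids a second pass over a 70k-row table, but it only helps when the file is read with read.table() or one of its wrappers.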

Inquisitive8995 replied: Thanks for your answer. The second one worked, but it does not show "ab"; instead the cells are blank or contain NA.

zx8754 replied: Please provide reproducible example input data and the expected output.

Inquisitive8995 replied: I have edited my question with an example input and output. Thanks.

zx8754 replied: Edited my answer, see if it works.

cpad0112 wrote (8 months ago):

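> library(stringr)
> library(dplyr)
> # test.txt is the OP's input saved as a tab-separated file
> df1 <- read.csv("test.txt", header = TRUE, strip.white = TRUE, stringsAsFactors = FALSE, sep = "\t")
> # replace "./." and ".|." in every column; funs() is deprecated, so use a ~ lambda
> df1 %>% mutate_all(~ str_replace_all(., "\\.[/|]\\.", "ab"))
  S.no chr  pos gene_name  S1  S2 S3
1    1   1 1290         X  ab 1/1 ab
2    2   1 5822         Y 0/1  ab ab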

You can also use apply(), which returns a character matrix instead of a data frame:

> library(stringr)
> apply(df1, 2, function(x) str_replace_all(x, "\\.[/|]\\.", "ab"))
     S.no chr pos    gene_name S1    S2    S3  
[1,] "1"  "1" "1290" "X"       "ab"  "1/1" "ab"
[2,] "2"  "1" "5822" "Y"       "0/1" "ab"  "ab"

This replaces both ./. and .|. and leaves other genotypes such as 1/1 untouched. test.txt is the OP's input text.
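
If loading stringr and dplyr is not an option, roughly the same replacement can be sketched in base R. The version below is an assumption about what is wanted, not part of the answer above: it touches only character columns, so numeric columns such as pos keep their type, and it anchors the pattern with ^ and $ so that only whole-cell matches are replaced. The data frame `geno` and the helper `replace_missing` are made-up names for illustration.

```r
# illustrative data; in practice this would be the table read from test.txt
geno <- data.frame(
  pos = c(1290L, 5822L),
  S1  = c("./.", "0/1"),
  S2  = c("1/1", ".|."),
  stringsAsFactors = FALSE
)

# hypothetical helper: replace whole-cell "./." or ".|." in character columns
replace_missing <- function(df, replacement = "ab") {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(col) {
    sub("^\\.[/|]\\.$", replacement, col)
  })
  df
}

result <- replace_missing(geno)
print(result)
```

Passing replacement = NA_character_ should give the NA variant the question also mentions.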


Inquisitive8995 replied: I want only the ./. to be modified, not the 1/1.

cpad0112 replied: Oops, typo. It doesn't replace anything other than the missing genotypes (./. or .|.). I have edited the post, Inquisitive8995.

zx8754 replied: Moved your post to an answer. You might want to clean up your comments above, or edit them into this post.

cpad0112 replied: Thanks zx8754