Question: Convert Many Lines Into A Single Line
viniciushs88 (Germany) wrote 6.0 years ago:

I would like to transform this data:

Sample  Genotype  Region
sample1    A      Region1
sample1    B      Region1
sample1    A      Region1
sample2    A      Region1
sample2    A      Region1
sample3    A      Region1
sample4    B      Region1

Into this format:

Sample  Genotype  Region   
sample1    E      Region1
sample2    A      Region1
sample3    A      Region1
sample4    B      Region1

For samples with more than one genotype (e.g., sample1), I want a single line tagged as excluded ("E") in the "Genotype" column. For samples whose genotype is simply repeated across lines (e.g., sample2), I just want those lines merged into one. My list contains many regions (Region1 to Regionx). Is it possible to do this in R? Thanks a lot.

Devon Ryan (Freiburg, Germany) wrote 6.0 years ago:

Given the above in a data.frame called d:

d2 <- unique(d) # Collapse exact duplicate rows, e.g., "sample2"
d2$Genotype <- factor(d2$Genotype, levels=union(levels(factor(d2$Genotype)), "E")) # Add a level "E" to Genotype (also works if Genotype is a character column)
d2[duplicated(d2$Sample), "Genotype"] <- "E" # Label the later rows of multi-genotype samples as "E"
d2 <- d2[!duplicated(d2$Sample, fromLast=TRUE), ] # Keep only the last row per sample, dropping the non-labeled rows that should still be excluded
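
The question mentions many regions, while the lines above look only at Sample. As a rough sketch of one way to handle that case, the following uses aggregate() to group by Sample and Region together, which is a different technique from the duplicated() approach above. The example data, the second region, and the helper name collapse_genotype are made up for illustration.

```r
# Hypothetical example data: the table from the question plus an invented Region2
d <- data.frame(
  Sample   = c("sample1", "sample1", "sample1", "sample2", "sample2",
               "sample3", "sample4", "sample1", "sample1"),
  Genotype = c("A", "B", "A", "A", "A", "A", "B", "B", "B"),
  Region   = c(rep("Region1", 7), rep("Region2", 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the genotype if it is the only one in the group,
# otherwise return "E" (excluded)
collapse_genotype <- function(g) {
  g <- unique(as.character(g))
  if (length(g) == 1) g else "E"
}

# One row per Sample/Region combination
res <- aggregate(Genotype ~ Sample + Region, data = d, FUN = collapse_genotype)

# Restore the original column order and sort by region, then sample
res <- res[order(res$Region, res$Sample), c("Sample", "Genotype", "Region")]
print(res)
```

Because the grouping is done per Sample and Region, a sample that is ambiguous in one region is tagged "E" only there and keeps its genotype in regions where it is consistent.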