Question: Remove redundant name in a list
4.3 years ago by yasjas (70), United Kingdom, wrote:

[[1]]

   rep_name rep_family gene_name Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2
HERVL40-int        LTR    ANKIB1         7.2268         7.3056   7.2132   7.5750
HERVL40-int        LTR    ANKIB1         7.2268         7.3056   7.2132   7.5750
HERVL40-int        LTR    ANKIB1         7.2268         7.3056   7.2132   7.5750
HERVL40-int        LTR    ANKIB1         7.2268         7.3056   7.2132   7.5750
HERVL40-int        LTR   PRKAR2B         6.4382         2.2347   7.6774   6.6859
HERVL40-int        LTR     REV3L         6.6961         6.4858   4.1992   4.7723
HERVL40-int        LTR     POMT2         5.6758         5.7517   5.8600   6.1739

[[2]]

    rep_name rep_family gene_name Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR      ODZ1         5.6166         2.8973   1.5077   0.5965
HUERS-P3-int        LTR     CNTN1         2.2008         1.1640   1.3469   1.6292
HUERS-P3-int        LTR     CNTN1         2.2008         1.1640   1.3469   1.6292
HUERS-P3-int        LTR     CNTN1         2.2008         1.1640   1.3469   1.6292
HUERS-P3-int        LTR     CNTN1         2.2008         1.1640   1.3469   1.6292
HUERS-P3-int        LTR     CNTN1         2.2008         1.1640   1.3469   1.6292
HUERS-P3-int        LTR     MIPEP         4.2390         5.4311   7.9192   5.7850

Hello guys,

I have these lists and I want to keep only the lines that are not repeated. For example, some of the gene names appear more than once, and I want to count each of them only once.

Does anyone know how I can keep each gene name only once and remove the duplicates from a list?

Thanks for any suggestions.

Never mind, sorry, that was a stupid question. Please ignore it.

written 4.3 years ago by yasjas (70)

Google would have helped if you had spent some time on it. The first Google hit is http://stackoverflow.com/questions/13967063/remove-duplicate-rows-in-r. You can also find related posts here on Biostars, for example "Removing Duplicate Rows (Gene Names)".

written 4.3 years ago by venu (6.3k)
4.3 years ago by michael.ante (3.5k), Austria/Vienna, wrote:

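Hi,

A quick and dirty approach is to use the unique() function on the gene names and loop over them, checking the number of occurrences. Note that this keeps only the gene names that occur exactly once; a gene that appears several times is dropped entirely: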

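# x is a single data frame, i.e. one element of the list
res <- NULL
u <- unique(x$gene_name)
for (i in u) {
  rows <- which(x$gene_name == i)
  # keep the row only if this gene name occurs exactly once
  if (length(rows) == 1) res <- rbind(res, x[rows, ])
}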

 

It is better to avoid loops in R, though; use the approach from the comment "C: Remove redundant name in a list" instead.
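
As a rough sketch of a loop-free route in base R: duplicated() flags every repeat of a value after its first occurrence, so negating it keeps one row per gene name. The toy data frame and the names dedup_genes, singletons_only, lst, and lst_dedup are made up for illustration; only the column names and values are taken from the question.

```r
# Toy data with the same structure as the question (a subset of columns/rows).
x <- data.frame(
  rep_name  = rep("HERVL40-int", 5),
  gene_name = c("ANKIB1", "ANKIB1", "PRKAR2B", "REV3L", "POMT2"),
  Huh.7_B1  = c(7.2132, 7.2132, 7.6774, 4.1992, 5.8600),
  stringsAsFactors = FALSE
)

# Keep the first row for each gene name (one row per gene).
dedup_genes <- function(df) df[!duplicated(df$gene_name), , drop = FALSE]

# Keep only genes that occur exactly once (what the loop above does).
singletons_only <- function(df) {
  counts <- table(df$gene_name)
  df[df$gene_name %in% names(counts)[counts == 1], , drop = FALSE]
}

one_per_gene <- dedup_genes(x)
only_unique  <- singletons_only(x)

# The question has a list of data frames; lapply() applies the same step to each.
lst <- list(x, x)  # hypothetical stand-in for the real list
lst_dedup <- lapply(lst, dedup_genes)
```

Which of the two helpers fits depends on the goal: dedup_genes() counts every gene once, whereas singletons_only() discards any gene that was ever repeated.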
