Question: Merging datasets from different arrays - need quick way to generate the intersection of intersections
3.8 years ago by
devenvyas570
Stony Brook
devenvyas570 wrote:

I have downloaded a number of Plink SNP datasets from a region of the world I am interested in. These datasets were created by different means than my own, and thus contain different SNPs.

For each of these, I have generated lists of RSids in common between my data and one of the individual datasets (i.e., intersection of my dataset and one other dataset); thus, I have about five or six lists of RSids.

I need a quick and easy way to get the RSids that appear in all five or six of these lists, using R, shell, or Python.

Any suggestions on how to do this?

python snp shell R • 1.4k views
modified 3.8 years ago • written 3.8 years ago by devenvyas570
3.8 years ago by
devenvyas570
Stony Brook
devenvyas570 wrote:

You can ignore this; I figured it out:

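# Read each list of shared RSids (one ID per line, no header)
a <- read.table("Ethiopia.snps", header=FALSE, stringsAsFactors=FALSE)
b <- read.table("GIH.snps", header=FALSE, stringsAsFactors=FALSE)
c <- read.table("LWK.snps", header=FALSE, stringsAsFactors=FALSE)
d <- read.table("MKK.snps", header=FALSE, stringsAsFactors=FALSE)
e <- read.table("NAfrica.snps", header=FALSE, stringsAsFactors=FALSE)

# Keep only the RSids that appear in every one of the five lists
consensus <- Reduce(intersect, list(a$V1, b$V1, c$V1, d$V1, e$V1))
# Write one RSid per line, without row names or a column header
write.table(consensus, file="consensus.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)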
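
Since the question also asked about Python, here is a rough equivalent of the R approach above, in case it is useful to anyone. It is only a sketch: it assumes each .snps file holds one RSid per line (only the first whitespace-separated field is used), and the function names read_ids, consensus_ids and write_consensus are made up for this example rather than taken from any library.

```python
from pathlib import Path


def read_ids(path):
    """Return the set of IDs found in the first column of a text file."""
    ids = set()
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            ids.add(fields[0])
    return ids


def consensus_ids(paths):
    """Return the sorted IDs that are present in every file in `paths`."""
    id_sets = [read_ids(p) for p in paths]
    if not id_sets:
        return []
    return sorted(set.intersection(*id_sets))


def write_consensus(paths, out_path="consensus.txt"):
    """Write the shared IDs to `out_path`, one per line, and return them."""
    shared = consensus_ids(paths)
    Path(out_path).write_text("".join(f"{rsid}\n" for rsid in shared))
    return shared
```

With the file names from the R code, the call would be write_consensus(["Ethiopia.snps", "GIH.snps", "LWK.snps", "MKK.snps", "NAfrica.snps"]). It works for any number of lists, and the output is sorted alphabetically as strings, so the order will differ from what R's intersect returns.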

 

modified 3.8 years ago • written 3.8 years ago by devenvyas570