Merging datasets from different arrays - need quick way to generate the intersection of intersections
6.9 years ago
devenvyas ▴ 680

I have downloaded a number of Plink SNP data sets from a region of the world I am interested in. These datasets were created by different means than my own, and thus contain different SNPs.

For each of these, I have generated lists of RSids in common between my data and one of the individual datasets (i.e., intersection of my dataset and one other dataset); thus, I have about five or six lists of RSids.

I need a quick and easy way to get the RSids that appear in all five or six lists, using R, shell, or Python.

Any suggestions on how to do this?

R python shell SNP • 1.9k views
6.9 years ago
devenvyas ▴ 680

You can ignore this; I figured it out:

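# Read each SNP list (one RSid per line, no header)
a <- read.table("Ethiopia.snps", header=FALSE, stringsAsFactors=FALSE)
b <- read.table("GIH.snps", header=FALSE, stringsAsFactors=FALSE)
c <- read.table("LWK.snps", header=FALSE, stringsAsFactors=FALSE)
d <- read.table("MKK.snps", header=FALSE, stringsAsFactors=FALSE)
e <- read.table("NAfrica.snps", header=FALSE, stringsAsFactors=FALSE)

# Keep only the RSids present in all five lists
consensus <- Reduce(intersect, list(a$V1, b$V1, c$V1, d$V1, e$V1))
# Write one RSid per line, without row names or a header line
write.table(consensus, file="consensus.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)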
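
For anyone who would rather do this in Python, here is a minimal sketch of the same idea. It assumes each list is a plain-text file with the RSid in the first whitespace-separated column and no header; the function names `read_ids`, `consensus_ids`, and `write_consensus` are made up for this example.

```python
from pathlib import Path


def read_ids(path):
    """Return the set of first-column tokens from a text file, skipping blank lines."""
    ids = set()
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            ids.add(fields[0])
    return ids


def consensus_ids(paths):
    """Return the IDs that appear in every file listed in `paths`."""
    sets = [read_ids(p) for p in paths]
    # set.intersection needs at least one set, so handle the empty case explicitly
    return set.intersection(*sets) if sets else set()


def write_consensus(paths, out_path):
    """Write the shared IDs to `out_path`, one per line, sorted; return them as a list."""
    ids = sorted(consensus_ids(paths))
    Path(out_path).write_text("".join(f"{rsid}\n" for rsid in ids))
    return ids
```

With the file names from the R code above, the call would be `write_consensus(["Ethiopia.snps", "GIH.snps", "LWK.snps", "MKK.snps", "NAfrica.snps"], "consensus.txt")`. Because the paths are passed as a list, adding a sixth dataset only means adding one more file name.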


