Merging datasets from different arrays - need quick way to generate the intersection of intersections
8.9 years ago
devenvyas ▴ 740

I have downloaded a number of Plink SNP datasets from a region of the world I am interested in. These datasets were created by different means than my own, and thus contain different SNPs.

For each of these, I have generated lists of RSids in common between my data and one of the individual datasets (i.e., intersection of my dataset and one other dataset); thus, I have about five or six lists of RSids.

I need a quick and easy way to get the RSids that appear in all five or six of these lists, using R, shell, or Python.

Any suggestions on how to do this?

shell python SNP R • 2.2k views
8.9 years ago
devenvyas ▴ 740

You can ignore this; I figured it out:

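# Read each list of shared RSids (one ID per line, no header)
a <- read.table("Ethiopia.snps", header=FALSE, stringsAsFactors=FALSE)
b <- read.table("GIH.snps", header=FALSE, stringsAsFactors=FALSE)
c <- read.table("LWK.snps", header=FALSE, stringsAsFactors=FALSE)
d <- read.table("MKK.snps", header=FALSE, stringsAsFactors=FALSE)
e <- read.table("NAfrica.snps", header=FALSE, stringsAsFactors=FALSE)

# Keep only the RSids present in every list; Reduce() applies intersect() pairwise
consensus <- Reduce(intersect, list(a$V1, b$V1, c$V1, d$V1, e$V1))
# Write one RSid per line, without row names or a header line
write.table(consensus, file="consensus.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)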
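
Since the question also asked about Python, here is a minimal sketch of the same idea using set intersection. It assumes each file holds one RSid per line (only the first whitespace-separated field is used, mirroring the V1 column above). The function names read_ids, consensus_ids and write_consensus are made up for illustration.

```python
def read_ids(path):
    """Return the set of IDs in a file, taking the first field of each non-empty line."""
    with open(path) as handle:
        return {line.split()[0] for line in handle if line.strip()}


def consensus_ids(paths):
    """Return the IDs present in every file, sorted lexicographically."""
    id_sets = [read_ids(p) for p in paths]
    if not id_sets:
        # No input files: nothing can be shared
        return []
    return sorted(set.intersection(*id_sets))


def write_consensus(paths, out_path):
    """Write the shared IDs to out_path, one per line (hypothetical helper)."""
    with open(out_path, "w") as out:
        for rsid in consensus_ids(paths):
            out.write(rsid + "\n")
```

Calling write_consensus(["Ethiopia.snps", "GIH.snps", "LWK.snps", "MKK.snps", "NAfrica.snps"], "consensus.txt") should produce the same set of RSids as the R code, though the order of the lines may differ because the IDs are sorted as strings here.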