Hi,
I have these lists of genes
> dim(b)
[1] 4866 2
>
How can I extract the genes of the small list (bb) from the big one (b), keeping the same gene order? I mean the order of genes in the extracted result should be the same as in the small file (bb).
The b and bb shown in the OP don't share any genes. What is being asked is not possible when the two data sets have no values in common. Look at the following example and see if this is what is required:
> b=data.frame(genes=paste("gene", sample(10), sep="_"), expn=round(rnorm(10,1,4),2))
> bb=data.frame(genes=paste("gene",sample(5), sep="_"))
> b
genes expn
1 gene_6 -0.78
2 gene_10 4.65
3 gene_9 1.86
4 gene_3 2.39
5 gene_1 0.34
6 gene_2 2.01
7 gene_8 -4.51
8 gene_7 -7.61
9 gene_5 1.50
10 gene_4 3.12
> bb
genes
1 gene_2
2 gene_1
3 gene_3
4 gene_4
5 gene_5
> b[match(bb$genes,b$genes),]
genes expn
6 gene_2 2.01
5 gene_1 0.34
4 gene_3 2.39
10 gene_4 3.12
9 gene_5 1.50
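One thing to watch for with real data: if some genes in bb are missing from b, match() returns NA for them and the subset will contain all-NA rows. A minimal sketch of how that could be handled, assuming you simply want to drop the unmatched genes (the toy values below are made up for illustration):

```r
# Toy data (hypothetical values): bb contains one gene that is absent from b
b  <- data.frame(genes = c("gene_3", "gene_1", "gene_2"),
                 expn  = c(2.39, 0.34, 2.01))
bb <- data.frame(genes = c("gene_2", "gene_9", "gene_1"))

# Positions of bb's genes in b, in bb's order; NA where a gene is not in b
idx <- match(bb$genes, b$genes)
idx

# Drop the unmatched entries; the remaining rows still follow bb's order
res <- b[idx[!is.na(idx)], ]
res
```

If you would rather keep the missing genes as placeholder rows, skip the !is.na() filter and use b[idx, ] directly.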
Add a row number to b, then merge, and finally re-order the merged data frame based on that row number. See this example:
# example input data
b <- read.table(text = "
gene index
DDB_G0295603 0.9922432
DDB_G0295719 0.9917077
DDB_G0292120 0.3333333
DDB_G0282307 0.9876919
DDB_G0269672 0.9862853
DDB_G0269462 0.6666666
DDB_G0284895 0.9853162
DDB_G0274031 0.9803622", header = TRUE, stringsAsFactors = FALSE)
bb <- read.table(text = "
gene
DDB_G0292120
DDB_G0278649
DDB_G0288947
DDB_G0269462
DDB_G0278757
DDB_G0281793", header = TRUE, stringsAsFactors = FALSE)
# add row number
b$myOrder <- seq(nrow(b))
# then merge to get "index" and "myOrder" columns
res <- merge(bb, b, by = "gene")
# and reorder the merged dataframe
res <- res[ order(res$myOrder), ]
res
# gene index myOrder
# 2 DDB_G0292120 0.3333333 3
# 1 DDB_G0269462 0.6666666 6
Try:
b[b$gene %in% bb$gene,]
Thank you, but I need the order. I mean, for example, for gene DDB_G0292120 in bb, I want to know the index based on the file b. Intersection does not give me the indices based on the order of genes. I have a heat map of the bb genes and I want to know their indices based on the file b.
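If the goal is to look up, for each gene in bb (in heat map order), where it sits in b, something along these lines might do it. This is only a sketch: it reuses the example data from the merge answer above, and the names pos, out and row_in_b are made up here. It also assumes "index" could mean either the row position in b or the value of b's index column, so it reports both.

```r
# Example data copied from the merge answer above
b <- data.frame(
  gene  = c("DDB_G0295603", "DDB_G0295719", "DDB_G0292120", "DDB_G0282307",
            "DDB_G0269672", "DDB_G0269462", "DDB_G0284895", "DDB_G0274031"),
  index = c(0.9922432, 0.9917077, 0.3333333, 0.9876919,
            0.9862853, 0.6666666, 0.9853162, 0.9803622),
  stringsAsFactors = FALSE)
bb <- data.frame(
  gene = c("DDB_G0292120", "DDB_G0278649", "DDB_G0288947",
           "DDB_G0269462", "DDB_G0278757", "DDB_G0281793"),
  stringsAsFactors = FALSE)

# Row position of each bb gene within b, in bb's order (NA if not in b)
pos <- match(bb$gene, b$gene)
pos
# [1]  3 NA NA  6 NA NA

# One row per bb gene: its row number in b and its value in b's index column
out <- data.frame(gene = bb$gene,
                  row_in_b = pos,
                  index = b$index[pos],
                  stringsAsFactors = FALSE)
out
```

Unlike %in%, which keeps the rows in b's order, this keeps one entry per bb gene in bb's order, so it should line up with the rows of the heat map.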