6.7 years ago
zizigolu
Hi,
I have two files
TF gene interaction
AATF BAK1 Unknown
AATF BAX Repression
AATF MYC Activation
And:
TF gene
KAT5 KLRC4-KLRK1
EGF ABCB6
ETS1 CDKN2A
How can I extract from the bigger file (the one that contains interaction types) the rows that match the TF-gene pairs in the small file?
I tried join, intersect, and match, but I can only match on gene or on TF, not on both simultaneously.
With the code below, only the TFs are matched between the two files:
df3 <- merge( dat1, dat2, by.x = "TF", by.y = "gene" )
There are no common entities (genes or TFs) between the two data frames (dat1 and dat2) in the example above, so the output would be empty.
But there are common TFs and genes; I checked with a Venn diagram :(
Are there common genes/symbols between "genes" column of dat1 and "TF" of dat2?
I created an example dataset from the lists above (dat1 is list1 and dat2 is list2). Since there are no common entities between list1 and list2, I copied the very first line of list1 (dat1) into list2 (dat2).
Following is the code:
Find out the common entities between the files matching both gene and TF.
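A minimal sketch of matching on both columns at once, assuming the data frames are named dat1 and dat2 as in the thread and that the first row of dat1 has been copied into dat2 as described above. This is one way to do it with base R `merge`, not necessarily the exact code originally posted:

```r
# Example data rebuilt from the rows quoted in the question.
dat1 <- data.frame(
  TF          = c("AATF", "AATF", "AATF"),
  gene        = c("BAK1", "BAX", "MYC"),
  interaction = c("Unknown", "Repression", "Activation"),
  stringsAsFactors = FALSE
)

# Small file; the first row of dat1 is appended so one (TF, gene) pair is shared.
dat2 <- data.frame(
  TF   = c("KAT5", "EGF", "ETS1", "AATF"),
  gene = c("KLRC4-KLRK1", "ABCB6", "CDKN2A", "BAK1"),
  stringsAsFactors = FALSE
)

# Inner join on both key columns: a row of dat1 is kept only if its
# (TF, gene) pair also appears in dat2.
common <- merge(dat1, dat2, by = c("TF", "gene"))
print(common)
```

If the small file can contain the same pair more than once, passing `unique(dat2)` instead of `dat2` avoids duplicated rows in the result.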
Yes, some of the TFs also appear among the genes.
You are joining dat1 (df1) and dat2 (df2) not by the same columns (i.e. genes with genes, TF with TF), but by different columns (genes from df2 with TF from df1). Is that intentional?
Exactly, this is what I mean. Thank you for your kind efforts; I will try your solutions.
I want to extract the interactions from my bigger file whose genes and TFs also exist in my small file.
I guess there is a small confusion here. From the command you posted, my understanding was that you want to match the genes of one file against the TFs of the other. But from your comment above, what you want is different. There are two files, one small and one big, and both have gene and TF columns. You just want to extract information from the big file (presumably including the interaction column) using the entries of the small file, matching on both gene and TF.
Which one is correct? If it is the latter, the code is different, and I have updated it.
Exactly, your second assumption is correct. Ram's solution gave me this:
https://ibb.co/ffiPUv
Thank you both. When I merge first on TFs and then on genes, one at a time, I get different interactions.
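One possible reason for the difference, sketched with toy data: matching the columns one at a time keeps any row whose TF and gene each occur somewhere in the small file, while matching on the pair requires both to come from the same row. The names below (`big`, `small`, TF1, g1, and so on) are made up for illustration and are not from the files in this thread; the one-at-a-time matching is approximated here with `%in%`.

```r
# Toy data with made-up names; not taken from the thread.
big <- data.frame(
  TF          = c("TF1", "TF1", "TF2", "TF2"),
  gene        = c("g1",  "g2",  "g1",  "g2"),
  interaction = c("Activation", "Repression", "Unknown", "Activation"),
  stringsAsFactors = FALSE
)
small <- data.frame(
  TF   = c("TF1", "TF2"),
  gene = c("g1",  "g2"),
  stringsAsFactors = FALSE
)

# Column-by-column filtering: a row survives if its TF occurs anywhere in
# small$TF and its gene occurs anywhere in small$gene, even when that exact
# pair is not a row of the small file.
one_by_one <- big[big$TF %in% small$TF & big$gene %in% small$gene, ]

# Pair-wise filtering: build a combined key so that TF and gene must come
# from the same row of the small file.
key_big   <- paste(big$TF, big$gene, sep = "\t")
key_small <- paste(small$TF, small$gene, sep = "\t")
pairwise  <- big[key_big %in% key_small, ]

print(one_by_one)  # all four rows, including TF1-g2 and TF2-g1
print(pairwise)    # only TF1-g1 and TF2-g2
```

If only the interactions for pairs listed in the small file are wanted, the pair-wise version (or `merge` with `by = c("TF", "gene")`) is the one to use.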