Question: extracting a file from a bigger file
Fereshteh2.7k wrote:

Hi,

I have two files

TF      gene    interaction
AATF    BAK1    Unknown
AATF    BAX     Repression
AATF    MYC     Activation

And:

TF      gene
KAT5    KLRC4-KLRK1
EGF     ABCB6
ETS1    CDKN2A

How can I extract the rows of the small file from the bigger one, which contains the interaction types?

I tried join, intersect and match, but I can only match on gene or on TF, not on both simultaneously.

With the code below, only the TFs are matched between the two files:

df3 <- merge( dat1, dat2, by.x = "TF", by.y = "gene" )
R software error • 384 views
modified 10 weeks ago by Ram12k • written 10 weeks ago by Fereshteh2.7k

There are no common entities (genes or TFs) between the two data frames (dat1 and dat2) in the example above, so the output would be empty.

modified 10 weeks ago • written 10 weeks ago by cpad01122.3k

But there are common TFs and genes; I checked with a Venn diagram :(

written 10 weeks ago by Fereshteh2.7k

Are there common genes/symbols between the "gene" column of dat1 and the "TF" column of dat2?

modified 10 weeks ago • written 10 weeks ago by cpad01122.3k

I created an example dataset from the lists above (dat1 is list1 and dat2 is list2). Since there are no common entries between list1 and list2, I copied the very first row of list1 (dat1) into list2 (dat2).

Following is the code:

> library(dplyr)

> list1
    TF gene interaction
1 AATF BAK1     Unknown
2 AATF  BAX  Repression
3 AATF  MYC  Activation

> list2
    TF        gene
1  EGF       ABCB6
2 ETS1      CDKN2A
3 KAT5 KLRC4-KLRK1
4 AATF        BAK1

Find the entries common to both files, matching on both gene and TF:

> inner_join(list1,list2)
Joining, by = c("TF", "gene")
    TF gene interaction
1 AATF BAK1     Unknown

modified 10 weeks ago • written 10 weeks ago by cpad01122.3k

Yes, there are TFs in common with the genes too.

written 10 weeks ago by Fereshteh2.7k

You are joining dat1 (df1) and dat2 (df2) not by the same columns (i.e. genes with genes, TFs with TFs), but by different columns (genes from df2 with TFs from df1). Is that intentional?

written 10 weeks ago by cpad01122.3k

Exactly, that is what I mean. Thank you for your kind efforts; I will try your solutions.

I want to extract the interactions from my bigger file for the gene and TF pairs that also exist in my small file.

modified 10 weeks ago • written 10 weeks ago by Fereshteh2.7k

I guess there is a small confusion here. From the command you posted, my understanding was that you want to match the genes of one file against the TFs of the other. But from your comment above, what you want is different: there are two files, one small and one big, and both have gene and TF columns. You just want to extract information from the big file (presumably including the interaction column) using the entries of the small file, matching on both gene and TF.

Which one is correct? If it is the latter, then the code is different, and I have updated it.

modified 10 weeks ago • written 10 weeks ago by cpad01122.3k

Exactly, your second assumption is correct. Ram's solution gave me this:

https://ibb.co/ffiPUv

Thank you both. When I merged first on TFs and then on genes, one at a time, I got different interactions.

written 10 weeks ago by Fereshteh2.7k
Ram12k wrote:

You wish to use 2 "columns" to join your dataset, correct? Given that your datasets have identical colnames for those columns, substitute your by.x="TF",by.y="gene" with by=c("TF","gene") (which is a short version of by.x=c("TF","gene"),by.y=c("TF","gene")) - this will merge using matching values from both columns.
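
A minimal sketch of this suggestion in base R, assuming dat1 and dat2 are data frames built from the example rows in the question. The AATF/BAK1 row in dat2 is an assumption added only so that one pair matches; the names `both` and `tf_only` are illustrative.

```r
# Example data frames rebuilt from the rows shown in the question.
# Assumption: the AATF/BAK1 row is added to dat2 so that one pair matches.
dat1 <- data.frame(
  TF          = c("AATF", "AATF", "AATF"),
  gene        = c("BAK1", "BAX", "MYC"),
  interaction = c("Unknown", "Repression", "Activation"),
  stringsAsFactors = FALSE
)
dat2 <- data.frame(
  TF   = c("KAT5", "EGF", "ETS1", "AATF"),
  gene = c("KLRC4-KLRK1", "ABCB6", "CDKN2A", "BAK1"),
  stringsAsFactors = FALSE
)

# Keep only the rows of dat1 whose (TF, gene) pair also occurs in dat2.
both <- merge(dat1, dat2, by = c("TF", "gene"))

# For comparison: matching on TF alone keeps every AATF row of dat1,
# whatever its gene, so it returns more rows than the two-column merge.
tf_only <- merge(dat1, dat2, by = "TF")

print(both)
print(nrow(tf_only))
```

With this toy data the two-column merge keeps a single row, while the TF-only merge keeps all three AATF rows. This is likely why merging on one column at a time gives different interactions than merging on both columns at once.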

modified 10 weeks ago • written 10 weeks ago by Ram12k