Question: What do duplicated interactions look like in HPRD data?
5 months ago, aouichechaimaa wrote:

Hi guys, I have downloaded HPRD Release 9, which has 39240 interactions, and I want to delete the self-interactions and duplicated interactions programmatically, but I don't know what the duplicated interactions look like in this data.

By self-interactions I mean entries like this: FES FES. But what do duplicated interactions look like? I'd appreciate any help!

5 months ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

The way to go is to map all identifiers used by HPRD to the same annotated reference genome, e.g. Ensembl, to make sure that each protein has the same ID throughout the data, then look for multiple occurrences of ID1-ID2 and ID2-ID1. Note that HPRD data is >7 years old and some of the identifiers used may be obsolete.

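As a minimal sketch of the ID1-ID2 / ID2-ID1 check described in the answer above, the snippet below drops self-interactions and treats a pair as the same interaction regardless of order. It assumes the identifiers have already been mapped to one consistent ID scheme; the function name `clean_interactions` and the example pairs are hypothetical, not taken from HPRD.

```python
def clean_interactions(pairs):
    """Drop self-interactions and duplicates, treating A-B and B-A as one pair."""
    seen = set()
    cleaned = []
    for a, b in pairs:
        if a == b:
            continue  # self-interaction such as FES-FES
        key = tuple(sorted((a, b)))  # order-independent key for the pair
        if key in seen:
            continue  # already kept this pair in one orientation or the other
        seen.add(key)
        cleaned.append(key)
    return cleaned


# Hypothetical input: one self-interaction, one reversed duplicate, one exact duplicate.
example = [("FES", "FES"), ("A", "B"), ("B", "A"), ("A", "B"), ("A", "C")]
print(clean_interactions(example))
```

Sorting each pair before storing it means only one lookup per interaction is needed, and the first occurrence of each pair is the one that is kept.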

@Jean-Karim Heriche Does HPRD Release 9 contain duplicated interactions of that form, like {'A' 'B'} and {'B' 'A'}?

written 5 months ago by aouichechaimaa

I don't remember and I don't have the data anymore. Anyway, my scripts were always set up to remove duplicates unless I cared about the distinction (e.g. different types of experiments). You don't say what you're trying to do, but if you want a more comprehensive set of human protein interactions, I would suggest using iRefIndex.

written 5 months ago by Jean-Karim Heriche
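Where the distinction between duplicates matters (for example, different types of experiments), one option is to collapse the pairs but keep the evidence attached to each. The sketch below assumes each record is a hypothetical `(protein_a, protein_b, experiment_type)` tuple; the function name and the experiment labels are illustrative only.

```python
from collections import defaultdict


def group_by_pair(records):
    """Map each unordered pair to the set of experiment types reported for it."""
    evidence = defaultdict(set)
    for a, b, experiment in records:
        if a == b:
            continue  # skip self-interactions
        evidence[tuple(sorted((a, b)))].add(experiment)
    return dict(evidence)


# Hypothetical records: the same pair reported by two kinds of experiment.
records = [
    ("A", "B", "in vivo"),
    ("B", "A", "in vitro"),
    ("A", "B", "in vivo"),
    ("C", "C", "in vivo"),
]
print(group_by_pair(records))
```

Each pair then appears once, and the number of distinct experiment types can serve as a rough indication of how well supported the interaction is.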

I have HPRD Release 9 as a text file and I want to remove the duplicated interactions from it.

written 5 months ago by aouichechaimaa

I understood that you have HPRD data and want to remove duplicates. I already answered this: just write your data processing script in such a way that if there are duplicates, it deals with them in the way you want. If you just want to know whether or not there are duplicates, just write a simple script to find out. By "what you're trying to do", I was referring to what biological question you're trying to answer and wondering whether HPRD is the best data set for this.

written 5 months ago by Jean-Karim Heriche
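A "simple script to find out" whether a file contains duplicates could look roughly like the sketch below. It assumes a tab-delimited file with the two interactor identifiers in columns `col_a` and `col_b`; those defaults, and the function name `count_duplicates`, are assumptions to adjust to the actual layout of the HPRD file.

```python
from collections import Counter


def count_duplicates(lines, col_a=0, col_b=1, sep="\t"):
    """Return the unordered pairs that occur more than once, with their counts."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= max(col_a, col_b):
            continue  # skip blank or truncated lines
        pair = tuple(sorted((fields[col_a], fields[col_b])))
        counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n > 1}


# Hypothetical lines standing in for the contents of the text file.
sample = ["A\tB\n", "B\tA\n", "A\tC\n", "FES\tFES\n", "\n"]
duplicates = count_duplicates(sample)
print(duplicates)
```

Because the function only iterates over lines, an open file handle can be passed in place of the `sample` list. An empty result means no pair is repeated in either orientation.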

I want to build a network by linking the different lists of genes I found, based on any human data (it must be human data). Can iRefIndex do this job? Is it human data?

written 5 months ago by aouichechaimaa

iRefIndex is a compilation of several protein-protein interaction databases and so it includes human data. Read the paper to understand how it's done. To get human data only, just filter on the taxon ID in the relevant columns. So if you need to look for interactions involving genes in your lists, then you're better off using iRefIndex (or any other compilation of multiple data sources) than just a single (outdated) data source.

To access the iRefIndex data, you can also use the iRefR package for R, and there's a plug-in for Cytoscape 2.8. Finally, there's also a web interface at iRefWeb.

written 5 months ago by Jean-Karim Heriche
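The taxon-ID filtering suggested above can be sketched as follows. This assumes a tab-delimited file in the PSI-MITAB 2.5 column order, where the taxon IDs of interactors A and B are the 10th and 11th columns (0-based indices 9 and 10) and look like `taxid:9606(Homo sapiens)`. Whether a given iRefIndex download uses exactly this layout is an assumption, so check the header line first. The names `taxon_of`, `human_only` and `make_row` are hypothetical helpers.

```python
HUMAN_TAXID = "9606"  # NCBI taxonomy ID for Homo sapiens


def taxon_of(field):
    """Extract the numeric ID from a value such as 'taxid:9606(Homo sapiens)'."""
    value = field.split(":", 1)[-1]
    return value.split("(", 1)[0].strip()


def human_only(lines, tax_col_a=9, tax_col_b=10, sep="\t"):
    """Yield the fields of rows where both interactors are annotated as human."""
    for line in lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= max(tax_col_a, tax_col_b):
            continue  # skip blank or truncated lines
        if (taxon_of(fields[tax_col_a]) == HUMAN_TAXID
                and taxon_of(fields[tax_col_b]) == HUMAN_TAXID):
            yield fields


def make_row(a, b, tax_a, tax_b):
    """Build a hypothetical 11-column row for the demonstration below."""
    fields = ["-"] * 11
    fields[0], fields[1], fields[9], fields[10] = a, b, tax_a, tax_b
    return "\t".join(fields)


sample = [
    "#header line",
    make_row("P1", "P2", "taxid:9606(Homo sapiens)", "taxid:9606(Homo sapiens)"),
    make_row("P3", "P4", "taxid:9606(Homo sapiens)", "taxid:10090(Mus musculus)"),
]
kept = [(f[0], f[1]) for f in human_only(sample)]
print(kept)
```

Requiring both taxon columns to match keeps only human-human interactions. Relaxing the condition to either column would also retain cross-species pairs.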