Question: Chip/Rna/Dnase/Etc...-Seq: Search For Similar Tracks
7.7 years ago, 9606 (Italy) wrote:

Hi all, I have a curiosity.

Suppose you have a track, let's say generated from a ChIP-seq experiment. Do you think it could be useful to search, among a set of tracks (all generated by ChIP-seq experiments), for tracks that have a similar shape to the given one? And if so, could you please list some very simple biological questions that could be answered by such "similarity searching", as they come to your mind?

Thank you, best regards.

similarity ngs • 1.6k views

Your question is overly generic: "similar tracks" could mean just about anything, and "listing some applications" is also too broad.

written 7.7 years ago by Istvan Albert

Dear Istvan, I edited my question. Do you think it is better now?

written 7.7 years ago by 9606
7.7 years ago, Sukhdeep Singh (Netherlands) wrote:

That's why we call peaks (significant binding sites) and overlap them across the different proteins/TFs of interest to check their similarity.

You can restrict the comparison to a specific region of interest or run it on the whole genome. Tools like Bedtools (see "Compare Multiple Bed Files?"), MACS and, in general, R/Bioconductor are helpful for getting to what you specifically want.
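To make the "overlap the peaks" idea concrete, here is a minimal sketch in plain Python that scores two peak sets by the Jaccard index (shared base pairs divided by base pairs covered by either set). It is similar in spirit to what `bedtools jaccard` reports, but it is only an illustration: the function names and the `(chrom, start, end)` tuple layout are my own assumptions, and for real data you would use Bedtools or Bioconductor rather than this.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    """Number of base pairs covered by a merged interval list."""
    return sum(end - start for start, end in intervals)


def intersect_length(a, b):
    """Base pairs shared by two sorted, merged interval lists."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def group_by_chrom(peaks):
    """Assumed input layout: iterable of (chrom, start, end), BED-style coordinates."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    return {chrom: merge_intervals(ivs) for chrom, ivs in by_chrom.items()}


def jaccard(peaks_a, peaks_b):
    """Jaccard index of two peak sets: shared bp / bp covered by either set."""
    a = group_by_chrom(peaks_a)
    b = group_by_chrom(peaks_b)
    shared = sum(intersect_length(a[c], b[c]) for c in a.keys() & b.keys())
    covered = sum(total_length(v) for v in a.values())
    covered += sum(total_length(v) for v in b.values())
    union = covered - shared
    return shared / union if union else 0.0


# Toy peaks, invented purely for illustration.
peaks_a = [("chr1", 100, 200), ("chr1", 300, 400)]
peaks_b = [("chr1", 150, 250), ("chr2", 0, 50)]
score = jaccard(peaks_a, peaks_b)
```

A score near 1 means the two factors bind almost the same regions, and a score near 0 means they rarely coincide. To "search" a collection of tracks, you could rank every track by its score against the query.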

You can also look at a few papers on this:

A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples

An effective statistical evaluation of ChIPseq dataset similarity

Edit: Now that you have edited the question:

  • If you think the proteins are part of, or potential subunits of, the same complex, their binding sites should mostly overlap.
  • Regarding shape, you cannot always be completely sure because of noise in the data, but you can infer cases such as: proteinA has a normal distribution, while proteinB gives a normal distribution plus a small right skew. That might mean proteinB extends into the gene body (if you are looking at the TSS; it also depends on the genomic locus you take as reference).
  • If you make a composite profile (abundance profile) of a protein, it can tell you where the protein is most abundant (TSS, gene body, TTS, etc.).
  • For deeper analysis, you have to work with the data itself rather than rely only on visual inspection of peak shapes.
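The composite-profile and shape points above can be sketched roughly as follows. This is a toy illustration in plain Python, not an established pipeline: I am assuming the signal is already a list of per-bin coverage values for one chromosome and that the anchors (e.g. TSS positions) are given as bin indices; `composite_profile` and `pearson` are names I made up. Strand is ignored here, which you would need to handle for real genes.

```python
from math import sqrt


def composite_profile(signal, anchors, flank):
    """Average the signal in a window of +/- `flank` bins around each anchor.

    Anchors whose window would run off either end of the signal are skipped.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for pos in anchors:
        lo, hi = pos - flank, pos + flank + 1
        if lo < 0 or hi > len(signal):
            continue
        for k in range(width):
            totals[k] += signal[lo + k]
        used += 1
    if used == 0:
        raise ValueError("no anchor has a complete window")
    return [t / used for t in totals]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Toy coverage values and anchors, invented purely for illustration.
signal_a = [0, 1, 2, 3, 2, 1, 0, 0, 1, 2, 3, 2, 1, 0]  # symmetric around anchors
signal_b = [0, 1, 2, 3, 3, 2, 0, 0, 1, 2, 3, 3, 2, 0]  # extends to the right
anchors = [3, 10]

profile_a = composite_profile(signal_a, anchors, flank=2)
profile_b = composite_profile(signal_b, anchors, flank=2)
similarity = pearson(profile_a, profile_b)
```

Here `profile_a` is symmetric around the anchor while `profile_b` is shifted to the right, so the correlation comes out below 1. Correlation is only one possible similarity measure, and with noisy data it is worth smoothing or using larger bins before comparing shapes.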

Cheers
