funMotifs is a web-based tool designed for annotating noncoding variants and genomic regions.
We have collected ChIP-seq, DNase-I, and other assays from ENCODE, FANTOM, Roadmap Epigenomics, and other data sources, and used these data to annotate motifs of 510 transcription factors in 14 tissue types.
You can upload a list of variants or genomic coordinates, and the tool will report the overlapping TF motifs and their annotations in a selected tissue type. A typical use case is to annotate mutations from a certain cancer type with annotations from the corresponding tissue type. To summarize the annotations, we have applied a logistic regression model that assigns each motif a score, which enables prioritization of the variants and motifs.
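As a rough illustration of the idea only (this is not the funMotifs code), the sketch below overlaps variant positions with motif intervals and combines each motif's annotations into a single score with a logistic function. The motif records, the annotation names (`chipseq`, `dnase`), the weights, and the intercept are all made up for the example; the actual model's features and coefficients are not described in this post.

```python
import math
from bisect import bisect_right

# Hypothetical motif records: (chrom, start, end, TF name, annotations).
# Coordinates are 0-based, half-open; every value here is invented.
MOTIFS = [
    ("chr1", 100, 112, "CTCF", {"chipseq": 1, "dnase": 1}),
    ("chr1", 150, 160, "SP1", {"chipseq": 0, "dnase": 1}),
    ("chr2", 40, 52, "FOXA1", {"chipseq": 0, "dnase": 0}),
]

# Made-up coefficients standing in for a fitted logistic regression model.
WEIGHTS = {"chipseq": 2.0, "dnase": 1.0}
INTERCEPT = -1.5


def build_index(motifs):
    """Group motifs by chromosome, sorted by start, with a parallel list of starts."""
    by_chrom = {}
    for motif in motifs:
        by_chrom.setdefault(motif[0], []).append(motif)
    index = {}
    for chrom, items in by_chrom.items():
        items.sort(key=lambda m: m[1])
        index[chrom] = ([m[1] for m in items], items)
    return index


def overlapping_motifs(index, chrom, pos):
    """Return motifs on `chrom` whose [start, end) interval contains `pos`."""
    starts, items = index.get(chrom, ([], []))
    # Only motifs starting at or before `pos` can contain it.
    upper = bisect_right(starts, pos)
    return [m for m in items[:upper] if m[2] > pos]


def logistic_score(annotations):
    """Combine annotation values into a score between 0 and 1."""
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in annotations.items())
    return 1.0 / (1.0 + math.exp(-z))


def annotate(index, variants):
    """Report (chrom, pos, TF, score) for each variant/motif overlap, best score first."""
    hits = []
    for chrom, pos in variants:
        for _, _, _, tf, annotations in overlapping_motifs(index, chrom, pos):
            hits.append((chrom, pos, tf, logistic_score(annotations)))
    return sorted(hits, key=lambda hit: hit[3], reverse=True)


if __name__ == "__main__":
    variants = [("chr1", 105), ("chr1", 155), ("chr2", 10)]
    for hit in annotate(build_index(MOTIFS), variants):
        print(hit)
```

With the toy data above, the variant inside the CTCF motif ranks first because both of its invented annotations are present, and the chr2 variant is dropped because it overlaps no motif.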
The tool is open source and the pipeline is implemented in Python (source on GitHub). PostgreSQL is used to store and index the data, which allows for very quick annotation retrieval.
The pipeline can also be run on a local computer to regenerate the annotations, so that larger sets of variants can be annotated through a programming interface. Your comments and suggestions are appreciated.
Here is a link to the website: http://bioinf.icm.uu.se/funmotifs/
Hi husensofteng, please allow me to make a couple of comments that are not meant to be impolite but rather to improve the overall appearance of the linked content:
"HiC contacting domains from GTEx and ENCODE" is misleading because GTEx does not contain Hi-C data (to the best of my knowledge, maybe I am wrong), and the linked paper (the link is broken) refers to Rao 2014, which I think is not associated with GTEx but rather with ENCODE.
Again, please consider these comments a suggestion to improve your project. The thing is that there are already a number of tools to predict the impact of noncoding variants, e.g. FunSeq2, and people need motivation to choose your approach over the other available ones. Currently, due to the above points, I would be reluctant to choose your tool, especially because of the lack of hg38 support (correct me if you do support this version).
Thank you so much ATpoint for these valuable points. I have gone through them all and I have tried to fix the errors. I just learned that it's possible to do spell corrections directly in vim (:set spell spelllang=en_us) :)
We hope to rerun the pipeline on hg38 and to incorporate the more recent datasets that have been generated, as well as ATAC-seq experiments. This would give us the chance to add more tissue types too.
I think the main motivation for using our tool, in comparison to the pre-existing ones, is the tissue-type-specific scoring system, which provides a summary score for over 80 million motifs across 15 tissue types. I will get back to you as soon as I have some news on hg38 support and the inclusion of more assay types.
I would be glad to receive your further suggestions and thanks for the comments.