Retrieve transcription factors from gene set using biomaRt
20 months ago

I have a data frame meth with genes (HGNC symbols) as row names and samples as column names. I want to find which of the genes in the row names are transcription factors, using biomaRt in R. The result should be returned as a character vector.

Example:

> rownames(meth)
   [1] "A1BG"            "A1CF"            "A2BP1"           "A2LD1"          
   [5] "A2M"             "A2ML1"           "A4GALT"          "AAAS"   

If A1BG, A2BP1, and A2LD1 are transcription factors, the result should be this vector:

[1] "A1BG"  "A2BP1" "A2LD1"

On the BioMart website I can choose, for example, Database: Ensembl Regulation 107 and Dataset: Human Regulatory Features.

But I want to find the TFs using R code.

My preliminary attempt did not filter for transcription factors.

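# Biomart query (annotates the genes but does not yet filter for transcription factors)
library(biomaRt)

if (interactive()) {
  mart <- useEnsembl(biomart = "ensembl",
                     dataset = "hsapiens_gene_ensembl")
  # "p_value" is dropped here: it does not appear to be an attribute of the gene dataset.
  # filters must be set, otherwise values is not used to restrict the query.
  annot <- getBM(attributes = c("ensembl_gene_id", "hgnc_symbol", "entrezgene_id"),
                 filters = "hgnc_symbol",
                 values = rownames(meth),
                 mart = mart)
}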

Hi, you could download the list of human TFs from this website: http://humantfs.ccbr.utoronto.ca/download.php

(This TF list is part of this Cell review: https://www.sciencedirect.com/science/article/pii/S0092867418301065?via%3Dihub)

Then you can check which of these TFs match the row names of your data frame, using an R function such as inner_join from the dplyr package.
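The matching step can be sketched in base R roughly as follows. This is only an illustration under assumptions: the TF list is taken to be a plain text file with one HGNC symbol per line (the real download may be laid out differently), the temporary file stands in for the downloaded one, and the symbols marked as TFs simply mirror the hypothetical example from the question rather than real annotations.

```r
# Toy stand-in for the data frame from the question:
# genes (HGNC symbols) as row names, samples as columns.
genes <- c("A1BG", "A1CF", "A2BP1", "A2LD1", "A2M", "A2ML1", "A4GALT", "AAAS")
meth <- data.frame(
  sample1 = seq(0.1, 0.8, by = 0.1),
  sample2 = seq(0.8, 0.1, by = -0.1),
  row.names = genes
)

# Hypothetical TF list file (assumption: one symbol per line).
# In practice this would be the file downloaded from the TF database.
# The symbols below are placeholders taken from the question's example.
tf_file <- tempfile(fileext = ".txt")
writeLines(c("A1BG", "A2BP1", "A2LD1", "NOT_IN_METH"), tf_file)

# Read the symbols, strip stray whitespace, and drop duplicates.
tf_symbols <- unique(trimws(readLines(tf_file)))

# Keep the row names that occur in the TF list (order of meth is preserved).
tfs_in_meth <- rownames(meth)[rownames(meth) %in% tf_symbols]
print(tfs_in_meth)
# [1] "A1BG"  "A2BP1" "A2LD1"

# The same vector can be used to subset the data frame to TF rows only.
meth_tfs <- meth[tfs_in_meth, , drop = FALSE]
```

Using %in% (or intersect) returns a plain character vector directly, which is what the question asks for; inner_join would need the row names moved into a column first.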

20 months ago
ATpoint 82k

Use a dedicated database rather than reinventing the wheel. We usually use http://bioinfo.life.hust.edu.cn/AnimalTFDB/#!/download
