find TF binding site based on sequence
8 months ago

Hi community,

maybe someone can provide some guidance/insight and help me out.

Our group has identified a set of genes that are potential targets of a known transcription factor and we want to find a binding motif for this TF proximal to those genes. I have tried following this Bioconductor workflow:

Finding Candidate Binding Sites for Known Transcription Factors via Sequence Matching

However, the workflow uses S. cerevisiae data and our data is from human cells, so I ran into the following issue. The workflow contains this line of code:

orfs <- as.character(mget(genes, org.Sc.sgdCOMMON2ORF))

However, org.Hs.eg.db has no "COMMON2ORF" mapping.
I am now wondering what the workaround is to continue the workflow with human data.

I thought I could use org.Hs.egCHRLOC to get the start positions for each gene of interest. However, when I provide a genes vector like the one in the workflow, I always get this error: Error in .checkKeys(value, Lkeys(x), x@ifnotfound): value for "GENE" not found. I think this is due to the presence of multiple transcripts for each gene?

Has anyone seen this problem before and found a solution, or can someone point me toward one? It would be highly appreciated!
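
One possible explanation, assuming the genes vector holds gene symbols: the org.Hs.eg.db maps such as org.Hs.egCHRLOC are keyed by Entrez Gene IDs, so a symbol passed as a key would not be found. A minimal sketch of translating symbols to Entrez IDs first is below. The symbols in `genes` are placeholders rather than our real targets, and whether org.Hs.egCHRLOC is still available depends on the installed org.Hs.eg.db version.

```r
# Sketch only: assumes `genes` contains HGNC gene symbols.
library(AnnotationDbi)   # mapIds(), mget() for Bimap objects
library(org.Hs.eg.db)    # human annotation maps keyed by Entrez Gene ID

genes <- c("TP53", "MYC", "GATA1")   # placeholder symbols for illustration

# Translate symbols to Entrez Gene IDs; multiVals = "first" keeps one ID per symbol.
entrez <- mapIds(org.Hs.eg.db,
                 keys = genes,
                 keytype = "SYMBOL",
                 column = "ENTREZID",
                 multiVals = "first")

# Drop symbols that could not be mapped to an Entrez ID.
entrez <- entrez[!is.na(entrez)]

# Look up start positions by Entrez ID. ifnotfound = NA returns NA for an
# unknown key instead of raising the "value for ... not found" error.
starts <- mget(unname(entrez), org.Hs.egCHRLOC, ifnotfound = NA)
```

A gene can have several entries in `starts`, and positions on the antisense strand are reported with a leading minus sign, so the result probably needs some filtering before extracting upstream sequence.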

motif Bioconductor TF R • 156 views