I have three groups of gene sets (Control, Treat1, Treat2). To predict TFBSs for each group, I ran FIMO (Find Individual Motif Occurrences) from the MEME Suite on each one. For the individual transcription factors, I used HOCOMOCO motifs with their official TF names.
From the FIMO results I obtained multiple GRanges objects. As there are 763 TFs in HOCOMOCO, I will have around 3763 per promoter sequence.
So my questions are:
- Would it be ideal to merge all the datasets from each group?
- If yes, how should I merge these multiple GRanges objects?
- For interpreting the FIMO results, would it be wise to compare motif occurrences?
- Is there a better way to compare TFBS predictions between different groups of gene sets using only their names (not ChIP data)?
- The targeted sequences span 5 kb upstream to 3 kb downstream of the TSS.
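For clarity, the window in the last bullet can be sketched as follows. This is only a minimal illustration with made-up TSS coordinates (`tss`, `geneA`, and `geneB` are hypothetical), assuming the strand-aware `promoters()` function from GenomicRanges is used to build the ranges:

```r
library(GenomicRanges)

# Hypothetical 1-bp TSS positions; real ones would come from a gene annotation
tss <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(100000, 200000), width = 1),
  strand   = c("+", "-"),
  gene     = c("geneA", "geneB")
)

# promoters() is strand-aware: "upstream" is measured from the 5' end of each range
windows <- promoters(tss, upstream = 5000, downstream = 3000)

# Each window covers 5000 bp upstream plus 3000 bp starting at the TSS
width(windows)
```

Ranges built this way could then be passed to `getSeq()` in the same way as `found_PR` in the code in this post.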
Example code for FIMO:
    library(BSgenome.Hsapiens.UCSC.hg38)
    library(Biostrings)
    library(magrittr)
    library(memes)

    # Extract the promoter sequences and write them to a FASTA file
    dummyseq <- getSeq(BSgenome.Hsapiens.UCSC.hg38, found_PR)
    writeXStringSet(dummyseq, file = "DummySeq.fasta")

    foo_dummy <- MotifDb::MotifDb %>%
      # Query the database for the HOCOMOCO motif using its gene name
      MotifDb::query(HOCOMOCO$`Transcription factor`) %>%
      # Convert from MotifDb format to universalmotif format
      universalmotif::convert_motifs() %>%
      .[] # The result is a list, to simplify the object, return it as a single universalmotif

    # The FASTA path has to be passed as a quoted string
    Foo_dummy_Fimo_result_CT <- runFimo("DummySeq.fasta", foo_dummy)
    class(Foo_dummy_Fimo_result_CT)  # GRanges
    length(Foo_dummy_Fimo_result_CT) # 1428
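To make the merging and motif-occurrence questions concrete, here is a minimal sketch with toy data. It is an assumption about what such a merge could look like, not an established workflow: `make_hits` and the toy motif names are hypothetical, and the `motif_id` metadata column is assumed to match what `runFimo()` returns.

```r
library(GenomicRanges)

# Toy stand-ins for the per-group FIMO results (hypothetical coordinates and motifs)
make_hits <- function(starts, motifs) {
  GRanges("chr1", IRanges(start = starts, width = 10), motif_id = motifs)
}
hits <- list(
  Control = make_hits(c(10, 50, 90), c("TF_A", "TF_B", "TF_A")),
  Treat1  = make_hits(c(20, 60),     c("TF_A", "TF_C")),
  Treat2  = make_hits(30,            "TF_B")
)

# Tag every hit with its group so the origin survives the merge
tagged <- Map(function(gr, g) {
  mcols(gr)$group <- rep(g, length(gr))
  gr
}, hits, names(hits))

# Concatenate the tagged objects into a single GRanges
merged <- do.call(c, unname(tagged))

# Motif occurrences per group as a motif-by-group contingency table
counts <- table(motif = merged$motif_id, group = merged$group)
counts
```

Keeping the group as a metadata column means the merged object can still be split back per group, and the count table is one simple basis for comparing motif occurrences. Whether raw counts are a fair comparison when the groups contain different numbers of genes is a separate matter.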