2.2 years ago
berlin
I have three different groups of gene sets (Control, Treat1, Treat2). To predict TFBS for each group, I ran FIMO (Find Individual Motif Occurrences) from the MEME Suite on each group. For the individual transcription factors, I used the HOCOMOCO data with their official TF names.
From the FIMO results I have obtained multiple GRanges objects. As there are 763 TFs in HOCOMOCO, I will end up with around 3 × 763 GRanges objects, one per TF for each group's promoter sequences.
So my questions are:
- Would it be ideal to merge all the datasets from each group?
- If yes, how should I merge these multiple GRanges objects?
- For interpreting the FIMO results, would it be wise to compare motif occurrences?
- Is there a better way to compare TFBS predictions between different groups of gene sets using only the gene names (not ChIP data)?
- Note: the target sequences span 5 kb upstream to 3 kb downstream of the TSS.
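To make the motif occurrence comparison more concrete, here is a minimal sketch of what I have in mind, using only base R. The counts are made-up toy numbers and the helper `compare_to_control` is a hypothetical name; the idea is to count, per group, how many promoters have at least one FIMO hit for a given TF and test a treatment group against Control with Fisher's exact test. I am not sure this is the best approach.

```r
# Toy counts (hypothetical numbers, for illustration only):
# promoters with at least one FIMO hit for one TF, per group.
hits <- data.frame(
  group       = c("Control", "Treat1", "Treat2"),
  n_with_hit  = c(12, 30, 14),
  n_promoters = c(100, 100, 100)
)

# Hypothetical helper: compare one treatment group against Control
# with Fisher's exact test on a 2 x 2 table (hit / no hit).
compare_to_control <- function(hits, treat) {
  ctl <- hits[hits$group == "Control", ]
  trt <- hits[hits$group == treat, ]
  tab <- matrix(
    c(trt$n_with_hit, trt$n_promoters - trt$n_with_hit,
      ctl$n_with_hit, ctl$n_promoters - ctl$n_with_hit),
    nrow = 2, byrow = TRUE,
    dimnames = list(c(treat, "Control"), c("hit", "no_hit"))
  )
  fisher.test(tab)
}

res_t1 <- compare_to_control(hits, "Treat1")
res_t1$p.value   # p-value for Treat1 vs Control
res_t1$estimate  # conditional odds ratio estimate
```

With 763 TFs tested this way, the p-values would presumably need multiple-testing correction, for example with `p.adjust(p_values, method = "BH")`.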
Example code for FIMO:
library(BSgenome.Hsapiens.UCSC.hg38)
library(Biostrings)
library(magrittr)
library(memes)
# Extract the promoter sequences and save them as FASTA
dummyseq <- getSeq(BSgenome.Hsapiens.UCSC.hg38, found_PR)
writeXStringSet(dummyseq, filepath = "DummySeq.fasta")
# Query MotifDb for the HOCOMOCO motif using its gene name
foo_dummy <- MotifDb::MotifDb %>%
  MotifDb::query(HOCOMOCO$`Transcription factor`[1]) %>%
  universalmotif::convert_motifs() %>%  # convert from MotifDb format to universalmotif format
  .[[1]]  # the result is a list; keep the first entry as a single universalmotif
# Run FIMO on the FASTA file (the path must be a quoted string)
Foo_dummy_Fimo_result_CT <- runFimo("DummySeq.fasta", foo_dummy)
class(Foo_dummy_Fimo_result_CT)   # GRanges
length(Foo_dummy_Fimo_result_CT)  # 1428
In the end, I converted each GRanges object to a data frame with the annoGR2DF function and combined them with rbind. It works!
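For anyone with the same problem, a possible alternative is to tag each GRanges object with its TF name and group and combine them before converting, so that only one conversion is needed. This is a minimal sketch assuming the GenomicRanges package; the objects `fimo_tf_a` and `fimo_tf_b` are toy stand-ins for runFimo() output with hypothetical coordinates, and the column names `tf` and `group` are my own choice.

```r
library(GenomicRanges)

# Toy stand-ins for runFimo() output (hypothetical coordinates).
fimo_tf_a <- GRanges("chr1", IRanges(c(100, 500), width = 10), strand = "+")
fimo_tf_b <- GRanges("chr1", IRanges(300, width = 12), strand = "-")

results <- list(TF_A = fimo_tf_a, TF_B = fimo_tf_b)

# Tag each result with its TF name and group before combining,
# so the origin of every hit survives the merge.
tagged <- lapply(names(results), function(tf) {
  gr <- results[[tf]]
  gr$tf <- tf
  gr$group <- "Control"
  gr
})

# Concatenate all GRanges objects into one, then convert once.
merged <- do.call(c, tagged)
merged_df <- as.data.frame(merged)

# Number of motif hits per TF in the merged table.
table(merged_df$tf)
```

Repeating this for Treat1 and Treat2 and concatenating the three merged objects would give one table with a `group` column to compare on.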