Tutorial: Finding individual motif occurrences with FIMO from the MEME suite
5.4 years ago
ATpoint 82k

A common task, frequently asked about here on Biostars, is to determine whether a given DNA sequence contains certain motifs. This can be done with Find Individual Motif Occurrences (FIMO) from the MEME suite. In this example, we scan a stretch of DNA around the first exon of the human BCL6 gene for occurrences of all motifs in the JASPAR vertebrate core collection.

Coordinates of the query sequence (hg38): chr3:187744307-187746589

## Get JASPAR motifs (vertebrate non-redundant core collection) in meme format:
wget http://jaspar.genereg.net/download/CORE/JASPAR2018_CORE_vertebrates_non-redundant_pfms_meme.zip

## Unzip:
unzip JASPAR2018_CORE_vertebrates_non-redundant_pfms_meme.zip
cd JASPAR2018_CORE_vertebrates_non-redundant_pfms_meme

## Combine into one file:
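## (combined.meme itself is excluded so that the command can be re-run safely)
find ./ -maxdepth 1 -name "*.meme" ! -name "combined.meme" | xargs cat > combined.meme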

## Install fimo (part of MEME):
conda install -c bioconda meme

## If fimo complains about libiconv libraries, also install that manually:
conda install -c conda-forge libiconv

## Run fimo (output is written to the fimo_out directory):
fimo --parse-genomic-coord combined.meme input.fa
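
To sanity-check the combined motif file used in the command above, one could compare the number of input files with the number of MOTIF lines in the combined file. The sketch below is only an illustration: it runs in a temporary directory on two toy stand-in files (hypothetical names, not real JASPAR content), and it assumes that each downloaded file holds exactly one motif, in which case the two counts should agree. It uses the same find/xargs pattern, with the output file explicitly excluded.

```shell
# Throwaway directory with two toy stand-in files (NOT real JASPAR data)
workdir=$(mktemp -d)
printf 'MEME version 4\n\nMOTIF toy_motif_1\n' > "$workdir/toy1.meme"
printf 'MEME version 4\n\nMOTIF toy_motif_2\n' > "$workdir/toy2.meme"

# Combine the files; combined.meme is excluded so it is never read as input
( cd "$workdir" &&
  find . -maxdepth 1 -name "*.meme" ! -name "combined.meme" | xargs cat > combined.meme )

# Count input files and motif records (lines starting with "MOTIF")
n_files=$(find "$workdir" -maxdepth 1 -name "*.meme" ! -name "combined.meme" | wc -l | tr -d ' ')
n_motifs=$(grep -c '^MOTIF' "$workdir/combined.meme")
echo "files: $n_files, motifs: $n_motifs"

# Clean up the temporary directory
rm -rf "$workdir"
```

On the real data, the last two counting commands would be run inside the unzipped JASPAR directory instead.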

The input.fa here looks like:

>chr3:187744307-187746589
(sequence...)

When the genomic coordinates of the sequence are given in the FASTA header in the form chr:start-end (1-based coordinates, as in the header above) and fimo is run with the --parse-genomic-coord option, the resulting GFF file reports the coordinates of each motif occurrence in the genome rather than positions relative to the input sequence.
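
To make the header convention concrete, the following sketch checks a header against the shape used in input.fa above (name, colon, start, hyphen, end) and derives the interval length. The regular expression and the variable names are assumptions for illustration, not something fimo provides, and the length calculation relies on the 1-based, inclusive convention stated above.

```shell
# Example header, taken from input.fa above
header='>chr3:187744307-187746589'

# Assumed pattern: '>' + sequence name + ':' + start + '-' + end
if printf '%s\n' "$header" | grep -Eq '^>[^:[:space:]]+:[0-9]+-[0-9]+$'; then
    header_ok=yes
else
    header_ok=no
fi

# Split the header into its parts with POSIX parameter expansion
region=${header#>}      # chr3:187744307-187746589
chrom=${region%%:*}     # chr3
range=${region#*:}      # 187744307-187746589
start=${range%-*}       # 187744307
end=${range#*-}         # 187746589

# Length of a 1-based, inclusive interval
length=$((end - start + 1))
echo "$header_ok $chrom $start $end $length"
```

For the example region this gives a length of 2283 bp, which should match the number of bases in the sequence below the header.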

Check the output in GFF format:

head fimo_out/fimo.gff

##gff-version 3
chr3    fimo    nucleotide_motif    187745593   187745603   43.9    -   .   Name=MA0002.2_chr3-;Alias=RUNX1;ID=MA0002.2-RUNX1-1-chr3;pvalue=4.11e-05;qvalue= 0.177;sequence=TCTTGTGGCTT;
chr3    fimo    nucleotide_motif    187746233   187746243   40.4    +   .   Name=MA0002.2_chr3+;Alias=RUNX1;ID=MA0002.2-RUNX1-2-chr3;pvalue=9.11e-05;qvalue= 0.196;sequence=GTTTGTGGTGT;
chr3    fimo    nucleotide_motif    187744975   187744985   41.1    +   .   Name=MA0003.3_chr3+;Alias=TFAP2A;ID=MA0003.3-TFAP2A-1-chr3;pvalue=7.81e-05;qvalue= 0.323;sequence=CCCCCCAAGCA;
chr3    fimo    nucleotide_motif    187745763   187745774   41.9    +   .   Name=MA0018.3_chr3+;Alias=CREB1;ID=MA0018.3-CREB1-1-chr3;pvalue=6.41e-05;qvalue= 0.146;sequence=TGTGACGTCGGC;
chr3    fimo    nucleotide_motif    187745763   187745774   41.9    -   .   Name=MA0018.3_chr3-;Alias=CREB1;ID=MA0018.3-CREB1-2-chr3;pvalue=6.41e-05;qvalue= 0.146;sequence=GCCGACGTCACA;
chr3    fimo    nucleotide_motif    187746240   187746250   50.7    -   .   Name=MA0025.1_chr3-;Alias=NFIL3;ID=MA0025.1-NFIL3-1-chr3;pvalue=8.51e-06;qvalue= 0.0387;sequence=TTACGTAACAC;
chr3    fimo    nucleotide_motif    187746378   187746388   40.5    +   .   Name=MA0025.1_chr3+;Alias=NFIL3;ID=MA0025.1-NFIL3-2-chr3;pvalue=8.97e-05;qvalue= 0.204;sequence=ATATGTAACAA;
chr3    fimo    nucleotide_motif    187745661   187745670   40.4    -   .   Name=MA0028.2_chr3-;Alias=ELK1;ID=MA0028.2-ELK1-1-chr3;pvalue=9.09e-05;qvalue= 0.412;sequence=ACCGGAACCT;
chr3    fimo    nucleotide_motif    187745215   187745225   47.4    +   .   Name=MA0032.2_chr3+;Alias=FOXC1;ID=MA0032.2-FOXC1-1-chr3;pvalue=1.81e-05;qvalue= 0.0779;sequence=TAAATAAATAT;
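
Since the ninth GFF column carries the p-value and q-value of each hit, the output can be filtered with awk. The sketch below is one possible way to do this and is not part of fimo: the 0.05 q-value cutoff and the variable names are assumptions, and the two input records are copied from the output above so that the example is self-contained.

```shell
# Two records copied from the output above, written tab-separated as in a GFF file
gff=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr3 fimo nucleotide_motif 187746240 187746250 50.7 - . \
  'Name=MA0025.1_chr3-;Alias=NFIL3;ID=MA0025.1-NFIL3-1-chr3;pvalue=8.51e-06;qvalue= 0.0387;sequence=TTACGTAACAC;' \
  chr3 fimo nucleotide_motif 187745661 187745670 40.4 - . \
  'Name=MA0028.2_chr3-;Alias=ELK1;ID=MA0028.2-ELK1-1-chr3;pvalue=9.09e-05;qvalue= 0.412;sequence=ACCGGAACCT;')

# Assumed cutoff; adjust as needed
qmax=0.05

hits=$(printf '%s\n' "$gff" | awk -F'\t' -v qmax="$qmax" '
  BEGIN { OFS = "\t" }
  /^#/ { next }                       # skip header and comment lines
  {
    tf = ""; q = ""
    n = split($9, attr, ";")          # attributes are separated by ";"
    for (i = 1; i <= n; i++) {
      if (attr[i] ~ /^Alias=/)  { tf = attr[i]; sub(/^Alias=/, "", tf) }
      if (attr[i] ~ /^qvalue=/) { q = attr[i]; sub(/^qvalue= */, "", q) }
    }
    # keep hits at or below the cutoff: factor, chromosome, start, end, strand, q-value
    if (q != "" && q + 0 <= qmax + 0) print tf, $1, $4, $5, $7, q
  }')
echo "$hits"
```

With these two records only the NFIL3 hit passes the cutoff. On real results, the same awk program could read fimo_out/fimo.gff directly instead of the variable.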