We are looking into the possibility of off-target mapping for some sequences that were added to our alignment reference. We are only concerned about off-target mapping to genes (including introns). We are also only concerned about off-target mapping in mouse.

To check this, I want to run a BLAST search on ~1000 sequences that mapped to our added sequence, restricted to the mouse genome and to genes (including introns). I have a FASTA file for mm10 and a GTF file describing the genes, but I am unsure how to feed these into BLAST+ (the BLAST command-line tools) so that I know I am searching only genes.


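You can build a BLAST database from your mm10 FASTA file with makeblastdb and align your sequences against that database. Use -outfmt 6 to get tabular output, then convert the tabular BLAST+ output to BED format (see "Convert blast tabular output to bed format"). Finally, intersect the BED file of hits with a BED file of gene models. If the gene models are whole-gene intervals rather than exons, hits in introns are counted as well.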
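The conversion step can be sketched in Python as below. This is only a minimal illustration, assuming the default -outfmt 6 column order (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names and the file names hits.tsv and hits.bed are made up for the example. BLAST reports 1-based inclusive coordinates and gives sstart > send for minus-strand hits, while BED uses 0-based half-open intervals with start < end, so the coordinates are reordered and shifted.

```python
def blast6_to_bed(line):
    """Convert one default -outfmt 6 line into a 6-column BED line."""
    fields = line.rstrip("\n").split("\t")
    qseqid, sseqid = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    bitscore = fields[11]

    # Minus-strand hits are reported with sstart > send.
    strand = "+" if sstart <= send else "-"
    # BLAST is 1-based inclusive; BED is 0-based half-open.
    start = min(sstart, send) - 1
    end = max(sstart, send)

    # The bit score goes in the BED score column as-is (an assumption;
    # strict BED expects an integer from 0 to 1000 there).
    return "\t".join([sseqid, str(start), str(end), qseqid, bitscore, strand])


def convert_file(in_path, out_path):
    """Convert a whole tabular BLAST file, skipping blank and comment lines."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip() and not line.startswith("#"):
                dst.write(blast6_to_bed(line) + "\n")


if __name__ == "__main__":
    import os

    # Hypothetical file names; run only if the input is present.
    if os.path.exists("hits.tsv"):
        convert_file("hits.tsv", "hits.bed")
```

The resulting hits.bed could then be passed to a tool such as bedtools intersect together with the gene models. The chromosome names in the BLAST database and in the gene annotation need to match (for example "chr1" in both) for the intersection to find anything.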

A command line manual for blast+ is available here.

