I have a predicted protein sequence and want to find where it is located in a genome, along with any other similar sequences in that genome. How would I go about doing this, and which tools would be best for the task?
If we are talking about organisms without introns, it is very simple. One tool to use is tblastn, which compares a protein query against a nucleotide database translated in all six reading frames and can be accessed from the main BLAST page. You can also download the BLAST package and run the search locally. Other sequence search programs, such as DIAMOND, can perform similar types of analysis.
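As a rough sketch of what a local run might look like, the Python snippet below builds the `makeblastdb` and `tblastn` command lines and parses the default tabular output (`-outfmt 6`). The file names, the database name `genome_db`, and the E-value cutoff of 1e-5 are placeholders chosen for illustration, not recommendations, and the BLAST+ executables are assumed to be installed and on the PATH.

```python
import subprocess

# Default column order of BLAST tabular output (-outfmt 6).
OUTFMT6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def makeblastdb_command(genome_fasta, db_name):
    """Command that builds a nucleotide BLAST database from a genome FASTA."""
    return ["makeblastdb", "-in", genome_fasta, "-dbtype", "nucl", "-out", db_name]


def tblastn_command(protein_fasta, db_name, evalue=1e-5):
    """Command that searches a protein query against the translated genome."""
    return [
        "tblastn", "-query", protein_fasta, "-db", db_name,
        "-evalue", str(evalue), "-outfmt", "6",
    ]


def parse_outfmt6(text):
    """Turn tabular BLAST output into a list of dicts, one per hit (HSP)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(OUTFMT6_FIELDS, line.split("\t")))
        for key in ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"):
            row[key] = int(row[key])
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        # In tblastn output, sstart > send means the hit is on the reverse strand.
        row["strand"] = "+" if row["sstart"] <= row["send"] else "-"
        hits.append(row)
    return hits


def run_search(genome_fasta, protein_fasta, db_name="genome_db"):
    """Build the database, run tblastn, and return the parsed hits."""
    subprocess.run(makeblastdb_command(genome_fasta, db_name), check=True)
    result = subprocess.run(
        tblastn_command(protein_fasta, db_name),
        check=True, capture_output=True, text=True,
    )
    return parse_outfmt6(result.stdout)
```

Calling `run_search("genome.fna", "protein.faa")` (hypothetical file names) would return one entry per matching region. In an intron-free genome, the strongest hit is typically the gene itself, and weaker hits elsewhere are candidates for the similar sequences asked about.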
It is more complicated for organisms with discontinuous genes, but it is the same general principle.
Thank you for your answer! If the organism I am studying has introns, will tblastn still be sufficient?
It may or may not, depending on the exact organism. For example, baker's yeast has introns but not many, and most genes in its genome don't have any.
Generally speaking, tblastn is not the best choice for genes that are split into many short pieces, because each short exon tends to show up as its own weak, separate hit. However, you didn't provide enough information for me to give definitive advice. You should be able to research this yourself, as this is a well-understood procedure.
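To illustrate the problem with genes in many short pieces: the per-exon hits have to be stitched together after the search. The sketch below groups hits that lie on the same sequence and strand within a maximum gap. The function name, the `max_gap` default of 10,000 bp, and the dict keys (`sseqid`, `sstart`, `send`) are assumptions made for this example. It is a crude heuristic, not a replacement for a splice-aware alignment method.

```python
def group_hits_into_loci(hits, max_gap=10_000):
    """Group hits on the same sequence and strand that lie within max_gap bp.

    Each hit is a dict with 'sseqid', 'sstart' and 'send' (1-based genome
    coordinates; sstart > send marks the reverse strand).
    """
    def key(hit):
        strand = "+" if hit["sstart"] <= hit["send"] else "-"
        return hit["sseqid"], strand

    def left(hit):
        return min(hit["sstart"], hit["send"])

    def right(hit):
        return max(hit["sstart"], hit["send"])

    loci = []
    # Sort so hits from one sequence/strand are adjacent and ordered by position.
    for hit in sorted(hits, key=lambda h: (key(h), left(h))):
        last = loci[-1] if loci else None
        if (
            last is not None
            and last["key"] == key(hit)
            and left(hit) - last["end"] <= max_gap
        ):
            # Close enough to the previous hit: extend the current locus.
            last["end"] = max(last["end"], right(hit))
            last["hits"].append(hit)
        else:
            # Different sequence/strand or too far away: start a new locus.
            loci.append({"key": key(hit), "start": left(hit),
                         "end": right(hit), "hits": [hit]})
    return loci
```

Each resulting locus is a candidate gene region. Note that this grouping does not check that the hits cover the query protein in a consistent order, so the regions it reports would still need to be inspected or realigned.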