Hi Marek,
I need to point out some misleading information in the other reply about "specifying depth without over-broadening". The level of a term in GO does not correlate with its information content; for one thing, GO terms have no defined level, as mentioned in our FAQ:
How can I calculate the “level” of a GO term?
GO terms do not occupy strict fixed levels in the hierarchy. Because
GO is structured as a graph, terms would appear at different ‘levels’
if different paths were followed through the graph. This is especially
true if one mixes the different relations used to connect terms.
A more informative metric would be the information content of the node
based on annotations. See, for example, the work of Alterovitz et
al.
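To make the FAQ's two points concrete, here is a minimal sketch on a toy graph. The term IDs, the annotations, and the helper names (`depths`, `information_content`) are all made up for illustration, and the information content formula used, IC(t) = -log2(fraction of annotated gene products at or below t), is a common annotation-frequency definition, not necessarily the exact method of Alterovitz et al.

```python
import math

# Toy DAG: child -> set of parents (hypothetical IDs, not real GO terms).
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"C", "B"},  # reachable as root>B>D and as root>A>C>D
}

# Made-up direct annotations: term -> gene products annotated to it.
DIRECT = {"A": {"g1"}, "B": {"g2", "g3"}, "C": {"g4"}, "D": {"g5"}}


def depths(term):
    """Return every path length from `term` up to the root."""
    if not PARENTS[term]:
        return {0}
    return {d + 1 for parent in PARENTS[term] for d in depths(parent)}


def ancestors(term):
    """Return all ancestors of `term` (excluding the term itself)."""
    found, stack = set(), list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def information_content(term):
    """IC(t) = -log2(p(t)), where p(t) is the share of annotated gene
    products attached to `term` or any of its descendants."""
    genes = {t: set() for t in PARENTS}
    for annotated_term, products in DIRECT.items():
        # An annotation to a term also counts for all of its ancestors.
        for t in {annotated_term} | ancestors(annotated_term):
            genes[t] |= products
    if not genes[term]:
        return math.inf  # no annotations at or below this term
    return -math.log2(len(genes[term]) / len(genes["root"]))


print(depths("D"))  # more than one "level" for the same term
for t in PARENTS:
    print(t, round(information_content(t), 3))
```

In this toy graph, term D sits at two different depths depending on the path taken, while its information content is a single number driven by how rarely it is annotated.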
GO was not designed to provide a definitive answer at the start of a project; it is a controlled vocabulary that lets you take information from some piece of work (here, your BLAST alignments) and use it to guide your next steps. For this question, the better your alignment, the more confidence I'd have in assigning a GO term to that domain or gene product. Comparing your sequence to the match with multiple methods is one way to increase that confidence. In fact, evaluating BLAST matches is done so frequently in GO that there is a specific evidence code for it, ISS (Inferred from Sequence or Structural Similarity), which model organism databases (MODs) and other professional curators use when that is the method behind an "official" GO term assignment. Here's an example from the GO curation guidelines:
An ISS annotation is often based on more than just one type of
sequence-based evidence. Often, a host of searches are performed for
any given query protein. These searches might include BLAST, profile
HMMs, TMHMM, SignalP, PROSITE, InterPro, etc. Evaluation of output
from these search tools (bear in mind that every search may not yield
results for every protein) leads an annotator to a particular ISS
annotation for a particular protein. For example, a BLAST search might
reveal that a query protein matches an experimentally characterized
protein from another species at 50% identity over the full lengths of
both proteins. After reading literature about the match protein, the
curator sees that the match protein is known to contain a domain
located in the plasma membrane and another domain that extends into
the cytoplasm. It is also known from the literature that the
experimentally characterized match protein requires the binding of ATP
to function. TMHMM analysis of the query protein predicts several
membrane spanning regions in one half of the protein (consistent with
location in a membrane). In addition there are PROSITE and Pfam
results which reveal the presence of an ATP-binding domain in the
other half of the protein which TMHMM predicts to be cytoplasmic.
These four search results taken together point to a probable
identification of the query protein as having the function of the
match protein.
Lastly, you mention you "cannot really use go-slim because I am working with non-annotated sequences." You absolutely can use a GO slim (aka GO subset), such as the generic GO slim. The point of a slim is to summarize a set of GO terms: think of it as a set of relatively high-level buckets that, collectively, contain (nearly) all the terms in the ontology and help you simplify a large list of terms. Once you have a set of terms from your alignments, you can use the full ontology to map each of your terms up to terms in the slim. I recommend the generic slim, but you might find that an option like the yeast or plant slim more closely reflects your organism.
Grab the correct slim file at https://geneontology.org/docs/go-subset-guide/
You can get the ontology file from https://geneontology.org/docs/download-ontology/
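The mapping step itself is simple enough to sketch. The example below uses a tiny hand-written ontology with invented term names and an invented two-term slim; `ancestors_and_self` and `map_to_slim` are hypothetical helpers, not functions from any GO library. With real data you would load the parent relationships from the ontology file and the slim term IDs from the slim file, and you should decide which relations (is_a, part_of, and so on) you want to follow.

```python
from collections import Counter

# Toy ontology: child -> parents (invented names, not real GO terms).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "transport": ["root"],
    "lipid_localization": ["root"],
    "lipid_metabolism": ["metabolism"],
    "lipid_transport": ["transport", "lipid_localization"],
    "fatty_acid_oxidation": ["lipid_metabolism"],
}

# Invented slim: the high-level "buckets" to summarize into.
SLIM = {"metabolism", "transport"}


def ancestors_and_self(term, parents):
    """Return `term` plus every term reachable by walking up the graph."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def map_to_slim(term, slim, parents):
    """Return the most specific slim terms at or above `term`."""
    hits = ancestors_and_self(term, parents) & slim
    # Drop any slim hit that is itself an ancestor of another hit.
    redundant = set()
    for hit in hits:
        redundant |= ancestors_and_self(hit, parents) - {hit}
    return hits - redundant


# Terms you might have collected from your alignments (made up here).
my_terms = ["fatty_acid_oxidation", "lipid_metabolism", "lipid_transport"]
summary = Counter(
    bucket for term in my_terms for bucket in map_to_slim(term, SLIM, PARENTS)
)
print(summary)
```

A term with no slim term above it maps to an empty set here, so it is worth counting those separately to see how much of your list the chosen slim actually covers.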
Let me know if you need more information.