This distinction is about categories of machine learning algorithms. While clustering is considered a subcategory of "machine learning," what you're doing here is mostly linear algebra: distance computations on a numeric matrix.
Pre-filtering does not affect the category: the algorithm sees only the data, which in this case is an N-dimensional geometric space from which some sort of sample-wise distance is calculated. You can influence the way that clustering happens within pheatmap by using a different distance metric (e.g. "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski") or by changing algorithm parameters. Note that pheatmap clusters hierarchically via hclust (its clustering_method argument, "complete" by default); the kmeans_k argument is a separate option that pre-aggregates rows with k-means before plotting.
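As a minimal sketch of where those knobs live, assuming a numeric matrix `mat` (e.g. genes × samples); the particular metric and method choices below are illustrative, not recommendations:

```r
library(pheatmap)

# Hypothetical example data: 20 genes x 10 samples
mat <- matrix(rnorm(200), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:10)))

# Swap the distance metric and the hierarchical joining method:
pheatmap(mat,
         clustering_distance_rows = "manhattan",
         clustering_distance_cols = "euclidean",
         clustering_method        = "ward.D2")  # passed through to hclust()

# Separately, kmeans_k pre-aggregates rows into k clusters before plotting:
pheatmap(mat, kmeans_k = 5)
```

The distance arguments accept the same metric names as dist(), and clustering_method accepts the same method names as hclust().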
You can also read about the different hierarchical joining methods in the documentation for hclust, the function underlying pheatmap's clustering:
> Ward's minimum variance method aims at finding compact, spherical clusters. The complete linkage method finds similar clusters. The single linkage method (which is closely related to the minimal spanning tree) adopts a 'friends of friends' clustering strategy. The other methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods. Note however, that methods "median" and "centroid" are not leading to a monotone distance measure, or equivalently the resulting dendrograms can have so called inversions or reversals which are hard to interpret, but note the trichotomies in Legendre and Legendre (2012).
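To see how these joining methods differ in practice, here is a small sketch that clusters the same hypothetical distance matrix three ways (example data and variable names are mine, not from the question):

```r
# Hypothetical example data: 20 rows of random values
mat <- matrix(rnorm(200), nrow = 20)

d <- dist(mat, method = "euclidean")  # sample-wise distance matrix

hc_complete <- hclust(d, method = "complete")  # pheatmap's default
hc_ward     <- hclust(d, method = "ward.D2")   # compact, spherical clusters
hc_single   <- hclust(d, method = "single")    # 'friends of friends' strategy

# The resulting dendrograms can join the same points in quite different orders:
par(mfrow = c(1, 3))
plot(hc_complete); plot(hc_ward); plot(hc_single)
```

Because only the method argument changes, any difference between the three dendrograms is due to the joining strategy itself, not the data or the distance metric.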
Supervised, in most machine learning contexts, means using prior information (e.g. labeled training data) to inform a decision about new data, given some category of algorithm.
Unsupervised means using only the data itself to make some decisions about the data, again given some category of algorithm.
Don't worry too much about this distinction for practical purposes, unless you're curious about the subject matter itself.