Subset of transcripts - Do I need to re-scale the TPM values?
I have run Kallisto using the full human transcriptome dataset GRCh37-75. This dataset includes both coding and non-coding transcripts. I would like to further analyse only the coding ones. Since the TPM measurement takes into account all of the transcripts in the dataset, do I need to re-scale/re-calculate the TPM values for each coding transcript using only the coding subset? Is there a difference between this approach and running Kallisto only on those coding transcripts?

Not an expert, but my guess is no: you would not need to rescale just because you are looking at a subset. What if you later want to compare against the non-coding transcripts? Then you'd have to do it all again.

Mmmm.. that's true, but I'm not interested in the non-coding transcripts at all. That's why I also asked whether running Kallisto only on the coding transcripts would give a different result: if you use only the coding set of transcripts when running Kallisto, the denominator in the TPM calculation changes.

The point of RPKM/FPKM/TPM is normalisation, i.e. to make different regions of the genome comparable. Whether you include or exclude a region (e.g. non-coding transcripts) in the normalisation calculation is not important if you are not going to use it anyway (many studies exclude other regions, such as blacklisted or mitochondrial regions). After excluding regions, each remaining TPM value will simply be a greater proportion of the whole, but the relationship between those values (i.e. the fold change between transcripts within a sample) will remain the same.
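
To make the rescaling argument concrete, here is a minimal Python sketch, assuming you already have per-transcript TPM values for one sample and a list of the coding transcript IDs. The transcript names, the numbers, and the helper `rescale_tpm` are all made up for illustration; none of this is part of Kallisto.

```python
def rescale_tpm(tpm, keep):
    """Renormalise TPM values over a subset so that the subset sums to 1e6.

    tpm  : dict mapping transcript ID -> TPM from the full quantification
    keep : iterable of transcript IDs to retain (e.g. the coding ones)
    """
    subset = {tx: tpm[tx] for tx in keep}
    total = sum(subset.values())
    if total == 0:
        raise ValueError("the retained transcripts have zero total TPM")
    # Every retained value is multiplied by the same factor (1e6 / total).
    return {tx: value * 1e6 / total for tx, value in subset.items()}


# Hypothetical TPM values for a single sample (they sum to 1e6).
tpm = {
    "coding_A": 400_000.0,
    "coding_B": 100_000.0,
    "noncoding_X": 500_000.0,
}
coding_ids = ["coding_A", "coding_B"]  # assumed list of coding transcript IDs

rescaled = rescale_tpm(tpm, coding_ids)

# Ratio between the two coding transcripts before and after rescaling.
ratio_before = tpm["coding_A"] / tpm["coding_B"]
ratio_after = rescaled["coding_A"] / rescaled["coding_B"]
print(rescaled)
print(ratio_before, ratio_after)
```

Because every retained value is multiplied by the same per-sample factor, the ratios between coding transcripts within that sample do not change. Note that the factor is computed separately for each sample, and that this sketch only rescales existing values; it does not show what re-running Kallisto on a coding-only index would produce.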