I have ~60 WGS and RNA-Seq samples, which should be sufficient to analyze cis-eQTLs but not trans-eQTLs. I am wondering whether it would be reasonable to preselect variants from the VCFs that fall into certain genomic regions, as I am really interested in a particular GWAS category. So after selecting lead plus LD GWAS variants, or lead GWAS variants plus all variants in some +/- window, and running Matrix eQTL, the corrected p-values would become more significant because I am performing fewer tests, and thus I would get more significant cis- and trans-eQTLs? Is this statistically correct to do?
Too bad I only saw this now. Yes, this is a perfectly reasonable approach to take. It may be helpful to also look at (matched or unmatched) portions of the genome that you expect to be irrelevant, in order to assess how your test statistics are distributed where no true signal is expected.
Having said that, there is nothing wrong with restricting the data that you will analyze to relevant regions.
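As a rough illustration of the control-region check suggested above, here is a minimal sketch in Python. It assumes you already have a list of p-values from tests run on regions you expect to be irrelevant; the function name `inflation_factor` is made up for this example. It computes a genomic-inflation-style ratio: the median 1-df chi-square statistic implied by the p-values, divided by the median expected under the null. A value near 1 suggests the statistics are well calibrated.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def inflation_factor(pvalues):
    """Ratio of observed to expected median chi-square (1 df).

    Each two-sided p-value is converted back to a chi-square statistic
    via z = Phi^{-1}(p / 2). P-values must lie in (0, 1].
    """
    normal = NormalDist()
    chi2 = [normal.inv_cdf(p / 2) ** 2 for p in pvalues]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical input: evenly spread p-values, as expected under the null.
control_pvalues = [(i + 0.5) / 1000 for i in range(1000)]
print(round(inflation_factor(control_pvalues), 2))
```

If the ratio for your control regions is well above 1, the p-values in the regions you care about are probably inflated too, and that is worth sorting out before interpreting any hits.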
Also, it may help to know that this type of logic has a name and is widely used, in case you want to read further. It is called extrinsic filtering, which just means that you are using outside data (in this case prior GWAS results) to filter or trim down your data. You also describe an intrinsic filtering mechanism when you describe limiting your search to variants in LD. This is intrinsic because it represents further filtering based on calculations generated from the data itself.
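To make the window-based filtering concrete, here is a small Python sketch. Everything in it is illustrative: the variant and lead positions, the 5 kb window, the gene count, and the helper names `variants_near_leads` and `bonferroni_threshold` are assumptions, not part of Matrix eQTL. Bonferroni is used only because it is the simplest correction to show how the per-test threshold relaxes as the number of tests drops.

```python
from bisect import bisect_left


def variants_near_leads(variants, leads, window):
    """Keep (chrom, pos) variants within +/- window bp of any lead
    variant on the same chromosome."""
    leads_by_chrom = {}
    for chrom, pos in leads:
        leads_by_chrom.setdefault(chrom, []).append(pos)
    for positions in leads_by_chrom.values():
        positions.sort()

    kept = []
    for chrom, pos in variants:
        positions = leads_by_chrom.get(chrom, [])
        # First lead at or beyond the left edge of the window.
        i = bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            kept.append((chrom, pos))
    return kept


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical data for illustration only.
variants = [("chr1", 100), ("chr1", 5000), ("chr1", 20000), ("chr2", 150)]
leads = [("chr1", 1000)]
kept = variants_near_leads(variants, leads, window=5000)

n_genes = 20000  # assumed number of expressed genes tested
print(len(variants), bonferroni_threshold(0.05, len(variants) * n_genes))
print(len(kept), bonferroni_threshold(0.05, len(kept) * n_genes))
```

The second threshold is less strict than the first simply because fewer variant-gene pairs are tested. This only holds up if the variant list is fixed from the outside information before looking at the expression associations.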