I have ~450 genomic regions of 250 kb each in hg38.
I want to run a GO term analysis on the genes in these regions.
In Python, I used gencode.v38.basic.annotation.gff3.gz filtered on feature='gene'. I get ~30000 genes.
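For reference, the filtering step looks roughly like the sketch below. It is a minimal standard-library version, not my exact script: the function names (`parse_attributes`, `genes_in_regions`, `genes_from_file`) are hypothetical, the regions are assumed to be `(chrom, start, end)` tuples in 1-based inclusive coordinates like the GFF3 itself, and the optional `gene_type` filter is an assumption I added to show how the list could be restricted (for example to protein-coding genes).

```python
import gzip
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 attribute column ('k1=v1;k2=v2') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def genes_in_regions(gff_lines, regions, gene_type=None):
    """Return sorted gene names whose 'gene' feature overlaps any region.

    gff_lines: iterable of GFF3 text lines (1-based, inclusive coordinates).
    regions:   iterable of (chrom, start, end) in the same convention (assumed).
    gene_type: optional filter on the GENCODE 'gene_type' attribute.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    hits = set()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # keep only rows whose feature type is 'gene'
        attrs = parse_attributes(cols[8])
        if gene_type is not None and attrs.get("gene_type") != gene_type:
            continue
        g_start, g_end = int(cols[3]), int(cols[4])
        # Two closed intervals overlap when each starts before the other ends.
        overlaps = any(
            g_start <= r_end and r_start <= g_end
            for r_start, r_end in by_chrom.get(cols[0], [])
        )
        name = attrs.get("gene_name") or attrs.get("ID")
        if overlaps and name:
            hits.add(name)
    return sorted(hits)


def genes_from_file(path, regions, gene_type=None):
    """Convenience wrapper that reads a gzipped GFF3 file from disk."""
    with gzip.open(path, "rt") as handle:
        return genes_in_regions(handle, regions, gene_type)
```

Calling `genes_from_file("gencode.v38.basic.annotation.gff3.gz", regions)` would give every annotated gene overlapping a region, while passing `gene_type="protein_coding"` would keep only the protein-coding ones.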
Using the PANTHER Classification System, I get ~8000 hits.
If I use the UCSC Genome Browser with the NCBI RefSeq track and the RefSeq All table, I get only ~1800 genes.
What is the best way to extract genes for GO term analysis?