I'm trying to retrieve all human membrane protein-coding genes in UniProt. I made this query:
locations:(location:"Membrane [SL-0162]") AND organism:"Homo sapiens (Human) "
But when I look at the list of related genes ("Gene names (primary)") with a simple unique function in Python, I get as many genes as proteins (i.e. 37 557).
That doesn't make sense, since the human genome contains approximately 23 000 genes and membrane protein-coding genes are estimated to represent about 20% of them.
Can anyone see what is going on here?
Thank you for these hints! I'll look deeper at the HUGO database. Meanwhile, after switching to a "distinct" function, I found many "nan" entries in my list that are not treated as NaN. Fixing that, together with your answer, should give me a proper list.
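
For anyone hitting the same "nan" problem, here is a minimal sketch of the cleanup, assuming the UniProt export has been loaded into a pandas DataFrame with a "Gene names (primary)" column. The helper name `distinct_gene_names` and the toy data are only for illustration, not part of my actual script. It drops real NaN values first, then filters out empty strings and the literal text "nan" before counting distinct names.

```python
import numpy as np
import pandas as pd


def distinct_gene_names(df, column="Gene names (primary)"):
    """Return the sorted distinct gene names, ignoring missing values.

    Hypothetical helper: it treats real NaN, empty strings and the
    literal text "nan" (any case) as missing.
    """
    # Drop true missing values, then normalise everything to stripped strings.
    names = df[column].dropna().astype(str).str.strip()
    # Remove placeholders that survived as plain text.
    names = names[~names.str.lower().isin(["", "nan"])]
    return sorted(names.unique())


# Toy data standing in for a UniProt export (accessions are placeholders).
example = pd.DataFrame(
    {
        "Entry": ["A1", "A2", "A3", "A4", "A5", "A6"],
        "Gene names (primary)": ["EGFR", "EGFR", np.nan, "nan", " CD4 ", ""],
    }
)

genes = distinct_gene_names(example)
print(len(genes), genes)
```

On the toy table this leaves two gene names for six protein entries, which is the kind of proteins-to-genes reduction I was expecting from the real list.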