You have run into the problem that in the human genome some gene_names are associated with multiple genomic loci (RF0019 in the link you posted). Since these loci are distinct, each one also has its own gene_id. Last time I checked there were ~100 such gene_names in the human genome, many of them with loci on different chromosomes.
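If you want to check this for your own annotation, a minimal sketch along these lines lists every gene_name that maps to more than one gene_id. It assumes an Ensembl/GENCODE-style GTF with `gene` feature rows and `gene_id`/`gene_name` attributes; the function name and the toy records (including their identifiers) are made up for illustration.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def ambiguous_gene_names(gtf_lines):
    """Return {gene_name: [(gene_id, chromosome), ...]} for every
    gene_name that is attached to more than one gene_id."""
    loci = defaultdict(dict)  # gene_name -> {gene_id: chromosome}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only look at gene-level records
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and "gene_name" in attrs:
            loci[attrs["gene_name"]][attrs["gene_id"]] = fields[0]
    return {
        name: sorted(ids.items())
        for name, ids in loci.items()
        if len(ids) > 1
    }

# Toy records with hypothetical identifiers, not real annotation data.
TOY_GTF = [
    '#!toy-annotation',
    '1\ttoy\tgene\t100\t200\t.\t+\t.\tgene_id "GENE0001"; gene_name "dupRNA";',
    '7\ttoy\tgene\t500\t600\t.\t-\t.\tgene_id "GENE0002"; gene_name "dupRNA";',
    '2\ttoy\tgene\t900\t990\t.\t+\t.\tgene_id "GENE0003"; gene_name "uniqA";',
]

ambiguous = ambiguous_gene_names(TOY_GTF)
for name, hits in ambiguous.items():
    print(name, hits)
```

For a real file you would pass an open file handle instead of `TOY_GTF`, e.g. `ambiguous_gene_names(open("annotation.gtf"))`, where the path is a placeholder.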
I would always analyze the data with gene_ids (!), simply because otherwise you assume that the different loci produce identical products, which might or might not be the case. Furthermore, working at the gene_id level lets you analyze each locus separately, for example its regulation and isoform switches. Lastly, if you want to do any downstream analysis (GO terms, gene-set enrichment analysis, etc.) you should NEVER use gene_names. The problem is that gene_names are in many cases too unspecific: many different gene names can point to the same gene, and a single gene name can point to multiple genes.
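To make the point concrete, here is a small sketch of what happens when counts are summarized by gene_name instead of gene_id. The table, identifiers, and counts are invented for illustration, and `sum_counts` is a hypothetical helper, not part of any package.

```python
from collections import defaultdict

def sum_counts(rows, key):
    """Sum the 'count' field of each row, grouped by the given key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["count"]
    return dict(totals)

# Toy per-gene counts: two different loci share the name "dupRNA".
TOY_COUNTS = [
    {"gene_id": "GENE0001", "gene_name": "dupRNA", "count": 50},
    {"gene_id": "GENE0002", "gene_name": "dupRNA", "count": 0},
    {"gene_id": "GENE0003", "gene_name": "uniqA", "count": 12},
]

by_id = sum_counts(TOY_COUNTS, "gene_id")      # one entry per locus
by_name = sum_counts(TOY_COUNTS, "gene_name")  # loci with the same name are merged

print(by_id)
print(by_name)
```

Grouping by gene_id keeps the expressed locus and the silent locus apart, while grouping by gene_name reports a single value for "dupRNA" and hides the fact that only one of the two loci is expressed.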