How can you find out the coordinates of the PAR1 and PAR2 regions on chromosomes X and Y in the Ensembl human reference genome? As far as I can tell, these regions are masked with Ns on Ensembl (but not UCSC).
In Ensembl, PAR regions are stored as Bio::EnsEMBL::AssemblyExceptionFeature objects. You can fetch these via a Bio::EnsEMBL::DBSQL::AssemblyExceptionFeatureAdaptor.
# get DB adaptor $dba by usual means
my $aefa = $dba->get_AssemblyExceptionFeatureAdaptor();
# get $slice e.g. per chromosome
my @aefs = @{$aefa->fetch_all_by_Slice($slice)};
AssemblyExceptionFeatures also include haplotypes, so you may need to filter the features.
There is a public Ensembl MySQL database at http://www.ensembl.org/info/data/mysql.html and file downloads at http://www.ensembl.org/info/data/ftp/index.html . If you use the MySQL instance, you will need to understand the database schema (which is documented). The Perl API hides the schema details.
If you're just doing one-off queries, why not just go directly to Ensembl? For example, for my favorite test gene CDK2:
http://uswest.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000123374
Clearly shows the genomic position at the top:
Chromosome 12: 56,360,556-56,366,565 forward strand
Now, the hard part (for me) is knowing the Ensembl gene IDs for PAR1 and PAR2, since PAR2 is apparently not an official human gene symbol, and the PAR1 in Entrez Gene doesn't have a match in Ensembl. But presumably you can find your Ensembl Gene ID more easily?
(Incidentally, NCBI Entrez Gene also will give genomic coordinates, and they use the same assembly as Ensembl...)
Thanks Andrew, but when I said PAR1 and PAR2 I was referring to the Pseudo-Autosomal Regions rather than genes (http://en.wikipedia.org/wiki/Pseudoautosomal_region). Apologies for being unclear!
Thanks to the folks below for answers. The PAR regions (as well as haplotypes) are defined in the assembly_exception table of the Ensembl human core MySQL database.
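For anyone who wants to go the SQL route, here is a rough sketch of the kind of query involved. It is written in Python against an in-memory SQLite database so it runs without a connection to Ensembl. The assembly_exception column names follow the documented core schema, but the seq_region table is cut down to two columns, the function names are my own, and every row is a made-up placeholder rather than a real PAR or haplotype coordinate (including the arbitrary choice of Y as the region and X as the exception).

```python
import sqlite3

# Column names follow the documented Ensembl core schema. seq_region is
# simplified to two columns, and all rows are made-up placeholders,
# NOT real PAR or haplotype coordinates.
SCHEMA = """
CREATE TABLE seq_region (seq_region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assembly_exception (
    assembly_exception_id INTEGER PRIMARY KEY,
    seq_region_id INTEGER,
    seq_region_start INTEGER,
    seq_region_end INTEGER,
    exc_type TEXT,
    exc_seq_region_id INTEGER,
    exc_seq_region_start INTEGER,
    exc_seq_region_end INTEGER,
    ori INTEGER
);
"""

# Keep only PAR rows and resolve both seq_region ids to names.
PAR_QUERY = """
SELECT sr.name, ae.seq_region_start, ae.seq_region_end,
       esr.name, ae.exc_seq_region_start, ae.exc_seq_region_end
FROM assembly_exception ae
JOIN seq_region sr ON sr.seq_region_id = ae.seq_region_id
JOIN seq_region esr ON esr.seq_region_id = ae.exc_seq_region_id
WHERE ae.exc_type = 'PAR'
ORDER BY ae.seq_region_start
"""


def build_toy_db():
    """Create an in-memory database holding placeholder rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO seq_region VALUES (?, ?)",
        [(1, "X"), (2, "Y"), (3, "toy_haplotype")],
    )
    conn.executemany(
        "INSERT INTO assembly_exception VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 2, 1, 100, "PAR", 1, 11, 110, 1),        # placeholder PAR row
            (2, 2, 901, 1000, "PAR", 1, 1901, 2000, 1),  # placeholder PAR row
            (3, 3, 1, 50, "HAP", 1, 500, 549, 1),        # placeholder haplotype row
        ],
    )
    return conn


def fetch_par_regions(conn):
    """Return (region, start, end, exc_region, exc_start, exc_end) for PAR rows."""
    return conn.execute(PAR_QUERY).fetchall()


if __name__ == "__main__":
    for row in fetch_par_regions(build_toy_db()):
        print(row)
```

The SELECT uses only plain joins and a WHERE clause, so I would expect it to carry over to the public MySQL server with little change, but I have not checked it against a particular release. The exc_type filter is what separates the PARs from the haplotype rows.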