Ensembl multiple genome alignments: do the sequences match the genome coordinates?
5.3 years ago
Sidi.Ma • 0

Hi, I have tried to use the whole-genome multiple alignment data from Ensembl at ftp://ftp.ensembl.org/pub/current_emf/ensembl-compara/multiple_alignments/12_primates.epo/ to get historical mutation information. But when I downloaded the data from this site, I found a problem. The data format looks like this:

##FORMAT (compara)
##DATE Thu May 10 10:06:16 2018
##RELEASE 92
# Alignments: 12 primates EPO
DB CONNECT    : homo_sapiens_core_92_38
# Region: Homo sapiens chromosome:GRCh38:15:1:101991189:1
DB DISCONNECT : homo_sapiens_core_92_38
# File 5

SEQ homo_sapiens 15 74635419 74640448 -1 (chr_length=101991189)
DB CONNECT    : homo_sapiens_core_92_38
SEQ ancestral_sequences Ancestor_1134_130880 1 5020 1 (chr_length=5020)
SEQ pan_troglodytes 15 55631239 55636258 -1 (chr_length=83230942)
SEQ ancestral_sequences Ancestor_1134_130879 1 5013 1 (chr_length=5013)
SEQ gorilla_gorilla 15 53355233 53360259 -1 (chr_length=80890724)
SEQ ancestral_sequences Ancestor_1134_130878 1 4322 1 (chr_length=4322)
SEQ callithrix_jacchus NTIC01038834.1 130674755 130679467 1 (chr_length=131709439)
TREE (((Hsap_15_74635419_74640448[-]:0.006609454545459997,Ptro_15_55631239_55636258[-]:0.006768113022119993)Aseq_Ancestor_1134_130880_1_5020[+]:0.001756882402000004,
Ggor_15_53355233_53360259[-]:0.008676269113150004)Aseq_Ancestor_1134_130879_1_5013[+]:0.03957027048428999,Cjac_NTIC01038834.1_130674755_130679467[+]:0.05366348520229999)
Aseq_Ancestor_1134_130878_1_4322[+]:0.028514765765095013;
ID 11340000280121
DATA
GGGGGGG
GGGGGGG
CCCCCCC
AAAAAAA
GGGGGGA
GGGGGGG

The number after "SEQ homo_sapiens 15" is the sequence start coordinate. When I retrieved these sequences from the GRCh38 human genome, I found that about half of the human sequences in the DATA blocks were the same as the genome sequences in the intervals given by the coordinates in the block headers, but the others were different. I have checked whether the sequence was on the + or - strand, but the result was no different. What is wrong with these coordinates? Do they have some other problem when used as genome coordinates?
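For reference, the comparison I am attempting can be sketched roughly as below. This is only a minimal sketch under my own assumptions, not an official Ensembl tool: I assume each DATA row is one alignment column with one character per SEQ line (in the same order), that "-" marks a gap, that coordinates are 1-based and inclusive, and that a strand of -1 means the aligned sequence is the reverse complement of the forward-strand genome. The names `SeqHeader`, `parse_emf_block` and `forward_strand_sequence` are hypothetical helpers made up for this illustration.

```python
from typing import List, NamedTuple, Tuple


class SeqHeader(NamedTuple):
    species: str
    region: str
    start: int   # assumed 1-based, inclusive
    end: int     # assumed inclusive
    strand: int  # 1 or -1


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_emf_block(lines: List[str]) -> Tuple[List[SeqHeader], List[str]]:
    """Parse one alignment block: SEQ headers plus one gapped sequence per header."""
    headers: List[SeqHeader] = []
    columns: List[str] = []
    in_data = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("SEQ "):
            f = line.split()
            headers.append(SeqHeader(f[1], f[2], int(f[3]), int(f[4]), int(f[5])))
        elif line == "DATA":
            in_data = True
        elif line == "//":
            break  # assumed end-of-block marker
        elif in_data and line:
            # Keep only the first token in case extra columns follow the bases.
            columns.append(line.split()[0])
    # Transpose: row i of the result is the gapped sequence for SEQ line i.
    seqs = ["".join(col[i] for col in columns) for i in range(len(headers))]
    return headers, seqs


def forward_strand_sequence(header: SeqHeader, aligned: str) -> str:
    """Strip gaps and orient the sequence to the forward strand of the genome."""
    ungapped = aligned.replace("-", "")
    return reverse_complement(ungapped) if header.strand == -1 else ungapped
```

With a genome string `chrom` loaded for the matching region, the slice to compare against would be `chrom[header.start - 1:header.end]`, and the ungapped length should equal `header.end - header.start + 1` if these assumptions hold.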

Assembly genome alignment sequence ensembl • 931 views

Can you please give us some examples of regions where the sequences do not match?
