Are reference sequences from GENCODE, Ensembl and RefSeq the same if they are all called GRCh38.p10, so that only the annotation varies?
6.2 years ago

I am looking to compare reference sequences and annotations (.gtf/.gff) to find the best fit for my analysis with regard to mapping. Aligners ask for a reference sequence in .fasta format with an annotation in .gtf or .gff format. Am I correct in assuming that it doesn't matter which database you download the .fasta file from, because they are all the same (provided they are all, e.g., GRCh38.p10), and it is just the annotations that are produced differently? Or does each database also have its own reference sequences?

I think they contain the same information but are formatted so they match the .gtf files. Am I right? Help!

RNA-Seq reference genome annotations GRCh38 • 1.5k views
6.2 years ago
GenoMax 141k

While the reference sequence is identical, the sequence identifiers may be different (e.g. chr1 for UCSC instead of 1 for Ensembl). This can mess things up if you mix and match sequence/annotation from UCSC/Ensembl, because the names in the .fasta no longer match the names in the .gtf.
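
A quick way to catch the chr1 versus 1 kind of mismatch before running an aligner is to compare the sequence names in the .fasta headers with the first column of the .gtf. The snippet below is only a minimal sketch: the function names (`fasta_seq_names`, `gtf_seq_names`, `compare_names`) are made up for illustration, and the in-memory records are toy data, not real annotation.

```python
import io


def fasta_seq_names(handle):
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_seq_names(handle):
    """Collect sequence names from column 1 of a GTF/GFF file, skipping comments and blank lines."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_names, gtf_names):
    """Report names found in both files and names the annotation uses that the FASTA lacks."""
    return {
        "shared": fasta_names & gtf_names,
        "gtf_only": gtf_names - fasta_names,
    }


# Toy data: UCSC-style names in the FASTA, Ensembl-style names in the GTF.
toy_fasta = io.StringIO(">chr1 toy description\nACGT\n>chr2\nGGCC\n")
toy_gtf = io.StringIO(
    "#!toy header comment\n"
    '1\ttoy_source\tgene\t100\t200\t.\t+\t.\tgene_id "toy_gene_1";\n'
)

report = compare_names(fasta_seq_names(toy_fasta), gtf_seq_names(toy_gtf))
print("shared:", sorted(report["shared"]))
print("only in GTF:", sorted(report["gtf_only"]))
```

For real files you would pass `open(path)` (or `gzip.open(path, "rt")` for compressed downloads) instead of the `io.StringIO` objects. If nothing is shared and everything lands in the GTF-only list, the sequence and annotation most likely come from sources with different naming conventions.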

.p10 is the patch version. Patches never change chromosomal coordinates, so keep that in mind.
