Question: What is the difference between a reference genome and an annotation?
Y Tb200 wrote:

I am very new to the bioinformatics field, and I want to know the difference between a reference genome file and an annotation file. Also, what is the best website to download these files from for human?

rna-seq next-gen • 4.2k views
modified 5.6 years ago by Xingyu Yang260 • written 5.6 years ago by Y Tb200
Xingyu Yang260 wrote:

A reference genome file is a description of the genome sequence itself. An annotation file is a description of where genetic elements (introns, exons) are located in the genome, given as start and end coordinates. Reference genome files are mostly in .fasta format, and annotation files are mostly in .gff or .bed format. Another format, .genbank, sometimes contains both the reference sequence and the annotation information. Search for each format for details.
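To make the distinction concrete, here is a minimal Python sketch of how the two files fit together. The tiny sequence, the name `chr1`, the exon coordinates, and the helper names `read_fasta` and `read_gff` are all made up for illustration. It assumes GFF coordinates, which are 1-based with an inclusive end (BED, by contrast, uses 0-based starts with an exclusive end).

```python
import io

# Toy reference genome in FASTA format (made-up sequence, for illustration only).
FASTA_TEXT = """>chr1
ACGTACGTAC
GGGCCCTTTA
"""

# Toy annotation in GFF format, 9 tab-separated columns:
# seqid, source, type, start, end, score, strand, phase, attributes
GFF_TEXT = "chr1\texample\texon\t3\t8\t.\t+\t.\tID=exon1\n"


def read_fasta(handle):
    """Return {sequence_name: sequence} from a FASTA file handle."""
    sequences = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence name is the first word after '>'.
            name = line[1:].split()[0]
            sequences[name] = []
        else:
            sequences[name].append(line)
    return {key: "".join(parts) for key, parts in sequences.items()}


def read_gff(handle):
    """Yield (seqid, feature_type, start, end, strand) for each GFF record."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        cols = line.rstrip("\n").split("\t")
        yield cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6]


genome = read_fasta(io.StringIO(FASTA_TEXT))
features = list(read_gff(io.StringIO(GFF_TEXT)))

for seqid, ftype, start, end, strand in features:
    # GFF is 1-based and end-inclusive; Python slices are 0-based and end-exclusive.
    subsequence = genome[seqid][start - 1:end]
    print(seqid, ftype, start, end, strand, subsequence)
# prints: chr1 exon 3 8 + GTACGT
```

The reference file only knows the letters; the annotation file only knows the coordinates. You need both to say "this exon has this sequence".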

For human, the best place to download these files is http://genome.ucsc.edu/. You can also download them from NCBI.

written 5.6 years ago by Xingyu Yang260

Thanks, Xingyu Yang. So what is the difference between the GTF and GFF annotation formats?

written 5.6 years ago by Y Tb200

They are pretty similar. GTF is a refinement of version 2 of GFF (the most recent version of GFF is GFF3).
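In practice, the most visible difference is the ninth (attributes) column: GTF writes `key "value";` pairs, while GFF3 writes `key=value` pairs separated by semicolons. A rough Python sketch of parsing each style follows. The function names and the identifiers `GENE1` and `TX1` are hypothetical, and the parsers are deliberately naive: they do not handle semicolons inside quoted values or the URL-style escaping that GFF3 allows.

```python
def parse_gtf_attributes(column9):
    """Parse GTF-style attributes, e.g. gene_id "GENE1"; transcript_id "TX1";"""
    attributes = {}
    for field in column9.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        # Key and value are separated by the first space; the value is quoted.
        key, _, value = field.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def parse_gff3_attributes(column9):
    """Parse GFF3-style attributes, e.g. ID=TX1;Parent=GENE1"""
    attributes = {}
    for field in column9.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        # Key and value are separated by the first '='.
        key, _, value = field.partition("=")
        attributes[key] = value
    return attributes


# Made-up example values for the ninth column of each format.
gtf_col9 = 'gene_id "GENE1"; transcript_id "TX1";'
gff3_col9 = "ID=TX1;Parent=GENE1"

print(parse_gtf_attributes(gtf_col9))
print(parse_gff3_attributes(gff3_col9))
```

The first eight columns (sequence name, source, feature type, start, end, score, strand, phase/frame) have the same layout in both formats, which is why many tools can read either.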

written 5.6 years ago by Xingyu Yang260

Thanks again, Xingyu Yang. Could you please send me a direct link to download the human annotation file? And what about the annotation file from the Ensembl website?

written 5.6 years ago by Y Tb200

If you want a direct link, I would recommend downloading it here: http://cufflinks.cbcb.umd.edu/igenomes.html

written 5.6 years ago by Xingyu Yang260

I followed the link and found this entry:

  Ensembl GRCh37 17297 MB May 14 17:23

So is the human annotation file about 17.3 GB?

written 5.6 years ago by Y Tb200

That download includes everything: annotations in different formats, ncRNA annotation, the reference sequence, and indexed reference sequences.

If you just want the annotation file, you can find it on the NCBI FTP site: ftp://ftp.ncbi.nih.gov/genomes/Homo_sapiens. The annotation files are in the GFF folder, and they include ncRNA.
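Once you have a GFF file, one quick way to see what it contains (including any ncRNA entries) is to tally the feature types in the third column. The sketch below runs on a made-up in-memory example; `count_feature_types` is a hypothetical helper, and the feature type names in a real file may differ from the ones shown here.

```python
import io
from collections import Counter


def count_feature_types(handle):
    """Count how often each feature type (GFF column 3) occurs."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts


# Made-up toy annotation, for illustration only.
TOY_GFF = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t1\t100\t.\t+\t.\tID=gene1\n"
    "chr1\texample\texon\t1\t50\t.\t+\t.\tParent=gene1\n"
    "chr1\texample\texon\t60\t100\t.\t+\t.\tParent=gene1\n"
    "chr1\texample\tncRNA\t200\t300\t.\t-\t.\tID=rna1\n"
)

counts = count_feature_types(io.StringIO(TOY_GFF))
for feature_type, number in counts.most_common():
    print(feature_type, number)
```

For a downloaded, gzip-compressed file you could pass `gzip.open(path, "rt")` instead of the `io.StringIO` object, where `path` is wherever you saved the file.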

written 5.6 years ago by Xingyu Yang260

I visited this link, but it confused me because there are many files there. Which of them is the annotation file for human?

written 5.6 years ago by Y Tb200