Question: Where are the mammoth's ORFs?
cdsouthan (3.5 years ago) wrote:

Not sure if anyone from the Swedish Museum of Natural History is on this forum, but does anyone know of any plans to process the BAM files from http://www.ebi.ac.uk/ena/data/view/ERP008929 into something we can actually use for looking at protein evolution? Might Ensembl eventually pick up the data for their pipeline (Emily_Ensembl?), and/or the NCBI? This is not the first time journal editors have allowed a new genome paper without the genome in question being in any usable form for biologists:

"Complete Genomes Reveal Signatures of Demographic and Genetic Declines in the Woolly Mammoth"

http://www.citeulike.org/user/cdsouthan/article/13590852

 

 


Well, the fastq files are right there, so there's nothing stopping you from doing a quick assembly and ORF prediction. Alternatively, you could convert the bam files to fasta (I assume it's the assembly) and then predict the ORFs.
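
As a rough illustration of the ORF-prediction step, here is a minimal sketch in plain Python of a naive six-frame ORF scan over a nucleotide sequence. The function name `find_orfs`, the `Orf` record and the `min_codons` threshold are illustrative choices, not anything taken from the paper or from a specific tool. Because the scan ignores introns, on a mammalian genome it will mostly recover exon fragments, so treat it as a sanity check rather than a gene build.

```python
from typing import Iterator, NamedTuple

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


class Orf(NamedTuple):
    strand: str  # "+" or "-" relative to the input sequence
    frame: int   # 0, 1 or 2 on the scanned strand
    start: int   # 0-based offset of the ATG on the scanned strand
    end: int     # exclusive end, stop codon included
    nt: str      # nucleotide sequence from ATG through the stop codon


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> Iterator[Orf]:
    """Yield ATG-to-stop ORFs in all six frames (no intron handling)."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    # Length is counted in codons, excluding the stop.
                    if (i - start) // 3 >= min_codons:
                        yield Orf(strand, frame, start, i + 3, s[start:i + 3])
                    start = None
```

The default `min_codons=100` is an arbitrary cut-off to suppress short spurious ORFs; with ancient DNA, damage-induced substitutions can create or destroy stop codons, so any hits would still need checking against a related species.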

Reply written 3.5 years ago by 5heikki

Sure, there are many on here who could do this like rolling off a log (but also get high ORF error rates), but I'm unfortunately not one of them (can you tell if they are decent assemblies?). The point is that the substantial scientific value of the whole exercise is lost until it gets a full gene build that Ensembl Compara can crunch.

Reply written 3.5 years ago by cdsouthan

Well, they probably had their reasons for not submitting the assemblies and predicted proteins as such. Now the assembly is (probably) nicely hidden from the majority of biologists in the bam file. I haven't checked the paper, but if they so much as mentioned a mammoth protein in it, then obviously their ORF predictions should have been submitted too.

Reply written 3.5 years ago by 5heikki

That'll be the next paper then... Genome Res?

Spoke too soon: they have cranked the second one out already

http://biorxiv.org/content/early/2015/04/23/018366.article-info

but still sans ORFs.

Reply written 3.5 years ago by cdsouthan

Emily_Ensembl (EMBL-EBI, 3.5 years ago) wrote:

Unfortunately, this genome is not a proper assembly, just a read library, which is not suitable for annotation. Even if it were suitable, we still could not annotate it, because there is no suitable gene set to annotate with. Being extinct, mammoths have no active transcription, so there is no mammoth cDNA or protein set, and the elephant gene set is already a low-quality gene set that was projected from other species, so it would not be good enough for this analysis.


Thanks, that's interesting; I had overlooked your point about the lack of transcript support for ancient genomes. Notwithstanding, it would be useful to get at least the more solid ORFs into TrEMBL somehow, but I guess this is predicated on a better elephant assembly (hardly sample-limited, one would have thought...).

Reply written 3.5 years ago by cdsouthan

RL Rogers (3.5 years ago) wrote:

You can tentatively use the elephant GTF file of annotations, with the major caveat that genes and gene boundaries may have changed. The mammoth assembly is mapped onto Loxodonta and will miss DNA specific to mammoths. If you want to find elephant genes present in mammoth, this is relatively easy. If you want to find mammoth genes missing in elephant, this will be much less tractable.
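
The "relatively easy" direction can be sketched roughly as follows: take the CDS intervals from the elephant GTF and cut the corresponding bases out of a mammoth consensus sequence that shares Loxodonta coordinates. This is a minimal sketch under stated assumptions: the consensus is already loaded as a dict mapping chromosome or scaffold name to sequence, the GTF carries `transcript_id` attributes on its CDS lines, and `parse_gtf_cds`, `cds_sequence` and `project_cds` are hypothetical helper names, not part of any library.

```python
import re
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def parse_gtf_cds(gtf_lines):
    """Return {transcript_id: (chrom, strand, sorted [(start, end), ...])}.

    Only CDS features are kept; coordinates stay 1-based inclusive as in GTF.
    """
    chrom_strand = {}
    intervals = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        chrom_strand[tid] = (fields[0], fields[6])
        intervals[tid].append((int(fields[3]), int(fields[4])))
    return {tid: (*chrom_strand[tid], sorted(intervals[tid])) for tid in intervals}


def cds_sequence(genome, chrom, strand, intervals):
    """Concatenate CDS intervals and reverse-complement minus-strand genes."""
    chrom_seq = genome[chrom]
    nt = "".join(chrom_seq[start - 1:end] for start, end in intervals)
    if strand == "-":
        nt = nt.translate(COMPLEMENT)[::-1]
    return nt


def project_cds(genome, gtf_lines):
    """Map each annotated transcript to its CDS cut from the given genome."""
    return {
        tid: cds_sequence(genome, chrom, strand, ivs)
        for tid, (chrom, strand, ivs) in parse_gtf_cds(gtf_lines).items()
        if chrom in genome  # skip transcripts on sequences we do not have
    }
```

Any sequence recovered this way inherits the elephant gene boundaries, so the caveat above still applies: a shifted splice site or an internal stop codon in the output may be real change, DNA damage or a mapping artefact, and mammoth-specific genes cannot appear at all.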
