Forum: Ensembl file formatting tool: tell us what you need
4.7 years ago, Emily_Ensembl wrote:

We are looking for feedback on a new Ensembl tool being developed to help researchers download the reference files they need in the right format directly from Ensembl.

We understand that different tools need slightly different formatting, and that sometimes identifiers need to be remapped to make datasets match. An example of that would be Ensembl chromosome names (1, 2, 3...) and UCSC chromosome names (chr1, chr2, chr3...). Similarly, N padding in a chromosome is fine for some analyses, but for others it might cause issues.
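To make the chromosome naming example concrete, here is a minimal Python sketch of the kind of remapping people often do by hand on a FASTA file. It is only an illustration, not how the planned tool works: the function names are made up for this post, the mapping assumes human primary chromosomes (1-22, X, Y, and MT to chrM), and anything else, such as scaffolds or patches, is left unchanged because those names do not follow a simple prefix rule.

```python
# Assumption: human primary chromosomes only. Scaffold, patch and haplotype
# names are passed through untouched, since they need a lookup table rather
# than a simple "chr" prefix.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def ensembl_to_ucsc(name):
    """Return the UCSC-style name for a primary chromosome, else the name unchanged."""
    if name == "MT":
        return "chrM"
    if name in PRIMARY:
        return "chr" + name
    return name


def rename_fasta_headers(lines):
    """Yield FASTA lines with the sequence name in each header line remapped."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            # The sequence name is the first whitespace-delimited token;
            # the rest of the header (description) is kept as it is.
            name, sep, rest = header.partition(" ")
            yield ">" + ensembl_to_ucsc(name) + sep + rest + "\n"
        else:
            yield line
```

For example, passing an open file handle for an Ensembl FASTA to rename_fasta_headers and writing the yielded lines to a new file would give UCSC-style headers for the primary chromosomes. Details like which names you remap, and what you do with the non-primary sequences, are exactly what we would like to hear about.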

So we're creating a tool that can help give you the datasets you need, in the format you need, so you can spend less time preparing the reference sets and get down to running your analysis. For example, NCBI has a number of premade datasets, with different combinations of regions, and with prepared indexes for common tools:

As the first step in this project, we want to hear from you about what filtering and transformations you apply to our datasets to make them useful for your analysis, or what changes to our datasets would make it easier and faster for you to run your analysis. We're interested in everything from identifier types and extra attributes needed, to which combinations of regions you want in a reference set (patches, haplotypes, scaffolds, etc.), to masking and filtering of regions.

Once we have a list of how our users use our data, and what programs they're trying to use it with, we can start this initiative to make our datasets and tools better adapted to your needs. We also hope we'll be able to follow up with anyone replying in case we need some clarification to better understand your needs.

Thank you, everyone. We're committed to making our reference data better fit your analysis needs.


Do you want people replying here or do you have a central email address/site that comments should be sent to?

written 4.7 years ago by Devon Ryan

Here is fine.

written 4.7 years ago by Emily_Ensembl