Question: Chromosome Names in genome are incompatible with annotations
3.7 years ago by serpalma.v wrote:

Dear community,

While creating an index for the bovine genome with STAR, the process fails because the chromosome names in the annotation file (Bos_taurus.UMD3.1.87.gtf) do not match those in the reference file (UMD3.1_chromosomes.fa). For example, chromosome 10 is named "10" in the annotation but "gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1" in the reference; both should be "10".

Apparently, the solution is to change the names in the reference file. Could you suggest a tool that does this for me or a "one liner" that can transform the names into the chromosome number?

Also, would renaming the chromosomes affect the downstream processing of my results?

I have searched other threads and could not find a better answer than the one given in "Renaming Entries In A Fasta File", but that solution renames the chromosomes in the reference file based on the order in which they appear.


sed 's/gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1/10/'
Comment written 3.7 years ago by Pierre Lindenbaum
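
The sed substitution above covers only chromosome 10. As a minimal sketch of a more general approach, assuming every header in UMD3.1_chromosomes.fa follows the same "gnl|UMD3.1|<accession> Chromosome <name> <accession>" pattern as the example in the question (this is not verified here, and mitochondrial or unplaced sequences may be named differently), one sed expression can keep only the token that follows "Chromosome". The function name rename_chrom_headers is hypothetical:

```shell
# Hypothetical helper: reads FASTA on stdin, writes FASTA on stdout.
# A header such as ">gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1"
# is rewritten to ">10".
# Assumption: the chromosome name is the single token after "Chromosome ".
# Lines that do not match (sequence lines, headers in another format)
# pass through unchanged.
rename_chrom_headers() {
  sed -E 's/^>.*Chromosome ([^ ]+).*$/>\1/'
}
```

It could be run as `rename_chrom_headers < UMD3.1_chromosomes.fa > UMD3.1_renamed.fa` (the output filename is only an example). Listing the new headers with `grep '>' UMD3.1_renamed.fa` before building the STAR index would show whether they now match the first column of the GTF file.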