Question: Chromosome Names in genome are incompatible with annotations
7 months ago by
serpalma.v10 wrote:

Dear community,

I am creating an index for the bovine genome with STAR, but the process fails because the chromosome names in the annotation file (Bos_taurus.UMD3.1.87.gtf) do not match the ones in the reference file (UMD3.1_chromosomes.fa). For example, chromosome 10 is named "10" in the annotation but "gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1" in the reference; both should be "10".

Apparently, the solution is to change the names in the reference file. Could you suggest a tool that does this for me or a "one liner" that can transform the names into the chromosome number?

And also, would this affect downstream processing of my results?

I have searched through other threads and couldn't find a better answer than the one given here: "Renaming Entries In A Fasta File". However, that answer renames the chromosomes in the reference file based on the order in which they appear.


star alignment • 225 views
sed 's/gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1/10/'
written 7 months ago by Pierre Lindenbaum
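The one-liner above handles chromosome 10 only. A possible generalization, as a rough sketch: it assumes every header in the reference follows the same pattern as the example in the question (`>gnl|UMD3.1|<accession> Chromosome <name> <accession>`) and that your sed supports `-E` (GNU and BSD sed both do). The function name `rename_chromosomes` and the output file name are made up for illustration.

```shell
# Hypothetical helper: reduce UMD3.1-style FASTA headers to the bare chromosome name.
# Assumed header layout (taken from the example in the question):
#   >gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1
# Only lines starting with ">" can match, so sequence lines pass through unchanged.
# Headers that do not contain " Chromosome <name>" are also left untouched.
rename_chromosomes() {
    sed -E 's/^>.* Chromosome ([^ ]+).*$/>\1/' "$1"
}

# Example usage (output file name is an assumption):
#   rename_chromosomes UMD3.1_chromosomes.fa > UMD3.1_chromosomes.renamed.fa
#   grep '^>' UMD3.1_chromosomes.renamed.fa
```

It is worth listing the new headers with `grep '^>'` and comparing them against the first column of the GTF before building the index, since any header that does not match the assumed pattern keeps its original name.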
