Question: Chromosome Names in genome are incompatible with annotations
Asked 7 months ago by serpalma.v (Germany):

Dear community,

While I was creating an index for the bovine genome with STAR, the process failed because the chromosome names in the annotation file (Bos_taurus.UMD3.1.87.gtf) do not match the ones in the reference file (UMD3.1_chromosomes.fa). For example, chromosome 10 is named "10" in the annotation but "gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1" in the reference; both should be "10".

Apparently, the solution is to change the names in the reference file. Could you suggest a tool that does this for me or a "one liner" that can transform the names into the chromosome number?

Also, would renaming the chromosomes affect downstream processing of my results?

I have searched through other threads and couldn't find a better answer than the one given in "Renaming Entries In A Fasta File". However, that answer renames the chromosomes in the reference file based on the order in which they appear, not on the chromosome number in the header.

Cheers!

Comment from Pierre Lindenbaum (7 months ago):

sed 's/gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1/10/'
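
The sed command above rewrites only the chromosome 10 header. One way to generalise it is sketched below, assuming every header to be renamed follows the pattern quoted in the question (">... Chromosome NAME ..."). The file names toy_chromosomes.fa and toy_renamed.fa, the second header, and the sequences are made up for illustration; headers without the word "Chromosome" (for example, unplaced scaffolds, if the file contains any) would pass through unchanged and need separate handling.

```shell
# Toy FASTA: the first header is the one quoted in the question; the second
# header and both sequences are hypothetical, written in the same pattern.
printf '%s\n' \
  '>gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1' \
  'ACGTACGT' \
  '>gnl|UMD3.1|EXAMPLE.1 Chromosome X EXAMPLE_2.1' \
  'TTGACCA' > toy_chromosomes.fa

# On header lines only (those starting with ">"), keep just the token that
# follows "Chromosome ". Sequence lines do not start with ">" and are untouched.
sed -E 's/^>.*Chromosome ([^ ]+).*/>\1/' toy_chromosomes.fa > toy_renamed.fa

# List the new headers so they can be compared with the first column of the GTF.
grep '^>' toy_renamed.fa
```

For the real data, the same sed expression would be run on UMD3.1_chromosomes.fa with the output redirected to a new file, so the original reference stays intact. It is worth checking afterwards that every name listed by grep also occurs in the first column of the GTF.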