Question: How to obtain the chromosome out of an accession number?
10 months ago, eidriangm wrote:

Hello Community.

My problem is the following: I have some BED files whose genomic regions are annotated by chromosome name (chr__ start end ... ...), and I want to use the NCBI GFF3 to extract the info, but that file is annotated with accession.version numbers. Bedtools requires both files to use the same sequence nomenclature, so I need to convert the accessions to chr names.

So far I know that the number in the "NC_"-prefixed accession IDs specifies the chromosome (i.e. NC_000001.11: chr1, NC_000002.12: chr2, ..., NC_000023.11: chrX, NC_000024.10: chrY, NC_012920.1: chrM). Nevertheless, how can I know which chromosome the accessions prefixed with NW_ or NT_ belong to?
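The NC_ rule described above can be sketched in a few lines of Python. This is only an illustration for the human accessions listed in the question; the function name `nc_to_chr` is made up for this example, and it deliberately returns `None` for NT_/NW_ accessions because their chromosome cannot be read off the number.

```python
import re

# Matches human chromosome accessions such as NC_000001.11 (six digits + version).
_NC_PATTERN = re.compile(r"^NC_(\d{6})\.\d+$")


def nc_to_chr(accession):
    """Return a chr-style name for a human NC_ accession, or None if unknown.

    Hypothetical helper: it encodes only the mapping quoted in the question
    (1-22 -> chr1..chr22, 23 -> chrX, 24 -> chrY, NC_012920 -> chrM).
    """
    # The mitochondrial genome has its own accession outside the 1-24 series.
    if accession.split(".")[0] == "NC_012920":
        return "chrM"
    match = _NC_PATTERN.match(accession)
    if match is None:
        return None  # NT_, NW_ and anything else cannot be resolved this way
    number = int(match.group(1))
    if 1 <= number <= 22:
        return f"chr{number}"
    if number == 23:
        return "chrX"
    if number == 24:
        return "chrY"
    return None


if __name__ == "__main__":
    for acc in ("NC_000001.11", "NC_000023.11", "NC_012920.1", "NT_113949.1"):
        print(acc, nc_to_chr(acc))
```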

Some "NT_"/"NW_" accessions are alternative assemblies of an NC_ sequence, and the info they contain is "the same", placed a few lines below that NC_ in the file. Others are not, and they contain genes of interest that I could be losing when using bedtools, e.g. https://www.ncbi.nlm.nih.gov/gene/3806. Some do not have a known location, yet the gene is known to be on chromosome 19, and I cannot deduce that from the accession number.

Is there a way of getting the chromosome from the accession number? Or shall I extract the info from another annotation file?

Thanks


Have you tried the ways of linking chromosomes to accession numbers mentioned in this post: How to get the chromosome numbers from RefSeq accession IDs?

modified 10 months ago • written 10 months ago by Sej Modha

I saw it, but none of the links provided there work, and the answer with awk + sed only applies to NC_ accessions (already under control). Thanks anyway.

written 10 months ago by eidriangm

You may want to give some example data and the expected output.

written 10 months ago by cpad0112

Well, that is already given in the question: Entrez gene ID 3806 is annotated on the accession NT_113949, and I want to obtain its chromosome, which is 19. I could look for more examples, but the idea is basically that: from an accession number prefixed with NT_ or NW_, obtain its chromosome if it is known.

modified 10 months ago • written 10 months ago by eidriangm
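One possible route for the NT_/NW_ case, offered as a sketch rather than a confirmed answer from this thread: NCBI distributes an assembly report (a `*_assembly_report.txt` file) alongside each RefSeq genome, and it lists every sequence with its assigned molecule. The code below assumes that file's tab-separated layout, with a commented header line naming the columns `Assigned-Molecule` and `RefSeq-Accn`, and with `na` marking missing values. The function name `build_accession_map` and the file path in the usage comment are hypothetical.

```python
def build_accession_map(lines):
    """Map RefSeq accession.version -> chr-style name from assembly report lines.

    Assumption: the column header is a '#'-prefixed, tab-separated line that
    contains 'Assigned-Molecule' and 'RefSeq-Accn', and 'na' means "not available".
    Sequences without an assigned molecule (unplaced scaffolds) are skipped.
    """
    header = None
    mapping = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("# ").split("\t")
            # Only the column header comment carries both column names.
            if "RefSeq-Accn" in fields and "Assigned-Molecule" in fields:
                header = fields
            continue
        if header is None:
            raise ValueError("no column header line found before data rows")
        row = dict(zip(header, line.split("\t")))
        accession = row.get("RefSeq-Accn", "na")
        molecule = row.get("Assigned-Molecule", "na")
        if accession == "na" or molecule == "na":
            continue
        # The report labels the mitochondrion 'MT'; the BED files here use chrM.
        mapping[accession] = "chrM" if molecule == "MT" else f"chr{molecule}"
    return mapping


# Usage (hypothetical path; the report must match the assembly of the GFF3):
# with open("assembly_report.txt") as handle:
#     acc_to_chr = build_accession_map(handle)
# print(acc_to_chr.get("NT_113949.1"))
```

Note that this only recovers the chromosome label. Coordinates in the GFF3 for an NT_ or NW_ record are relative to that scaffold, not to the chromosome, so renaming the sequence alone would not make those features overlap correctly with chr-based BED intervals in bedtools.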