Question: Find Corresponding GenBank ID
13 months ago by
Penny Liu30
Taiwan
Penny Liu30 wrote:

This is what I have.

Is it possible to get the corresponding GenBank ID from a BioSample or BioProject accession? I would appreciate any suggestions or comments on how to solve this.

Thanks

written 13 months ago by Penny Liu30

Are you referring to NZ* accession numbers or WP* protein IDs, as described on this summary annotation page?

written 13 months ago by genomax65k

Thanks, but it's not what I expected. For example, https://www.ncbi.nlm.nih.gov/protein/AAC76128.2

BioProject: PRJNA225, BioSample: SAMN02604091. Both of these should refer to the GenBank accession AAC76128.
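
The identifiers in this example belong to different NCBI record types, which is easy to mix up. As a rough illustration only, the sketch below labels an accession by its prefix. The function name `accession_type` and the regular expressions are assumptions made for this example (they cover only the common prefixes mentioned in this thread, not every accession format NCBI uses).

```r
# Hypothetical helper (not an NCBI tool): guess the record type of an
# accession from its prefix. Only the prefixes discussed in this thread
# are covered; anything else is reported as "unknown".
accession_type <- function(acc) {
  patterns <- c(
    "BioProject"        = "^PRJ(NA|EB|DB)[0-9]+$",
    "BioSample"         = "^SAM(N|EA|D)[0-9]+$",
    "GenBank assembly"  = "^GCA_[0-9]{9}(\\.[0-9]+)?$",
    "RefSeq assembly"   = "^GCF_[0-9]{9}(\\.[0-9]+)?$",
    "RefSeq nucleotide" = "^NZ_[A-Z0-9]+(\\.[0-9]+)?$",
    "RefSeq protein"    = "^WP_[0-9]+(\\.[0-9]+)?$",
    "GenBank protein"   = "^[A-Z]{3}[0-9]{5,7}(\\.[0-9]+)?$"
  )
  # Test the accession against every pattern and keep the first match
  hit <- names(patterns)[vapply(patterns, grepl, logical(1), x = acc)]
  if (length(hit) == 0) "unknown" else hit[1]
}

# Identifiers quoted in this thread
accession_type("PRJNA225")
accession_type("SAMN02604091")
accession_type("AAC76128.2")
accession_type("GCA_002318995.1")
```

A BioProject or BioSample describes a whole submission, while a protein accession such as AAC76128.2 is a single record inside it, so the mapping between them is one-to-many.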

modified 13 months ago • written 13 months ago by Penny Liu30

I am not sure I understand your intent. Do you want to associate a BioSample (which is a whole-genome sequence) with a specific protein?

modified 13 months ago • written 13 months ago by genomax65k

I was assigned BioProject and BioSample accession numbers when I made a WGS submission. Will I get the GenBank ID when the annotation process is finished? Perhaps I misunderstood. Thanks again!

written 13 months ago by Penny Liu30

Referring back to the original example, isn't GCA_002318995.1 the accession that has been assigned to your genome? It looks like yours is one of the 8 assemblies available for this organism.

written 13 months ago by genomax65k

In my submission, the strain is KCT instead of YHL.

written 13 months ago by Penny Liu30

Classification is at the taxid level, so it looks like all strains of the species are collected under the main taxid (Shewanella algae (taxid:38313)).

modified 13 months ago • written 13 months ago by genomax65k

As far as I know, Shewanella haliotis and Shewanella algae belong to the same genus but are different species.

modified 13 months ago • written 13 months ago by Penny Liu30

I did this as part of one of my projects. It may be helpful:

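library("genomes")
# Load NCBI's prokaryote genome report as a data frame
proks <- reports("prokaryotes.txt")

# BioProject Accession: PRJNA312015 (match the numeric ID exactly, not as a substring)
biopro <- proks[which(proks$`BioProject ID` == "312015"), ]
ftp_biopro <- biopro$`FTP Path`[1]

# Build the URL: <FTP path>/<assembly directory name>_feature_table.txt.gz
file_type <- "feature_table.txt"
GCA <- tail(unlist(strsplit(ftp_biopro, "/")), 1)
file_type <- paste0(GCA, "_", file_type, ".gz")
down_file <- paste(ftp_biopro, file_type, sep = "/")
dest <- file_type
download.file(url = down_file, destfile = dest, mode = "wb")
# The header line starts with "#" and product names may contain quotes,
# so disable comment and quote handling when reading the table
my_data <- read.delim(dest, header = TRUE, comment.char = "", quote = "")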
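
The URL-building step in the code above could be wrapped in a small function so that other files from the same assembly directory can be fetched the same way. This is only a sketch: the function name `assembly_file_url` is made up for this example, and the path used below is a placeholder, not a real assembly.

```r
# Hypothetical helper: build the URL of one file inside an NCBI assembly
# directory. Files there are named "<directory name>_<suffix>".
assembly_file_url <- function(ftp_path, suffix = "feature_table.txt.gz") {
  ftp_path <- sub("/+$", "", ftp_path)   # drop any trailing slash
  assembly_dir <- basename(ftp_path)     # last path component
  paste0(ftp_path, "/", assembly_dir, "_", suffix)
}

# Placeholder path for illustration only (assumption, not a real record)
example_path <- "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/000/000/GCA_000000000.1_EXAMPLEv1"

assembly_file_url(example_path)
assembly_file_url(example_path, suffix = "protein.faa.gz")
```

With a real `FTP Path` value from the report, the second call would point at the protein FASTA file, whose headers carry the protein accessions for that assembly.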
written 13 months ago by Tanvir Ahamed 270