Question: Getting a gene sequence from NCBI with BioJava
Bioaln (France) wrote, 4.9 years ago:

Hello. I've been working on sequence parsing lately and I can't seem to download a gene sequence from NCBI. My earlier code returns the gene name (for example, TGFB1). What I am trying to accomplish is to use Java code to fetch the gene sequence for that name. I've been trying the geneRICH class in BioJava, but it only seems to accept an accession number or a GenBank ID, not a gene name.

 

Thanks for any help.

identifiers biojava • 2.0k views
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 4.9 years ago:

You need to find the sequences associated with this gene (e.g. RefSeq sequences) using the NCBI E-utilities. For an example, see the thread "Get Fasta File With Protein Sequences Given Entrez Gene Ids".
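As a rough sketch of that lookup chain (gene symbol, then Entrez Gene ID, then linked RefSeq records, then sequence), the Java below only builds the E-utilities URLs and pulls IDs out of an esearch response; it makes no network calls itself. The class and method names are hypothetical, and the search term format, the "gene_nuccore_refseqrna" link name, and the example Gene ID are my assumptions about what fits this use case, not something given in the question.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds NCBI E-utilities URLs for gene symbol -> sequence.
class GeneToSequenceUrls {
    static final String EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/";

    // Step 1: esearch in the "gene" database for a symbol in one organism.
    static String esearchGeneUrl(String symbol, String organism) {
        String term = symbol + "[Gene Name] AND " + organism + "[Organism]";
        return EUTILS + "esearch.fcgi?db=gene&term="
                + URLEncoder.encode(term, StandardCharsets.UTF_8);
    }

    // Step 2: elink from a Gene ID to linked RefSeq RNA records in nuccore.
    // The link name is an assumption; other gene-to-nuccore links exist.
    static String elinkGeneToRefSeqUrl(String geneId) {
        return EUTILS + "elink.fcgi?dbfrom=gene&db=nuccore"
                + "&linkname=gene_nuccore_refseqrna&id=" + geneId;
    }

    // Step 3: efetch the linked records as FASTA wrapped in XML (a TSeqSet).
    static String efetchFastaXmlUrl(List<String> nuccoreIds) {
        return EUTILS + "efetch.fcgi?db=nuccore&id=" + String.join(",", nuccoreIds)
                + "&rettype=fasta&retmode=xml";
    }

    // Pulls every numeric <Id>...</Id> value out of an esearch XML response.
    // A regex is enough for a sketch; a real client should use an XML parser.
    static List<String> extractIds(String esearchXml) {
        List<String> ids = new ArrayList<>();
        Matcher m = Pattern.compile("<Id>(\\d+)</Id>").matcher(esearchXml);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(esearchGeneUrl("TGFB1", "Homo sapiens"));
        // "7040" stands in for whatever Gene ID step 1 returned.
        System.out.println(elinkGeneToRefSeqUrl("7040"));
    }
}
```

Each URL can be opened with any HTTP client; the IDs returned by one step feed the next.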

 

Furthermore, BioJava is not really needed to fetch the sequence. You can use xjc (the JAXB binding compiler) to generate Java classes from the NCBI TSeq DTD:

xjc -dtd "http://www.ncbi.nlm.nih.gov/dtd/NCBI_TSeq.dtd"
parsing a schema...
compiling a schema...
generated/ObjectFactory.java
generated/TSeq.java
generated/TSeqSeqtype.java
generated/TSeqSet.java

 

and use those classes to parse the XML returned by an NCBI E-utilities efetch URL.

see http://plindenbaum.blogspot.fr/2006/12/java-16-mustang-jaxb-and.html (old!)
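For readers who just want to see the shape of the data, here is a minimal sketch that reads the same TSeqSet XML with the JDK's built-in DOM parser instead of the xjc-generated JAXB classes (a deliberate substitution, so the example has no extra dependencies). It assumes the efetch call used rettype=fasta and retmode=xml, and that each TSeq element carries TSeq_accver and TSeq_sequence children. The class name and the sample record in main are made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader for TSeqSet XML (efetch with rettype=fasta&retmode=xml).
class TSeqSetReader {

    // Returns accession.version -> sequence, in document order.
    static Map<String, String> readSequences(InputSource in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Skip downloading the DTD named in the DOCTYPE. This feature is
        // understood by the JDK's default parser; other parsers may reject it.
        factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(in);

        Map<String, String> sequences = new LinkedHashMap<>();
        NodeList records = doc.getElementsByTagName("TSeq");
        for (int i = 0; i < records.getLength(); i++) {
            Element record = (Element) records.item(i);
            sequences.put(childText(record, "TSeq_accver"),
                          childText(record, "TSeq_sequence"));
        }
        return sequences;
    }

    // Text of the first descendant with the given tag, or "" if it is absent.
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // With an argument, parse that efetch URL; otherwise parse a tiny
        // made-up record so the sketch runs offline.
        InputSource source = args.length > 0
                ? new InputSource(args[0])
                : new InputSource(new StringReader(
                        "<TSeqSet><TSeq><TSeq_accver>XX_000001.1</TSeq_accver>"
                        + "<TSeq_sequence>ACGT</TSeq_sequence></TSeq></TSeqSet>"));
        readSequences(source).forEach((acc, seq) ->
                System.out.println(">" + acc + "\n" + seq));
    }
}
```

The JAXB route in the answer gives typed TSeq objects instead of string lookups, which is more convenient if you need the other fields; the DOM version trades that for zero setup.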

 

 


Wow, thanks for the thorough answer. I will look into those possibilities.

(reply written 4.9 years ago by Bioaln)