I have a list of BioProject IDs and would like to retrieve the corresponding sequences. I am following these steps:
1. Using the BioProject ID, I get the GI ID with elink:
    handle = Entrez.elink(dbfrom="bioproject", db="nuccore", id=bioprojecID, linkname="bioproject_nuccore_wgsmaster")
    record = Entrez.read(handle)
    # Entrez.read returns a list of link sets, so index into each level
    GI_ID = record[0]["LinkSetDb"][0]["Link"][0]["Id"]
2. Then I try to get the sequence for GI_ID (using the Entrez efetch and SeqIO modules in Biopython):
    handle = Entrez.efetch(db="nucleotide", id=GI_ID, rettype="gb", retmode="text")
    record = SeqIO.read(handle, "genbank")
But when I print the record, its sequence is reported as unknown.
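Put together, the two steps above might look like the following sketch. The names `extract_link_ids` and `fetch_records` and the `email` argument are placeholders introduced here for illustration and are not part of Biopython. The sketch assumes the elink result may contain several linked IDs, so it uses `SeqIO.parse` rather than `SeqIO.read`.

```python
def extract_link_ids(linksets):
    """Collect every linked ID from the list that Entrez.read() returns for elink."""
    ids = []
    for linkset in linksets:
        for linksetdb in linkset.get("LinkSetDb", []):
            for link in linksetdb.get("Link", []):
                ids.append(str(link["Id"]))
    return ids


def fetch_records(bioproject_id, email):
    """Step 1 (elink) followed by step 2 (efetch) for one BioProject ID."""
    # Imported here so extract_link_ids can be used without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address

    # Step 1: BioProject ID -> linked nuccore IDs
    handle = Entrez.elink(
        dbfrom="bioproject",
        db="nuccore",
        id=bioproject_id,
        linkname="bioproject_nuccore_wgsmaster",
    )
    linksets = Entrez.read(handle)
    handle.close()

    ids = extract_link_ids(linksets)
    if not ids:
        return []

    # Step 2: fetch the GenBank records for all linked IDs in one request
    handle = Entrez.efetch(db="nuccore", id=",".join(ids), rettype="gb", retmode="text")
    records = list(SeqIO.parse(handle, "genbank"))
    handle.close()
    return records
```

Calling `fetch_records(bioprojecID, "you@example.org")` (with a real contact address) would return a list of `SeqRecord` objects, or an empty list if no links are found.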
Can anyone advise whether this is the right approach, or whether there is a better way to obtain the sequences associated with BioProject IDs?
Thanks in advance!