Unzipping downloaded output files: FASTA files in Python
7 weeks ago
Debut ▴ 10

I have a Python script that downloads the sequences of a species whose name the user types at an input prompt. The sequences are saved directly into a folder named after the species, but in gzipped format (_genomic.fna.gz or _protein.faa.gz). I would like to add a function that unzips the output files directly, because there are a lot of them (for Klebsiella, for example, almost 14,000 sequences). If someone could help me add this, please. Here is my code:

    import os
    import posixpath
    import urllib.parse

    # `data` is loaded earlier in the script (not shown in this post); it is
    # presumably a pandas DataFrame with the columns "#Organism/Name",
    # "Status" and "FTP Path".

    species = input("Bacteria species? : ")
    TypeSeq = input("fna or faa? : ")

    # Rows whose organism name contains the requested species (case-insensitive)
    matches = data["#Organism/Name"].str.contains(species, case=False, na=False)

    if matches.any():
        print(data.loc[matches, "Status"].value_counts())
        FTP_list = data.loc[matches, "FTP Path"].values

        if TypeSeq == "faa":
            # One output folder per species
            os.makedirs(species, exist_ok=True)

            for url in FTP_list:
                try:
                    parts = urllib.parse.urlparse(url)
                    prefix = posixpath.basename(parts.path)
                    suffix = "_protein.faa.gz"
                    print(prefix + suffix)

                    # URL of the gzipped protein FASTA for this assembly
                    path = posixpath.join(parts.path, prefix + suffix)
                    ret = parts._replace(path=path)
                except Exception as err:
                    print(f"Skipping {url}: {err}")
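
One possible way to get unzipped output is to decompress the .gz files once they are in the species folder. The sketch below uses only the standard library (gzip, shutil, pathlib). The function names `gunzip_file` and `gunzip_folder` are hypothetical helpers, not part of the script above, and the sketch assumes the downloaded files keep their `.gz` extension and sit directly inside the folder.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path, remove_original=False):
    """Decompress one .gz file next to the original and return the new path."""
    gz_path = Path(gz_path)
    # with_suffix("") drops only the trailing ".gz",
    # e.g. "x_protein.faa.gz" becomes "x_protein.faa"
    out_path = gz_path.with_suffix("")
    # Copy in chunks so a large genome is never held fully in memory
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if remove_original:
        gz_path.unlink()
    return out_path


def gunzip_folder(folder, remove_original=False):
    """Decompress every *.gz file found directly inside `folder`."""
    gz_files = sorted(Path(folder).glob("*.gz"))
    return [gunzip_file(p, remove_original) for p in gz_files]
```

With these helpers, calling `gunzip_folder(species, remove_original=True)` after the download loop would leave only the plain .fna or .faa files in the species folder. Keeping `remove_original=False` is the safer default while testing, since the compressed copies stay in place.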

fasta python • 147 views

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` is rendered as code), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines containing a single command, break them into multiple lines with proper line continuations so they are easier to read and still run when copy-pasted. I've done it for you this time.


Okay, thank you very much.