6.9 years ago
elisheva
Hello everyone, I have a problem with my Python code. The goal is to write a text file with the names of the organisms that have a cdna file. My input is a file that looks like this:
./anas_platyrhynchos/pep:
-rwxrwxr-x 1 ftp ftp 10561439 May 07 11:02 Anas_platyrhynchos.BGI_duck_1.0.pep.abinitio.fa.gz
-rwxrwxr-x 1 ftp ftp 5762174 May 07 10:34 Anas_platyrhynchos.BGI_duck_1.0.pep.all.fa.gz
-rwxrwxr-x 1 ftp ftp 140 May 09 16:17 CHECKSUMS
-rwxrwxr-x 1 ftp ftp 3065 May 07 11:02 README
./ancestral_alleles:
-rwxrwxr-x 1 ftp ftp 590904827 May 16 09:58 callithrix_jacchus_ancestor_C_jacchus3.2.1_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 848229938 May 16 09:59 chlorocebus_sabaeus_ancestor_ChlSab1.1_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 767402743 May 16 09:59 gorilla_gorilla_ancestor_gorGor3.1_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 845762805 May 16 09:58 homo_sapiens_ancestor_GRCh38_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 791833177 May 16 09:59 macaca_mulatta_ancestor_Mmul_8.0.1_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 812588425 May 16 09:59 pan_troglodytes_ancestor_CHIMP2.1.4_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 804459736 May 16 09:59 papio_anubis_ancestor_PapAnu2.0_e86.tar.gz
-rwxrwxr-x 1 ftp ftp 799305351 May 16 09:59 pongo_abelii_ancestor_PPYG2_e86.tar.gz
./anolis_carolinensis:
drwxrwxr-x 2 ftp ftp 4096 May 09 16:17 cdna
drwxrwxr-x 2 ftp ftp 4096 May 09 16:17 cds
drwxrwxr-x 2 ftp ftp 8192 May 09 16:17 dna
drwxrwxr-x 2 ftp ftp 4096 May 09 16:17 dna_index
drwxrwxr-x 2 ftp ftp 4096 May 09 16:17 ncrna
drwxrwxr-x 2 ftp ftp 4096 May 09 16:17 pep
./anolis_carolinensis/cdna:
-rwxrwxr-x 1 ftp ftp 43505585 May 07 22:42 Anolis_carolinensis.AnoCar2.0.cdna.abinitio.fa.gz
-rwxrwxr-x 1 ftp ftp 16313984 May 07 22:21 Anolis_carolinensis.AnoCar2.0.cdna.all.fa.gz
-rwxrwxr-x 1 ftp ftp 138 May 09 16:17 CHECKSUMS
-rwxrwxr-x 1 ftp ftp 3153 May 07 22:42 README
./anolis_carolinensis/cds:
-rwxrwxr-x 1 ftp ftp 10413036 May 07 22:21 Anolis_carolinensis.AnoCar2.0.cds.all.fa.gz
-rwxrwxr-x 1 ftp ftp 75 May 09 16:17 CHECKSUMS
-rwxrwxr-x 1 ftp ftp 2501 May 07 22:42 README
So for this example I have to write only anolis_carolinensis.
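To make the expected result concrete, here is a minimal sketch of the behaviour I am after. It assumes the listing always has the layout shown above, with a `./directory:` header line followed by that directory's files. The function name `organisms_with_cdna` and the shortened `sample` text are only for illustration; they are not part of my real script.

```python
def organisms_with_cdna(listing):
    """Return organism names whose directory holds a *.cdna.all.fa.gz file.

    Assumes each directory starts with a header line such as
    './anolis_carolinensis/cdna:' followed by its file entries.
    """
    organisms = []
    current_dir = None
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("./") and line.endswith(":"):
            # Header line: remember the directory, without './' and ':'.
            current_dir = line[2:-1]
        elif line.endswith("cdna.all.fa.gz") and current_dir is not None:
            # The organism is the first component of the directory path.
            organism = current_dir.split("/")[0]
            if organism not in organisms:
                organisms.append(organism)
    return organisms


# Shortened version of the listing above (illustrative only).
sample = """./anas_platyrhynchos/pep:
-rwxrwxr-x 1 ftp ftp 5762174 May 07 10:34 Anas_platyrhynchos.BGI_duck_1.0.pep.all.fa.gz
./anolis_carolinensis/cdna:
-rwxrwxr-x 1 ftp ftp 16313984 May 07 22:21 Anolis_carolinensis.AnoCar2.0.cdna.all.fa.gz
./anolis_carolinensis/cds:
-rwxrwxr-x 1 ftp ftp 10413036 May 07 22:21 Anolis_carolinensis.AnoCar2.0.cds.all.fa.gz
"""

print(organisms_with_cdna(sample))  # ['anolis_carolinensis']
```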
Here is my code so far:
def separate(s):
    """Take the text of the listing and "sift" out only the required
    information."""
    index = 0  # Initialize the index into the text.
    while True:
        if not s:  # If the text is exhausted.
            break
        current = s.find("cdna.all.fa.gz", index)  # Location of that particular string.
        if current > -1:  # If "cdna..." was indeed found.
            root = s.rfind("./", 0, current)  # Go back to find the organism.
            organ.write(s[root+2: s.find(":", root)] + "/\n")  # Write the organism into a text file.
            index = current  # Update the index to the current position.
            s = s[index:]  # Advance the text.
        else:  # If there is no more "cdna...".
            break

data = open('Content.txt', 'r')  # The file of all the data.
organ = open('ftp.txt', 'w')  # A file for the "ftp".
s = data.read()
separate(s)
data.close()
organ.close()
The problem is that it doesn't write all the organisms; it skips some of them.
I guess it's because of the index. I've tried to change it, but without success.
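A toy reproduction of what I suspect is happening, on a made-up string rather than the real listing: the text is sliced so that it starts at the last match, but the search then resumes from the old offset inside the shortened text, so any match closer than that offset is jumped over. The names `count_buggy` and `count_fixed` are hypothetical, and the second function is only one possible way to advance the index.

```python
def count_buggy(s, target):
    """Mimic the loop in my code: slice the text AND reuse the old index."""
    hits = 0
    index = 0
    while True:
        current = s.find(target, index)
        if current == -1:
            break
        hits += 1
        index = current   # offset measured in the text before slicing
        s = s[index:]     # text now starts at the match, but index is not reset
    return hits


def count_fixed(s, target):
    """Leave the text untouched and only move the index past each match."""
    hits = 0
    index = 0
    while True:
        current = s.find(target, index)
        if current == -1:
            break
        hits += 1
        index = current + len(target)
    return hits


text = "aaaaaXaXaaaaaaX"  # three occurrences of "X"
print(count_buggy(text, "X"))  # 2, the middle "X" is skipped
print(count_fixed(text, "X"))  # 3
```

In my real code the slicing would also throw away the `./organism/...:` header that comes before the match, so the `rfind("./", ...)` call could not see it any more.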
Thanks!!