Text file to Phylip format
5.8 years ago
mdsiddra ▴ 30

I am working with a text file of sequences extracted from another file. The resulting text file has this format:

Zebrafish ESLLRFGLRSDLDFRLSLNGKEDLLDTGQSLSSCGVVSGDLISVILPASSQTSSAAHQTHTDQQSSQECVDLQQDCMDQQQQQEQECVCAAAPPLLCCEAEDGLLPLALERLLDSSTCRSPSDCLMLALHLLLLETGFIPQGGAVSSGEMPIGWQAAGVFRLQYVHPLLENSLVSVVAVPMGQTLVINAVLKMETSLENSRKLLLKPDEYVTAWTGGSSGVVYRDLRRLSRLVRDQLVYPLMATARQALGLPLLFGLPVLPPELLLRLLRLLDVRSLVSLSAVCRHLNTATHDASLWRHLLHRDFRVSFPAHRDTDWRELYKQKYRQRAARRGRHWFYPPPISPLIPFPSSPALYPPGIIGDYDQMPILPRPRFHPIGPLPGMSAPV
Fugu ETVLSVGLSAETEISLSLNGSEPLEDTGQTLASCGIVSGDLIRVALIRAADAPDRDDGGGHSEQVSQEAKLPDASGASTDSDQAPGPAASCWEPMLCSETDEGQAPWSLELLYHSAQVSGPGDALVVAANLLMIETGFSPQDSQLKPAEMPAGWRCGGVYKLQYSHRLCGDSVVVMVAVSMGSALIINGLLEVNQSADSVCKLCVDPSSYVTEWPGDSAAAAFKELNKLSRVFKDQVAYPLITAARHAMALPVAFGLTALPPELLLRVFRLLDVRSVVMLSAVCRHFGAITRDTALWRHLYCRDFRDSHAGSRDTDWKEVYRRSYKSRSAVRRSHECFLPPLYPNPRGVFTPPPPVPGIIGEYDQRPILPRPRYDPMSPFPDLDRQP
Chicken RALLAWGYSSDTEFSITLNGKDALTEDEKTLASYGIVPGDLICLLLEETDLPPPSSSPPSLQNGKNGSSLEFPSGLVPEDVDLEEGTGSYPSEPMLCSEAADGEIPHSLEVLYLSAECTSATDALIVLVHLLMMETGYVPQGTEAKAVSMPEKWRGNGVYKLQYTHPLCEEGSAGLTCVPLGDLVAINATLKINREIKGVKRIQLLPASFVCFQEPEKVAGVYKDLQKLSRLFKDQLVYSLLAAARQALNLPDVFGLVVLPLELKLRIFRLLDVRSLISLSAVCRDLYAASNDQLLWRFMYLRDFRDPIARPRDTDWKELYKKKLKQKEALRWRHMFLPPPFHPNPFYPSPFPIYPPMVIGEYGERPSLIPPHFDPIGSLPGANPTL
Zebra SMTENRTAGSDTAFSVTLNRKDALTEDQKTLASYGIVSGDLICLLLEEPDLPPPPATPAPLQNGNNGSSLEFPSGLVPEDADLEEGTGSYPSEPMLCSEAADGETPHSLEMLYLSAECTSATDALIVLVHLLMMETGYVPQGIEAKAVFMPEKWRGNGVYKLQYTHPLCGEGCAGLTCVPLGDLIAINATLKINEEIRSVKRIQLLPSSFVCFQDPEKVAGVYKDLQKLSRLFKDQLVYSLLAAARQALNLPDVFGLLVLPLELKLRIFRLLDVRSLISLSAVCRDLYTASNDQLLWRFMYLRDFRDPIARPRDTDWKELYKKKLKQKEALRWRHMMLLPPFHPNPFYPNPFPIYPPMIIGEYDERPSLIPPHFDPIGSLPGANPML
Anole QALLSWGYSSETKFEITLNNKDSLVGDQDTLASFGIVSGDLICLILEDDASSPSSSLPSSQSNHHSGPSQEFTSEGGPDDLDLQEATGSFPSEPMLCCEATDGQVPHSLQTLYHSAECTNANDALIVSIHLIMMETGYVPQGTEAKASSMPENWRNKGVYKLLYTHPLCENGFAVLTCVPLGNLIVVNAMLKITSDIKSVKRLQLLPTSFICFQDSANVVGVYKDLQKLSRLFKDRLVYPLLAAARQALNLPDVFGLVVLPLELKLRIFRLLDFRSLLSLSAVCHDLYAASNDQLLWRFIYLRDFRDPVARSRDTDWKELYKKKMKQKDALRWRHMMFLPPLHPNPLYPNPFPLYPPMIIGEMDERPSLFPSHLDPFGSFQNPNPTL
Human QSLLTWGYSSNTRFTITLNYKDPLTGDEETLASYGIVSGDLICLILQDDIIPSSTSEHSSLQNNSNGPSQNFEAESIQDNAHMAEGTGFYPSEPMLCSESVEGQVPHSLETLYQSADCSDANDALIVLIHLLMLESGYIPQGTEAKALSMPEKWKLSGVYKLQYMHPLCEGSSATLTCVPLGNLIVVNATLKINNEIRSVKRLQLLPESFICKKLGENVANIYKDLQKLSRLFKDQLVYPLLAFTRQALNLPDVFGLVVLPLELKLRIFRLLDVRSVLSLSAVCRDLFTASNDPLLWRFLYLRDFRDNTVRVQDTDWKELYRKRHIQRESPKGRFVMLLPPFYPNPLHPRPFPRLPPGIIGEYDQRPSLIPPRFDPVGPLPGPNPIL

I need to convert the sequences in this file to PHYLIP format (using Python code), like this:

     6    387

Zebrafish ESLLRFGLRS DLDFRLSLNG KEDLLDTGQS LSSCGVVSGD LISVILPASS 
Fugu      ETVLSVGLSA ETEISLSLNG SEPLEDTGQT LASCGIVSGD LIRVALIRAA 
Chicken   RALLAWGYSS DTEFSITLNG KDALTEDEKT LASYGIVPGD LICLLLEETD 
Zebra     SMTENRTAGS DTAFSVTLNR KDALTEDQKT LASYGIVSGD LICLLLEEPD 
Anole     QALLSWGYSS ETKFEITLNN KDSLVGDQDT LASFGIVSGD LICLILEDDA 
Human     QSLLTWGYSS NTRFTITLNY KDPLTGDEET LASYGIVSGD LICLILQDDI 

          QTSSAAHQTH TDQQSSQECV DLQQDCMDQQ QQQEQECVCA AAPPLLCCEA
          DAPDRDDGGG HSEQVSQEAK LPDASGASTD SDQAPGPAAS CWEPMLCSET
          LPPPSSSPPS LQNGKNGSSL EFPSGLVPED VDLEEGTGSY PSEPMLCSEA
          LPPPPATPAP LQNGNNGSSL EFPSGLVPED ADLEEGTGSY PSEPMLCSEA
          SSPSSSLPSS QSNHHSGPSQ EFTSEGGPDD LDLQEATGSF PSEPMLCCEA
          IPSSTSEHSS LQNNSNGPSQ NFEAESIQDN AHMAEGTGFY PSEPMLCSES

          EDGLLPLALE RLLDSSTCRS PSDCLMLALH LLLLETGFIP QGGAVSSGEM
          DEGQAPWSLE LLYHSAQVSG PGDALVVAAN LLMIETGFSP QDSQLKPAEM
          ADGEIPHSLE VLYLSAECTS ATDALIVLVH LLMMETGYVP QGTEAKAVSM
          ADGETPHSLE MLYLSAECTS ATDALIVLVH LLMMETGYVP QGIEAKAVFM
          TDGQVPHSLQ TLYHSAECTN ANDALIVSIH LIMMETGYVP QGTEAKASSM
          VEGQVPHSLE TLYQSADCSD ANDALIVLIH LLMLESGYIP QGTEAKALSM

          PIGWQAAGVF RLQYVHPLLE NSLVSVVAVP MGQTLVINAV LKMETSLENS
          PAGWRCGGVY KLQYSHRLCG DSVVVMVAVS MGSALIINGL LEVNQSADSV
          PEKWRGNGVY KLQYTHPLCE EGSAGLTCVP LGDLVAINAT LKINREIKGV
          PEKWRGNGVY KLQYTHPLCG EGCAGLTCVP LGDLIAINAT LKINEEIRSV
          PENWRNKGVY KLLYTHPLCE NGFAVLTCVP LGNLIVVNAM LKITSDIKSV
          PEKWKLSGVY KLQYMHPLCE GSSATLTCVP LGNLIVVNAT LKINNEIRSV

          RKLLLKPDEY VTAWTGGSSG VVYRDLRRLS RLVRDQLVYP LMATARQALG
          CKLCVDPSSY VTEWPGDSAA AAFKELNKLS RVFKDQVAYP LITAARHAMA
          KRIQLLPASF VCFQEPEKVA GVYKDLQKLS RLFKDQLVYS LLAAARQALN
          KRIQLLPSSF VCFQDPEKVA GVYKDLQKLS RLFKDQLVYS LLAAARQALN
          KRLQLLPTSF ICFQDSANVV GVYKDLQKLS RLFKDRLVYP LLAAARQALN
          KRLQLLPESF ICKKLGENVA NIYKDLQKLS RLFKDQLVYP LLAFTRQALN

          LPLLFGLPVL PPELLLRLLR LLDVRSLVSL SAVCRHLNTA THDASLWRHL
          LPVAFGLTAL PPELLLRVFR LLDVRSVVML SAVCRHFGAI TRDTALWRHL
          LPDVFGLVVL PLELKLRIFR LLDVRSLISL SAVCRDLYAA SNDQLLWRFM
          LPDVFGLLVL PLELKLRIFR LLDVRSLISL SAVCRDLYTA SNDQLLWRFM
          LPDVFGLVVL PLELKLRIFR LLDFRSLLSL SAVCHDLYAA SNDQLLWRFI
          LPDVFGLVVL PLELKLRIFR LLDVRSVLSL SAVCRDLFTA SNDPLLWRFL

          LHRDFRVSFP AHRDTDWREL YKQKYRQRAA RRGRHWFYPP PISPLIPFPS
          YCRDFRDSHA GSRDTDWKEV YRRSYKSRSA VRRSHECFLP PLYPNPRGVF
          YLRDFRDPIA RPRDTDWKEL YKKKLKQKEA LRWRHMFLPP PFHPNPFYPS
          YLRDFRDPIA RPRDTDWKEL YKKKLKQKEA LRWRHMMLLP PFHPNPFYPN
          YLRDFRDPVA RSRDTDWKEL YKKKMKQKDA LRWRHMMFLP PLHPNPLYPN
          YLRDFRDNTV RVQDTDWKEL YRKRHIQRES PKGRFVMLLP PFYPNPLHPR

          SPALYPPGII GDYDQMPILP RPRFHPIGPL PGMSAPV
          TPPPPVPGII GEYDQRPILP RPRYDPMSPF PDLDRQP
          PFPIYPPMVI GEYGERPSLI PPHFDPIGSL PGANPTL
          PFPIYPPMII GEYDERPSLI PPHFDPIGSL PGANPML
          PFPLYPPMII GEMDERPSLF PSHLDPFGSF QNPNPTL
          PFPRLPPGII GEYDQRPSLI PPRFDPVGPL PGPNPIL

Can I get some guidelines please?

Python • 6.3k views
ADD COMMENT
4
Entering edit mode
5.8 years ago
Joe 21k

You don't really 'convert' to PHYLIP. It's an alignment format, so it's normally the output of an alignment tool.

That said, since your sequences are already all the same length, we can pretend your sequences are aligned, so you could try:

1. Convert to a 'normal' format.

# Change to fasta format:

$ sed -e 's/^/>/g' -e 's/ /\n/g' myfile.txt > myfile.fasta

2. Convert formats with BioPython.

from Bio import AlignIO
alignments = AlignIO.parse('myfile.fasta', "fasta")
AlignIO.write(alignments, 'myfile.phy', 'phylip')

Note: this only works because your sequences are already the same length regardless of alignment.

Evidently the sequences are all very alike, so it might be fine, but if you're planning to do something phylogenetic with them, you should align, not just coerce the formats.
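
As a rough illustration of the same-length caveat above, here is a minimal sketch of a check that could run before any format conversion. The helper name check_equal_lengths and the toy sequences are hypothetical and not part of the original answer.

```python
def check_equal_lengths(records):
    """Return the shared sequence length, or raise ValueError if lengths differ.

    `records` is a list of (name, sequence) tuples (hypothetical helper).
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length: %s" % sorted(lengths))
    return lengths.pop()


# Toy data for illustration only; real input would come from the text file.
records = [("Human", "QSLLTWGYSS"), ("Anole", "QALLSWGYSS")]
print(check_equal_lengths(records))
```

A check like this only confirms that the file can be written as PHYLIP; it says nothing about whether the columns are actually homologous, which is what a real alignment provides.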

Edit:

Here's a complete, pure-Python script to go from that input file to a PHYLIP file using the approach I described:

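import io
import sys

from Bio import AlignIO

# Each input line is "<name> <sequence>"; rewrite it as a FASTA record.
with open(sys.argv[1], 'r') as ifh:
    fasta = ''.join('>%s\n%s\n' % (i[0], i[1]) for i in [line.split() for line in ifh])

# unicode() is Python 2 only; under Python 3 pass fasta to io.StringIO directly.
iter = AlignIO.parse(io.StringIO(unicode(fasta)), 'fasta')
AlignIO.write(iter, sys.argv[2], "phylip")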

Invoke the code as:

$ python txt2phylip.py inputfile.txt outputfile.phylip
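
The script above indexes sys.argv directly, so running it without both file names fails with an IndexError. One possible way to get a usage message instead is the standard-library argparse module. This is only a sketch: the argument names infile and outfile are illustrative choices, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional paths.

    argparse prints a usage message and exits if either one is missing.
    """
    parser = argparse.ArgumentParser(
        description="Convert 'name sequence' lines to PHYLIP.")
    parser.add_argument("infile", help="text file with one 'name sequence' pair per line")
    parser.add_argument("outfile", help="path of the PHYLIP file to write")
    return parser.parse_args(argv)


# Simulated command line; in a real script call parse_args() with no arguments.
args = parse_args(["inputfile.txt", "outputfile.phylip"])
print(args.infile, args.outfile)
```

In the script, args.infile and args.outfile would then take the place of sys.argv[1] and sys.argv[2].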
ADD COMMENT
0
Entering edit mode

I just need to process the file so that the result is a PHYLIP-formatted file. I do not need the alignment.

Are you saying that I first have to convert the file to FASTA format and then convert that to PHYLIP format? Note: I am using Windows, not Linux.

ADD REPLY
2
Entering edit mode

I think that's the easiest way to do it personally. Trying to write a script which handles all the white spaces and sequence wrapping to go directly from your file to a Phylip strikes me as a nightmare. Phylip is a very strict file format too, meaning you'd have to get exactly the right amount of whitespace in all the right places.
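
To give a sense of the whitespace bookkeeping involved, here is a minimal pure-Python sketch of a direct writer. It is not the BioPython implementation. The layout is an assumption copied from the output shown elsewhere in this thread: a header line with the two counts, names padded to 10 columns, 50 residues per line in space-separated groups of 10, and a blank line between blocks. The function name to_phylip and its parameters are hypothetical.

```python
def to_phylip(records, block=50, group=10, name_width=10):
    """Format (name, sequence) pairs as interleaved PHYLIP text (assumed layout)."""
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all sequences must have the same length")
    # Header: number of sequences, then number of columns.
    lines = [" %d %d" % (len(records), length)]
    for start in range(0, length, block):
        if start:
            lines.append("")  # blank line between blocks
        for name, seq in records:
            chunk = seq[start:start + block]
            groups = " ".join(chunk[i:i + group] for i in range(0, len(chunk), group))
            # Names appear only in the first block; later blocks are indented instead.
            if start == 0:
                label = name[:name_width].ljust(name_width)
            else:
                label = " " * name_width
            lines.append(label + " " + groups)
    return "\n".join(lines) + "\n"


# Toy data for illustration only.
print(to_phylip([("Human", "QSLLTWGYSSNT"), ("Anole", "QALLSWGYSSET")], block=10, group=5), end="")
```

Note that this sketch silently truncates names longer than 10 characters, which can produce duplicate labels, and downstream tools may be stricter about the layout than this assumes.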

On Windows you will probably want to install Cygwin or the Windows Subsystem for Linux to do most of this. Bioinformatics is just straight up harder on Windows (that information would also have been useful in your first post).

If you wanted to, you could replace the sed steps with pure python fairly easily, if you already have python available (you will also need to install BioPython).
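
A minimal sketch of what that pure-Python replacement for the sed step might look like, assuming each non-blank line holds a name followed by its sequence. The helper name lines_to_fasta is hypothetical.

```python
def lines_to_fasta(lines):
    """Turn 'name sequence' lines into FASTA text, skipping blank lines."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore empty lines
        # First field is the name; anything after it is joined into the sequence.
        name, seq = fields[0], "".join(fields[1:])
        records.append(">%s\n%s\n" % (name, seq))
    return "".join(records)


# Toy lines for illustration; an open file handle can be passed the same way.
print(lines_to_fasta(["Human QSLLTWGYSS\n", "Anole QALLSWGYSS\n"]), end="")
```

The returned text can be written to a .fasta file or handed to a parser through io.StringIO, so no shell tools are needed on Windows.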

ADD REPLY
0
Entering edit mode

Thank you for the help.

ADD REPLY
0
Entering edit mode

I want to use Python code for this. For example, I was trying this:

n = 0
with open('myfile.txt', 'r') as f:
    for line in f:
        n += 1
        print('>' + str(n) + '\n' + line.strip())

but this converts my file to FASTA format, not PHYLIP format. I just need to lay out my sequences the way the PHYLIP format does; no alignment is required.

Can you help?

ADD REPLY
0
Entering edit mode

Yes, I already told you in my answer, use BioPython and convert the FASTA to a PHYLIP via the AlignIO module.

ADD REPLY
0
Entering edit mode

I've updated my answer with some full code which will go directly from your input file to an output phylip using the approach I described.

ADD REPLY
0
Entering edit mode

Thanks so much for the response, but here is the error I get each time I try using this code:

    with open(sys.argv[1], 'r') as ifh:
IndexError: list index out of range
ADD REPLY
1
Entering edit mode

What command are you using? It looks to me like you aren't passing the files in correctly. Please provide as much information as possible when you post; otherwise it makes more work for us to dig down to the root of the issue.

ADD REPLY
0
Entering edit mode

I am on Windows and run my Python code in IDLE 3.6.4. When I run this script with the files mentioned above, it shows the error I posted earlier. I am not typing any command other than the one that runs the script: since I am not using Linux, I write the code in a file.py and then run it to check the output. As far as I know, argv is a built-in array on Linux, so I guess it does not work when I run the script in IDLE.

ADD REPLY
2
Entering edit mode

Here's what I get using your data and that exact code (with the Python 3 modification of dropping the unicode() call):

$ cat data.txt
Zebrafish ESLLRFGLRSDLDFRLSLNGKEDLLDTGQSLSSCGVVSGDLISVILPASSQTSSAAHQTHTDQQSSQECVDLQQDCMDQQQQQEQECVCAAAPPLLCCEAEDGLLPLALERLLDSSTCRSPSDCLMLALHLLLLETGFIPQGGAVSSGEMPIGWQAAGVFRLQYVHPLLENSLVSVVAVPMGQTLVINAVLKMETSLENSRKLLLKPDEYVTAWTGGSSGVVYRDLRRLSRLVRDQLVYPLMATARQALGLPLLFGLPVLPPELLLRLLRLLDVRSLVSLSAVCRHLNTATHDASLWRHLLHRDFRVSFPAHRDTDWRELYKQKYRQRAARRGRHWFYPPPISPLIPFPSSPALYPPGIIGDYDQMPILPRPRFHPIGPLPGMSAPV
Fugu ETVLSVGLSAETEISLSLNGSEPLEDTGQTLASCGIVSGDLIRVALIRAADAPDRDDGGGHSEQVSQEAKLPDASGASTDSDQAPGPAASCWEPMLCSETDEGQAPWSLELLYHSAQVSGPGDALVVAANLLMIETGFSPQDSQLKPAEMPAGWRCGGVYKLQYSHRLCGDSVVVMVAVSMGSALIINGLLEVNQSADSVCKLCVDPSSYVTEWPGDSAAAAFKELNKLSRVFKDQVAYPLITAARHAMALPVAFGLTALPPELLLRVFRLLDVRSVVMLSAVCRHFGAITRDTALWRHLYCRDFRDSHAGSRDTDWKEVYRRSYKSRSAVRRSHECFLPPLYPNPRGVFTPPPPVPGIIGEYDQRPILPRPRYDPMSPFPDLDRQP
Chicken RALLAWGYSSDTEFSITLNGKDALTEDEKTLASYGIVPGDLICLLLEETDLPPPSSSPPSLQNGKNGSSLEFPSGLVPEDVDLEEGTGSYPSEPMLCSEAADGEIPHSLEVLYLSAECTSATDALIVLVHLLMMETGYVPQGTEAKAVSMPEKWRGNGVYKLQYTHPLCEEGSAGLTCVPLGDLVAINATLKINREIKGVKRIQLLPASFVCFQEPEKVAGVYKDLQKLSRLFKDQLVYSLLAAARQALNLPDVFGLVVLPLELKLRIFRLLDVRSLISLSAVCRDLYAASNDQLLWRFMYLRDFRDPIARPRDTDWKELYKKKLKQKEALRWRHMFLPPPFHPNPFYPSPFPIYPPMVIGEYGERPSLIPPHFDPIGSLPGANPTL
Zebra SMTENRTAGSDTAFSVTLNRKDALTEDQKTLASYGIVSGDLICLLLEEPDLPPPPATPAPLQNGNNGSSLEFPSGLVPEDADLEEGTGSYPSEPMLCSEAADGETPHSLEMLYLSAECTSATDALIVLVHLLMMETGYVPQGIEAKAVFMPEKWRGNGVYKLQYTHPLCGEGCAGLTCVPLGDLIAINATLKINEEIRSVKRIQLLPSSFVCFQDPEKVAGVYKDLQKLSRLFKDQLVYSLLAAARQALNLPDVFGLLVLPLELKLRIFRLLDVRSLISLSAVCRDLYTASNDQLLWRFMYLRDFRDPIARPRDTDWKELYKKKLKQKEALRWRHMMLLPPFHPNPFYPNPFPIYPPMIIGEYDERPSLIPPHFDPIGSLPGANPML
Anole QALLSWGYSSETKFEITLNNKDSLVGDQDTLASFGIVSGDLICLILEDDASSPSSSLPSSQSNHHSGPSQEFTSEGGPDDLDLQEATGSFPSEPMLCCEATDGQVPHSLQTLYHSAECTNANDALIVSIHLIMMETGYVPQGTEAKASSMPENWRNKGVYKLLYTHPLCENGFAVLTCVPLGNLIVVNAMLKITSDIKSVKRLQLLPTSFICFQDSANVVGVYKDLQKLSRLFKDRLVYPLLAAARQALNLPDVFGLVVLPLELKLRIFRLLDFRSLLSLSAVCHDLYAASNDQLLWRFIYLRDFRDPVARSRDTDWKELYKKKMKQKDALRWRHMMFLPPLHPNPLYPNPFPLYPPMIIGEMDERPSLFPSHLDPFGSFQNPNPTL
Human QSLLTWGYSSNTRFTITLNYKDPLTGDEETLASYGIVSGDLICLILQDDIIPSSTSEHSSLQNNSNGPSQNFEAESIQDNAHMAEGTGFYPSEPMLCSESVEGQVPHSLETLYQSADCSDANDALIVLIHLLMLESGYIPQGTEAKALSMPEKWKLSGVYKLQYMHPLCEGSSATLTCVPLGNLIVVNATLKINNEIRSVKRLQLLPESFICKKLGENVANIYKDLQKLSRLFKDQLVYPLLAFTRQALNLPDVFGLVVLPLELKLRIFRLLDVRSVLSLSAVCRDLFTASNDPLLWRFLYLRDFRDNTVRVQDTDWKELYRKRHIQRESPKGRFVMLLPPFYPNPLHPRPFPRLPPGIIGEYDQRPSLIPPRFDPVGPLPGPNPIL

Command

# My python version is 3.6.0

$ python3 txt2phylip.py data.txt output.phylip

Output:

$ cat output.phylip

 6 387
Zebrafish  ESLLRFGLRS DLDFRLSLNG KEDLLDTGQS LSSCGVVSGD LISVILPASS
Fugu       ETVLSVGLSA ETEISLSLNG SEPLEDTGQT LASCGIVSGD LIRVALIRAA
Chicken    RALLAWGYSS DTEFSITLNG KDALTEDEKT LASYGIVPGD LICLLLEETD
Zebra      SMTENRTAGS DTAFSVTLNR KDALTEDQKT LASYGIVSGD LICLLLEEPD
Anole      QALLSWGYSS ETKFEITLNN KDSLVGDQDT LASFGIVSGD LICLILEDDA
Human      QSLLTWGYSS NTRFTITLNY KDPLTGDEET LASYGIVSGD LICLILQDDI

           QTSSAAHQTH TDQQSSQECV DLQQDCMDQQ QQQEQECVCA AAPPLLCCEA
           DAPDRDDGGG HSEQVSQEAK LPDASGASTD SDQAPGPAAS CWEPMLCSET
           LPPPSSSPPS LQNGKNGSSL EFPSGLVPED VDLEEGTGSY PSEPMLCSEA
           LPPPPATPAP LQNGNNGSSL EFPSGLVPED ADLEEGTGSY PSEPMLCSEA
           SSPSSSLPSS QSNHHSGPSQ EFTSEGGPDD LDLQEATGSF PSEPMLCCEA
           IPSSTSEHSS LQNNSNGPSQ NFEAESIQDN AHMAEGTGFY PSEPMLCSES

           EDGLLPLALE RLLDSSTCRS PSDCLMLALH LLLLETGFIP QGGAVSSGEM
           DEGQAPWSLE LLYHSAQVSG PGDALVVAAN LLMIETGFSP QDSQLKPAEM
           ADGEIPHSLE VLYLSAECTS ATDALIVLVH LLMMETGYVP QGTEAKAVSM
           ADGETPHSLE MLYLSAECTS ATDALIVLVH LLMMETGYVP QGIEAKAVFM
           TDGQVPHSLQ TLYHSAECTN ANDALIVSIH LIMMETGYVP QGTEAKASSM
           VEGQVPHSLE TLYQSADCSD ANDALIVLIH LLMLESGYIP QGTEAKALSM

           PIGWQAAGVF RLQYVHPLLE NSLVSVVAVP MGQTLVINAV LKMETSLENS
           PAGWRCGGVY KLQYSHRLCG DSVVVMVAVS MGSALIINGL LEVNQSADSV
           PEKWRGNGVY KLQYTHPLCE EGSAGLTCVP LGDLVAINAT LKINREIKGV
           PEKWRGNGVY KLQYTHPLCG EGCAGLTCVP LGDLIAINAT LKINEEIRSV
           PENWRNKGVY KLLYTHPLCE NGFAVLTCVP LGNLIVVNAM LKITSDIKSV
           PEKWKLSGVY KLQYMHPLCE GSSATLTCVP LGNLIVVNAT LKINNEIRSV
    .... # had to truncate the file as the comment character limit is 5000....
ADD REPLY
1
Entering edit mode

Yes, this is why I asked you to post the command. Please copy and paste the exact command you typed.

sys.argv itself works fine under Python 3. What will fail is the unicode() call: in Python 3, strings are already Unicode and the unicode() builtin no longer exists, so the line:

iter = AlignIO.parse(io.StringIO(unicode(fasta)), 'fasta')

can be changed to remove the unicode() call:

iter = AlignIO.parse(io.StringIO(fasta), 'fasta')
ADD REPLY
0
Entering edit mode

Command I was using:

python convert.py input.txt outfile.phylip

That command works, but I do not want to pass the file names on the command line. Instead, I want to ask the user for the input text file and then write a PHYLIP file from it. How do I do that?

For example, if I change the code to:

fileInput = input("Enter your file name: ")
with open(sys.argv[fileInput], 'r') as ifh:
    fasta = ''.join('>%s\n%s\n' % (i[0], i[1]) for i in [str.split() for str in ifh] )

iter = AlignIO.parse(io.StringIO(fasta), 'fasta')
AlignIO.write(iter, sys.argv[2], "phylip")

It outputs the following error message:

Enter your file name: input.txt
    with open(sys.argv[fileInput], 'r') as ifh:
TypeError: list indices must be integers or slices, not str
ADD REPLY
2
Entering edit mode

I'm not sure I fully understand: do you want an interactive user menu? Personally, I find an interactive prompt is rarely preferable to a single command or script.

If you want to read from STDIN, you can still do that with the sys module, but you'll need to replace the sys.argv lookups with reads from sys.stdin.

If you want to go down that path, you will need to do some reading of your own, as it gets more complicated than the help this forum is set up to provide.

Here are some links that might get you started:

https://stackoverflow.com/questions/1450393/how-do-you-read-from-stdin-in-python

https://stackoverflow.com/questions/8878753/how-to-make-interactive-python-script-with-keyboard-arrows-navigation-in-menu

https://stackoverflow.com/questions/6218890/python-how-can-i-read-stdin-from-shell-and-send-stdout-to-shell-and-file
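
For the simple non-interactive case, reading the data from STDIN can be sketched as follows. The function takes any file-like object, so the same code works for sys.stdin or an open file. The name read_records is hypothetical, and io.StringIO stands in for STDIN here so the example runs on its own.

```python
import io


def read_records(stream):
    """Read (name, sequence) pairs from any file-like object, e.g. sys.stdin."""
    records = []
    for line in stream:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            records.append((fields[0], fields[1]))
    return records


# Stand-in for STDIN; in a script this would be read_records(sys.stdin)
# after `import sys`.
fake_stdin = io.StringIO("Human QSLLTWGYSS\nAnole QALLSWGYSS\n")
print(read_records(fake_stdin))
```

With sys.stdin in place of the stand-in, the script could be fed a file through shell redirection, for example `python script.py < data.txt`.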

ADD REPLY
0
Entering edit mode

Alright then, these links will definitely be useful, and I will follow them. Thank you so much for the guidance.

ADD REPLY
2
Entering edit mode

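You cannot do:

 with open(sys.argv[fileInput], 'r') as ifh:

sys.argv is a list of the command-line arguments, so square brackets on it take an integer position, not a file name; that is what the TypeError is telling you. The string returned by input() is already the file name, so pass it to open() directly.

You need to do this: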

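import io

from Bio import AlignIO

# input() already returns a str in Python 3, so no conversion is needed.
in_path = input("File: ")
out_path = input("Output: ")

# Each input line is "<name> <sequence>"; rewrite it as a FASTA record.
with open(in_path, 'r') as ifh:
    fasta = ''.join('>%s\n%s\n' % (i[0], i[1]) for i in [line.split() for line in ifh])
    print(fasta)  # show the intermediate FASTA text

alignments = AlignIO.parse(io.StringIO(fasta), 'fasta')

AlignIO.write(alignments, out_path, "phylip")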

Command/input

$ python3 txt2phy_input.py
File: data.txt
Output: phy.phy
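
If both styles are wanted (command-line arguments when they are given, prompts when the script is launched from IDLE without any), one possible sketch is a small helper that falls back to input(). The name get_paths and the ask parameter are hypothetical and only there to keep the example testable.

```python
def get_paths(argv, ask=input):
    """Return (infile, outfile) from argv if both are present, else prompt for them."""
    if len(argv) >= 3:
        return argv[1], argv[2]
    # No (or too few) command-line arguments: ask the user instead.
    return ask("File: ").strip(), ask("Output: ").strip()


# Simulated command line, so the example runs without prompting.
# In a script this would be get_paths(sys.argv) after `import sys`.
print(get_paths(["txt2phy_input.py", "data.txt", "phy.phy"]))
```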
ADD REPLY
0
Entering edit mode

Oh okay, I'll try it this way. Thank you for the help.

ADD REPLY
0
Entering edit mode

If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.

ADD REPLY
0
Entering edit mode

"so I am writing the codes in a file.py and then running it to check the output"

how are you running it?

ADD REPLY
0
Entering edit mode

I ran it by giving the command:

python convert.py input.txt outfile.phylip

This is the command that finally worked, and I successfully converted the file to the other format.

ADD REPLY
