Question: Python script for ATGC count
3335098459 wrote, 5 weeks ago:

Hi, I am new to Python, so I apologize in advance if this question seems trivial.

I have a Python script to count the A, T, G, and C bases:

def readGenome(filename):
    genome = ''
    with open(filename, 'r') as f:
        for line in f:
            # ignore header line with genome information
            if not line[0] == '>':
                genome += line.rstrip()
    return genome

genome = readGenome(filename)
genome[:100]
# Count the number of occurrences of each base
counts = {'A': 0, 'C': 0, 'G': 0, 'T': 0}
for base in genome:
    counts[base] += 1
print(counts)

import collections
print(collections.Counter(genome))

I cannot figure out the problem in this code. When I run it from the command prompt like this:

python count_ATGC.py gene.fa

I get the following error:

Traceback (most recent call last):
  File "count_ATGC.py", line 9, in <module>
    genome = readGenome()
TypeError: readGenome() missing 1 required positional argument: 'filename'

Could somebody help me with this error?

Thanks


Yes, you are not passing the path of the FASTA file to the function readGenome(). So:

genome = readGenome(filename = "/path/to-the-genome-fasta-file.fasta")

filename is an argument that takes, I guess, a string giving the path to the genome file in FASTA format on your computer. The error arises because you are not supplying a value for the positional argument filename.

António

written 5 weeks ago by antonioggsousa

3335098459: If you expect your program to accept arguments from the command line, you need to include the code necessary to parse that input.

written 5 weeks ago by genomax
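
The command-line parsing mentioned in the comment above could be sketched with the standard-library argparse module. This is only one possible approach, not necessarily what the commenter had in mind. The names parse_args, read_genome, main, and the argument name fasta are illustrative assumptions and do not come from the original script:

```python
import argparse


def parse_args(argv=None):
    # argv=None makes argparse read sys.argv[1:]; passing a list helps testing
    parser = argparse.ArgumentParser(description="Count bases in a FASTA file.")
    parser.add_argument("fasta", help="path to the input FASTA file")
    return parser.parse_args(argv)


def read_genome(filename):
    # Collect sequence lines in a list and join once at the end
    parts = []
    with open(filename) as f:
        for line in f:
            # skip FASTA header lines, which start with '>'
            if not line.startswith(">"):
                parts.append(line.strip())
    return "".join(parts)


def main(argv=None):
    args = parse_args(argv)
    genome = read_genome(args.fasta)
    print(len(genome), "bases read from", args.fasta)
```

In a real script you would call main() with no arguments at the bottom of the file, so that argparse reads the arguments typed after the script name. Running the script without a file name would then print a usage message instead of a traceback.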
antonioggsousa wrote, 5 weeks ago:

Hi,

Try the following script, run the same way as before:

import sys

filename = sys.argv[1]

def readGenome(filename):
    genome = ''
    with open(filename, 'r') as f:
        for line in f:
            # ignore header line with genome information
            if not line[0] == '>':
                genome += line.rstrip()
    return genome

genome = readGenome(filename)
genome[:100]
# Count the number of occurrences of each base
counts = {'A': 0, 'C': 0, 'G': 0, 'T': 0}
for base in genome:
    counts[base] += 1
print(counts)

import collections
print(collections.Counter(genome))
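
One caveat with the dictionary-based loop above: it raises a KeyError if the sequence contains anything other than uppercase A, C, G, or T, for example N or soft-masked lowercase bases. Below is a minimal sketch of a more tolerant count, assuming you want to fold case and keep unexpected symbols in the tally. The helper name count_bases is hypothetical:

```python
from collections import Counter


def count_bases(sequence):
    """Count every symbol in the sequence, ignoring case."""
    counts = Counter(sequence.upper())
    # Always report the four canonical bases, even when one is absent
    for base in "ACGT":
        counts.setdefault(base, 0)
    return counts


# Example with an ambiguous base (N) and lowercase letters
print(count_bases("ACGTNacgtAA"))
```

Whether unexpected symbols should be counted, skipped, or treated as an error depends on your data, so adjust this to your needs.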

The line filename = sys.argv[1] reads the first command-line argument after the script name. For python count_ATGC.py gene.fa, that is gene.fa, which is assigned to the variable filename.
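
To make the indexing concrete, here is a small sketch of how sys.argv is laid out, with a guard for the case where no file name is given. The helper get_filename and the usage text are illustrative assumptions, and the list example_argv only simulates what sys.argv would contain:

```python
import sys


def get_filename(argv):
    # argv[0] is the script name; argv[1] is the first user-supplied argument
    if len(argv) < 2:
        raise SystemExit("usage: python count_ATGC.py <fasta file>")
    return argv[1]


# What sys.argv would hold for: python count_ATGC.py gene.fa
example_argv = ["count_ATGC.py", "gene.fa"]
print(get_filename(example_argv))
```

In the actual script you would call get_filename(sys.argv) instead of passing the simulated list.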

I hope this helps,

António
