Is there a simple way to get the max coverage of a bedgraph file in python?
6.4 years ago
rioualen ▴ 620

Everything is in the question! Basically, I guess I should load the bedgraph file and take the maximum value of the coverage column, but I can't seem to get this done.

I've tried using numpy:

>>> np.loadtxt('GSM1470159_sickle-se-q20_bwa.bedgraph', usecols=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 722, in loadtxt
    usecols = list(usecols)
TypeError: 'int' object is not iterable
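
For context, the TypeError comes from `list(usecols)`: NumPy versions before 1.11 expect `usecols` to be a sequence, so a bare `3` fails while a one-element tuple works. A minimal sketch, assuming the file has four whitespace-separated columns and no `track` or `browser` header lines; the helper name `max_coverage_numpy` and the demo data are made up for illustration:

```python
import os
import tempfile

import numpy as np


def max_coverage_numpy(path):
    """Return the largest value in the fourth column of a bedGraph file."""
    # usecols=(3,) is a one-element tuple; it is accepted by old and new NumPy alike.
    coverage = np.loadtxt(path, usecols=(3,))
    return float(coverage.max())


# Tiny made-up bedGraph file, written to a temporary location for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".bedgraph", delete=False) as handle:
    handle.write("chr1\t0\t100\t2\nchr1\t100\t200\t7\nchr1\t200\t300\t5\n")
    demo_path = handle.name

demo_max = max_coverage_numpy(demo_path)
os.remove(demo_path)
print(demo_max)
```

Only the selected column is converted to a number, so the chromosome names in the first column are not a problem.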


This kind of stuff is a lot easier to deal with in R; however, calling R from Python looks super complicated for such a simple task: https://sites.google.com/site/aslugsguidetopython/data-analysis/pandas/calling-r-from-python.

I must be missing something here. If you can help, thanks in advance!

python bedgraph • 1.6k views

Oh thanks, indeed! In the meantime I found another solution with pandas (see below).

6.4 years ago
rioualen ▴ 620

It seems I found an easier way of doing it with the pandas library:

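import pandas as pd
# Assumed loading step (tab was never defined above): tab-separated, no header row
tab = pd.read_csv('GSM1470159_sickle-se-q20_bwa.bedgraph', sep='\t', header=None)
cov = tab.iloc[:, 3]  # fourth column holds the coverage values
max_cov = int(cov.max())  # renamed so the built-in max() is not shadowed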

6.4 years ago

You could do a reverse-numerical sort on the fourth column and pull off the first value:

$ cut -f4 foo.bedgraph | sort -nr | head -1 > answer.txt
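
If you would rather stay in Python, the same idea (look only at the fourth column and keep the largest value) can be sketched with the standard library alone, reading the file line by line so it never has to fit in memory. This is a rough sketch, not a tested tool: the function name `max_coverage_stream` is made up, and skipping `track`, `browser` and `#` lines is an assumption about how the file was produced.

```python
import io


def max_coverage_stream(lines):
    """Return the largest fourth-column value, or None if there are no data lines."""
    best = None
    for line in lines:
        # Assumption: header/comment lines start with one of these prefixes.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        value = float(line.split()[3])  # fourth field is the coverage
        if best is None or value > best:
            best = value
    return best


# Made-up example data standing in for an open bedGraph file.
demo = io.StringIO(
    "track type=bedGraph\n"
    "chr1\t0\t100\t2\n"
    "chr1\t100\t200\t7\n"
    "chr1\t200\t300\t5\n"
)
print(max_coverage_stream(demo))
```

For a real file, you would pass an open file handle instead, e.g. `with open('foo.bedgraph') as fh: max_coverage_stream(fh)`.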