Question: Pysam Fetch Behavior
3.9 years ago, rickyflintoff wrote:

I want to fetch only certain columns from a tabix-indexed file. Pysam's fetch command returns each entire row as a string, but I would like to query a single column directly. Is there a quick way to accomplish this?

tabix • 3.2k views
modified 3.8 years ago by Matt Shirley • written 3.9 years ago by rickyflintoff

Could you be more specific? Do you want to fetch only certain columns from a subset of rows, or do you just want certain columns from the whole data?

written 3.9 years ago by Matt Shirley

Use awk on the pysam output to extract the exact column.

written 3.9 years ago by always_learning

If the OP is using pysam, it's a good guess that they are already in a Python environment. awk, while extremely useful, is a worse fit than simply accessing the AlignedRead object returned by the IteratorRow object returned by Samfile.fetch().

# samfile is an already opened pysam.Samfile; pass the region string by keyword
region = samfile.fetch(region='chr1:1-1000')
positions = [read.pos for read in region]
modified 3.9 years ago • written 3.9 years ago by Matt Shirley

I am specifically working with TSV files. Here is an example of one:

chr\tpos\tjudgement
1\t2389\tTrue
1\t2399\tTrue

When I use fetch to get the row for chr 1 and pos 2399, I get the following:

1\t2399\tTrue

However, I want to access the judgement value for chr 1 and pos 2399 directly, by calling the exact column. I cannot use awk. Is there a direct way of doing this with Pysam? Thank you for all your help.

written 3.9 years ago by rickyflintoff
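
Since fetch yields each row as a tab-delimited string by default, one way to get a single column without awk is to split the string in Python. The following is only a sketch: pick_columns is a hypothetical helper (not part of pysam), and the rows list is a stand-in for whatever the fetch call yields, built from the example rows above.

```python
def pick_columns(rows, indices, sep="\t"):
    """Yield a tuple of the requested columns for each delimited row string."""
    for row in rows:
        fields = row.rstrip("\n").split(sep)
        yield tuple(fields[i] for i in indices)


# Stand-in for the row strings a tabix fetch would yield (assumption, not real pysam output)
rows = ["1\t2389\tTrue", "1\t2399\tTrue"]

# Column 2 (0-based) is the judgement column in the example file
judgements = [cols[0] for cols in pick_columns(rows, [2])]
```

In real use the rows argument would be the iterator returned by fetch, so only the rows in the queried region are ever split.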
3.8 years ago, Matt Shirley (Cambridge, MA) wrote:
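# fetch() returns an iterator over rows, and the asTuple parser must be instantiated.
# Coordinates are 0-based and half-open, so 1-based position 2399 is start=2398, end=2399.
for row in tabixfile.fetch(reference='chr1', start=2398, end=2399, parser=pysam.asTuple()):
    judgement = row[2]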
written 3.8 years ago by Matt Shirley
import pysam
tabixfile = pysam.Tabixfile( "/usr/local/xyz.vcf" )

and then I do this:

row = tabixfile.fetch(reference='chr1', start=249240539, end=249240539)

but when I then do

x = row[2]

I get an error.

Where am I going wrong? Any pointers?

written 2.7 years ago by always_learning
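
The error text is not shown, but one likely cause is that fetch returns an iterator over the matching rows rather than a single row, so indexing the result directly fails. The sketch below reproduces that behaviour with a plain generator; fake_fetch is a hypothetical stand-in for the tabixfile.fetch(...) call, and its row content is made up for illustration.

```python
def fake_fetch():
    """Hypothetical stand-in for tabixfile.fetch(...): yields row strings lazily."""
    yield "chr1\t249240539\tTrue"


rows = fake_fetch()
try:
    rows[2]  # indexing the iterator itself raises TypeError
except TypeError as exc:
    error = str(exc)

# Take the first row instead (None if the region is empty), then index its columns
first = next(fake_fetch(), None)
value = first.split("\t")[2] if first is not None else None
```

Looping over the result with a for statement works the same way and also handles regions that match several rows.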