I want to fetch only certain columns from a tabix-indexed file. Using pysam's fetch command, I get the entire row as a string. However, I would like to query only one specific column. Is there a quick way to accomplish this?
Could you be more specific? Do you want to fetch only certain columns from a subset of rows, or do you just want certain columns from the whole data?
Use awk on the pysam output to extract the exact column.
If the OP is using pysam, it's a good guess that they are already in a Python environment. awk, while extremely useful, is a worse fit than simply accessing the AlignedRead object returned by the IteratorRow object returned by Samfile.fetch().
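# Samfile here stands for an already opened pysam.Samfile instance
region = Samfile.fetch(region='chr1:1-1000')
# collect the start position of every read in the region
positions = [read.pos for read in region]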
I am specifically working with TSV files. So, here is an example of a TSV file:
When I use fetch to get the row for chr 1 and pos 2399, I get the following: 1\t2399\tTrue
However, I want a way to directly access the judgement value for chr 1 and pos 2399, that is, to call the exact column value. I cannot use awk. I was wondering if there is a direct way of doing so using pysam. Thank you for all your help.
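import pysam  # tabixfile below is an already opened pysam.Tabixfile
# fetch() returns an iterator and takes 0-based, half-open coordinates (assuming 1-based positions in the file); the contig name must match the file ('1' in your example row)
rows = tabixfile.fetch(reference='1', start=2398, end=2399, parser=pysam.asTuple())
judgement = next(rows)[2]  # third column of the first matching row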
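If you would rather not rely on a parser, another option is to split the plain string that fetch gives you on tabs and pick the column by index. This is only a minimal sketch: `column_from_row` is a hypothetical helper name, and the row string is the one quoted in the question rather than something read from a real file.

```python
def column_from_row(row, index, sep="\t"):
    """Return one column (0-based index) from a delimited row string."""
    # Strip a trailing newline, if any, then split into fields
    fields = row.rstrip("\n").split(sep)
    return fields[index]


# Row string as quoted in the question for chr 1, pos 2399
row = "1\t2399\tTrue"

# The judgement value is the third column, i.e. index 2
judgement = column_from_row(row, 2)
print(judgement)
```

Note that the value comes back as the string "True", not a Python boolean, so compare it with `== "True"` if you need a real bool.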
tabixfile = pysam.Tabixfile( "/usr/local/xyz.vcf" )
and now when I'm doing this:
row = tabixfile.fetch(reference='chr1', start=249240539, end=249240539)
but while doing
x = row
I am getting this error.
Where am I going wrong? Any pointers?
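The error text is not shown, so this is a guess: if the failing line indexes into `row`, one possible cause is that fetch hands back an iterator over rows rather than a single row, and an iterator cannot be subscripted. The sketch below illustrates that behaviour without pysam. `fake_fetch` is a hypothetical stand-in for `tabixfile.fetch(...)` and the row it yields is made-up example data, not taken from your file.

```python
def fake_fetch():
    """Hypothetical stand-in for tabixfile.fetch(...): yields row strings lazily."""
    yield "chr1\t249240539\texample"


rows = fake_fetch()

# Indexing the iterator itself fails with a TypeError
indexing_failed = False
try:
    rows[2]
except TypeError:
    indexing_failed = True

# Instead, pull a row out of the iterator first, then split it into columns
first = next(fake_fetch(), None)  # None if the region has no rows
fields = first.split("\t") if first is not None else []
print(indexing_failed, fields)
```

With a real file you would do the same thing: loop over the result of fetch (or call `next()` on it) and only then index into the individual row.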