How to read vcf file in python?
2.3 years ago
ja4123 • 0

When I try something as simple as this:

import vcf
vcf_reader = vcf.Reader(filename="in.vcf.gz")

there is an error:

AttributeError: partially initialized module 'vcf' has no attribute 'Reader' (most likely due to a circular import)

But the vcf module does have that attribute. Kindly help.

vcf reader python • 15k views

Also, it sounds like your installation of PyVCF is broken. I would consider trying the version on conda: https://anaconda.org/bioconda/pyvcf
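
Before reinstalling, it may be worth ruling out name shadowing. The "partially initialized module ... circular import" message commonly appears when your own script, or another file in the working directory, is named vcf.py, so that `import vcf` picks up that file instead of the installed package. The sketch below is only a diagnostic aid; the helper names `module_origin` and `local_shadow` are made up for illustration and are not part of PyVCF.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def local_shadow(name, directory="."):
    """Return a local file or package in `directory` that would shadow `name`, else None."""
    base = Path(directory)
    # A plain module (name.py) or a regular package (name/__init__.py) takes
    # precedence over an installed package when the directory comes first on sys.path.
    for candidate in (base / f"{name}.py", base / name / "__init__.py"):
        if candidate.exists():
            return candidate
    return None
```

If `local_shadow("vcf")` returns a path in your project directory, renaming that file (and deleting any stale `__pycache__` entry for it) is usually enough to make `vcf.Reader` available again.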


I always read it with pandas (after removing the header lines).


Personally, I just use GATK VariantsToTable to convert it to a .tsv first. It's much easier to parse this way, unless you want something from the header. Another option might be to convert to another tabular format such as .maf.
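
Once the table exists, parsing it needs nothing beyond the standard library. This is a minimal sketch that assumes the output is tab-separated with a single header row naming the requested fields; the file name `variants.tsv` and the function name `read_variant_table` are placeholders.

```python
import csv


def read_variant_table(path):
    """Read a tab-separated variant table into a list of dicts keyed by column name."""
    # Assumption: the table was produced by something like
    #   gatk VariantsToTable -V in.vcf.gz -F CHROM -F POS -F REF -F ALT -O variants.tsv
    # so the first row holds the column names.
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

All values come back as strings, so numeric columns such as POS still need an explicit `int(...)` conversion.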

2.3 years ago
onestop_data ▴ 300

Try pysam. You can install it easily with pip (pip install pysam).

10 months ago
d.vitale199 ▴ 20

I like to use pandas. I find the line that starts with '#CHROM', split it on tabs to build the list of column names to pass as names=<list of names>, and then read the file in chunks with comment='#' so the header lines are skipped.

import pandas as pd
import gzip

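def get_vcf_names(vcf_path):
    """Return the column names from the '#CHROM' header line of a gzipped VCF."""
    with gzip.open(vcf_path, "rt") as ifile:
        for line in ifile:
            if line.startswith("#CHROM"):
                # strip the trailing newline so the last column name is clean
                return line.rstrip("\n").split("\t")
    raise ValueError(f"no #CHROM header line found in {vcf_path}")


names = get_vcf_names('file.vcf.gz')
# chunksize makes read_csv return an iterator of DataFrames, not a single DataFrame
vcf = pd.read_csv('file.vcf.gz', compression='gzip', comment='#', chunksize=10000, sep='\t', header=None, names=names)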

I have a zip file instead of gzip, so how can I change my code?
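
One possible adaptation, sketched with the standard-library zipfile module in place of gzip. It assumes the VCF is the first entry in the archive unless a member name is given; the function name `get_vcf_names_from_zip` and the `member` parameter are made up for this example.

```python
import io
import zipfile


def get_vcf_names_from_zip(zip_path, member=None):
    """Return the column names from the '#CHROM' line of a VCF stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: if no member is named, the archive's first entry is the VCF.
        member = member or archive.namelist()[0]
        with archive.open(member) as raw:
            # archive.open() yields bytes, so wrap it to iterate over text lines.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                if line.startswith("#CHROM"):
                    return line.rstrip("\r\n").split("\t")
    raise ValueError(f"no #CHROM header line found in {zip_path}")
```

For the pandas call itself, read_csv accepts compression='zip' in place of compression='gzip', provided the archive contains only one data file.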
