Closed:freebayes VCF get average base quality of supporting bases of variant
4.2 years ago
gb ★ 2.2k

I am fairly new to variant calling and somehow I cannot find an answer to my question. Maybe I am using the wrong search terms...

I have a VCF file from freebayes and I want to know the average base quality of the bases that support a given variant.

So for example I have this:

POS ID  REF ALT
200 .   T   C

For all the C bases that are mapped to position 200, I want to calculate the average base quality score, i.e. the score you normally see in the FASTQ file (and in the QUAL column of the SAM file). At this moment I am not sure whether a tool to do this already exists or whether I need to create something myself.

EDIT:

I started writing my own script using mpileup but ran into another problem. In short, the base column and the quality score column do not have the same length. After some filtering I ended up with a position that has 101 bases and 100 quality scores. In the bases I noticed this (example):

.^.^A^,^,^,^,g^,^,^,^,^W,

One of my filter steps is:

bases = re.sub(r'\^[^actgnNACTG\.\,]', '', bases)
bases = bases.replace("^", "")

But with this filter step I remove one base too many. How can I tell whether the character after the ^ is a match/mismatch or an ASCII-encoded score?
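For what it is worth, in the mpileup format the ^ marks the start of a read and is always followed by exactly one character that encodes the mapping quality (ASCII value minus 33); the read's base comes after that character. A filter that only strips ^X when X does not look like a base will therefore keep a mapping-quality character such as , (MAPQ 11) or A (MAPQ 32) and count it as a base. Below is a minimal sketch of a parser that skips ^X unconditionally, along with $ (end of read) and +N.../-N... indel annotations, and then averages the Phred scores of the reads showing the ALT base. The function names are hypothetical, the Phred+33 offset is an assumption about the input, and the sketch assumes that every remaining symbol (including the * deletion placeholder) has one entry in the quality column.

```python
import re

_INDEL_LEN = re.compile(r"\d+")

def parse_pileup_bases(bases):
    """Return one symbol per read from an mpileup base column."""
    symbols = []
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # '^' is always followed by exactly one mapping-quality character.
            i += 2
        elif c == "$":
            # End-of-read marker, not a base.
            i += 1
        elif c in "+-":
            # Indel: sign, length digits, then that many sequence characters.
            m = _INDEL_LEN.match(bases, i + 1)
            if m is None:
                raise ValueError("malformed indel at offset %d" % i)
            i = m.end() + int(m.group())
        else:
            # '.', ',', a mismatch base, or a placeholder such as '*'.
            symbols.append(c)
            i += 1
    return symbols

def mean_alt_quality(bases, quals, alt, offset=33):
    """Mean Phred score of reads whose base equals alt, or None if none do."""
    symbols = parse_pileup_bases(bases)
    if len(symbols) != len(quals):
        raise ValueError("base and quality columns differ in length")
    scores = [ord(q) - offset
              for s, q in zip(symbols, quals)
              if s.upper() == alt.upper()]
    return sum(scores) / len(scores) if scores else None
```

With the T>C example above, mean_alt_quality(bases, quals, "C") would be called once per pileup line at position 200, where bases and quals are the fifth and sixth columns of that line. Comparing case-insensitively pools forward-strand (C) and reverse-strand (c) reads.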

SNP vcf • 202 views