Deleted: What filters do I use on my variant calls (vcf.gz) file for imputation?
12 months ago
Olivia • 0

Hi! After about 2 full days of research and reading so many papers, I am still super stuck on this question:

What site filters do I need to use on my vcf file to prepare it for imputing?

Some details:

  • Data consists of 84 individuals of an inbred bird species. After variant calling and before any filtering, 72 individuals have an average depth of around 12x but 12 individuals have an average depth of <5x. I am hoping to impute just these <5x individuals, using the rest as a reference panel?
  • I will only be imputing a few contigs that I need for haplotyping.
  • Imputation software is possibly QUILT or STITCH (w/ or w/o a reference panel) - I am undecided and was going to try all 3
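
To make the planned split concrete, here is a minimal Python sketch of dividing samples into imputation targets and a reference panel by mean depth. The sample names, the depth values, and the `split_by_depth` helper are hypothetical, and using 5x as the boundary is an assumption taken from the depths quoted above.

```python
def split_by_depth(mean_depth, threshold=5.0):
    """Split samples into (targets, reference) by mean sequencing depth.

    mean_depth: dict mapping sample name -> mean depth (hypothetical input).
    Samples below `threshold` become imputation targets; the rest form
    the candidate reference panel.
    """
    targets = sorted(s for s, d in mean_depth.items() if d < threshold)
    reference = sorted(s for s, d in mean_depth.items() if d >= threshold)
    return targets, reference


# Illustrative values only, not real data from the post.
example_depths = {"bird01": 12.3, "bird02": 4.1, "bird03": 11.8, "bird04": 3.2}
targets, reference = split_by_depth(example_depths)
print("impute:", targets)
print("reference panel:", reference)
```

The two lists could then be written out as sample files for whichever imputation tool is used.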

For haplotyping of the higher-coverage (>5x) individuals I applied a few filters (cutoffs): minimum GQ 10, MAF 0.05, minimum depth 5x, maximum depth 200, maximum missingness 0.1, and phred-scaled strand bias score 60. I have no idea whether I am meant to apply the same filters to all of my individuals before imputing or not! So many papers don't mention any pre-imputation filtering, and some only mention it vaguely. I am confused because I thought that, without filtering, a lot of my data would be too poor for making any haplotype (or imputation) judgements.
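
For reference, the cutoffs listed above could be expressed roughly as follows. This is only a sketch under stated assumptions: the record layout (a list of per-sample dicts with `gt`, `gq`, `dp`) is hypothetical, sites are assumed biallelic and diploid, GQ and depth are assumed to be applied per genotype (failing genotypes are set to missing), and missingness, MAF, and strand bias are assumed to be applied per site. It does not reflect the behaviour of any particular tool.

```python
# Thresholds as listed in the post; how each is applied is an assumption.
MIN_GQ = 10            # minimum genotype quality
MIN_DEPTH = 5          # minimum per-sample depth
MAX_DEPTH = 200        # maximum per-sample depth
MAX_MISSING = 0.1      # maximum fraction of missing genotypes per site
MIN_MAF = 0.05         # minimum minor allele frequency per site
MAX_STRAND_BIAS = 60   # maximum phred-scaled strand bias score per site


def mask_genotypes(genotypes):
    """Return genotype tuples, with low-quality calls replaced by None."""
    masked = []
    for g in genotypes:
        ok = (
            g["gt"] is not None
            and g["gq"] >= MIN_GQ
            and MIN_DEPTH <= g["dp"] <= MAX_DEPTH
        )
        masked.append(g["gt"] if ok else None)
    return masked


def site_passes(genotypes, strand_bias):
    """Apply the site-level filters to one biallelic, diploid site."""
    if not genotypes:
        return False
    gts = mask_genotypes(genotypes)
    called = [gt for gt in gts if gt is not None]
    if not called:
        return False
    missing = 1 - len(called) / len(gts)
    if missing > MAX_MISSING:
        return False
    if strand_bias > MAX_STRAND_BIAS:
        return False
    # Alleles are coded 0 (ref) / 1 (alt), so the sum counts alt alleles.
    alt_freq = sum(sum(gt) for gt in called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= MIN_MAF
```

Whether the low-coverage individuals should pass through the same per-genotype masks before imputation is exactly the open question here; the sketch only pins down what the filters mean.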

Should I be filtering my sites so that poor sites are missing, and only good sites remain for imputation? Or do I need to retain as much info as possible, and perhaps filter after imputation? I am so lost so any guidance is greatly appreciated! Thank you!!

vcf imputation filter quilt • 223 views
This thread is not open. No new answers may be added