Extract variant positions from single-sample VCFs prior to merging?
8.7 years ago
stevenlang123 ▴ 210

Hi y'all,

I just completed QC filtering on about 100 single-sample VCFs. As a result, the VCFs no longer contain the same set of variant positions (some positions that failed QC in one sample passed in another). I would like to obtain a list of the variant positions shared among all VCFs, so that I can subset every file to the same set of sites and merge them into a single multi-sample VCF. Any suggestions on how to do this?

Thanks in advance.

Best,
Steve
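
For reference, the task described above (finding the positions present in every per-sample VCF) can be sketched in plain Python. This is only a minimal illustration, not a tested pipeline: the function names and the output file are hypothetical, sites are matched on CHROM and POS only (alleles are ignored), and all position sets are held in memory, which should be fine for ~100 files of typical size but is an assumption.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def variant_positions(path):
    """Return the set of (CHROM, POS) pairs found in one VCF."""
    positions = set()
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#"):  # skip meta-information and header lines
                continue
            chrom, pos = line.split("\t", 2)[:2]
            positions.add((chrom, int(pos)))
    return positions


def shared_positions(paths):
    """Return the positions present in every VCF, sorted by chromosome name, then position."""
    sets = [variant_positions(p) for p in paths]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def write_positions(positions, out_path):
    """Write one tab-separated CHROM/POS pair per line (hypothetical output file)."""
    with open(out_path, "w") as out:
        for chrom, pos in positions:
            out.write(f"{chrom}\t{pos}\n")
```

The resulting two-column file uses the tab-separated CHROM/POS layout that position-based filters in common VCF tools generally accept, so each single-sample VCF could then be subset to the shared sites before merging. Note that chromosomes are sorted as strings here, so the order may differ from the order in the VCF headers.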

sequencing SNP sequence • 2.5k views

Just a comment and a thought: I am doing something similar right now. I also called SNPs for each sample separately and merged the files afterwards using bcftools merge. I later realized that at a position where one sample has a SNP and another matches the reference, the second sample's VCF has no record at all, so the read-coverage information for that sample was already discarded during SNP calling. As a result, the merged data were skewed: each position carried only het and alternate-allele genotypes, and samples that matched the reference had no call at all rather than a homozygous-reference genotype.

8.7 years ago

You don't need VCF files of the same length to merge them.


Will the resulting file contain only shared positions? I can't have missing genotypes.

Thanks,
Steve


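You can use vcftools for that, with --max-missing or --max-missing-count (for example, --max-missing 1 or --max-missing-count 0 to keep only sites with no missing genotypes).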
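
To make the effect of such a filter concrete, here is a rough pure-Python sketch of dropping sites with missing genotypes from an already merged multi-sample VCF. It is not how vcftools is implemented; the function names are hypothetical, and it treats a genotype as missing if any allele is `.`, which is an assumption of this sketch and may differ from how vcftools counts partially missing calls.

```python
def missing_count(record_line):
    """Count samples whose GT contains a missing allele ('.') in one VCF data line."""
    fields = record_line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")  # FORMAT column, e.g. GT:DP
    if "GT" not in fmt_keys:
        return len(fields) - 9  # no genotypes at all: every sample counts as missing
    gt_index = fmt_keys.index("GT")
    missing = 0
    for sample in fields[9:]:
        parts = sample.split(":")
        # Trailing fields may be dropped, so a bare "." sample is handled here too.
        gt = parts[gt_index] if gt_index < len(parts) else "."
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:  # assumption: partially missing calls count as missing
            missing += 1
    return missing


def drop_sites_with_missing(lines, max_missing_count=0):
    """Yield header lines unchanged, plus data lines with at most
    max_missing_count missing genotypes."""
    for line in lines:
        if line.startswith("#") or missing_count(line) <= max_missing_count:
            yield line
```

With the default `max_missing_count=0`, only sites genotyped in every sample are kept, so the output contains just the positions shared by all samples.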


Great, thank you very much.
