An error occurred in bcftools view -S
19 months ago
yoser4 ▴ 10

Hello everyone

I have a VCF file with 99 samples. I want to split it into subsets: multiple VCF files, each containing the samples of one specific variety.

The code I use is as follows. I loop over the varieties; ${id} is the name of the variety, and each ${id}_sampleID.txt has a single column listing the sample names:

for f in *_sampleID.txt
do
        id="${f%_sampleID.txt}"
        bcftools view -S "$f" -Oz -o "${id}.vcf.gz" snps.vcf.gz
done

However, the output looks wrong. The original file has about 40 million lines. I expected the output file to have the same number of lines as the original and to differ only in the number of columns, but the output file I get has only about 10 million lines. I don't know what the problem is.
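The comparison described above can be made explicit by counting data records and sample columns in both files. This is a minimal sketch using only standard tools; `count_records` and `count_samples` are hypothetical helper names, and the sample count assumes the header has the usual nine fixed columns (including FORMAT) before the first sample.

```shell
# Count data records (non-header lines) in a gzip- or bgzip-compressed VCF.
count_records() {
    gzip -dc "$1" | grep -vc '^#'
}

# Count sample columns: fields after the 9 fixed columns on the #CHROM line.
# Assumption: the file has a FORMAT column followed by at least one sample.
count_samples() {
    gzip -dc "$1" | awk -F'\t' '/^#CHROM/ { print NF - 9; exit }'
}

# Hypothetical usage, with the file names from the loop above:
# count_records snps.vcf.gz
# count_records "${id}.vcf.gz"
# count_samples "${id}.vcf.gz"
```

Counting records rather than raw lines keeps header differences between the two files out of the comparison.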

Any help will be appreciated^-^

bcftools • 553 views
19 months ago

Isn't that because there are positions where your samples do not contain a variant, so those positions are not output?

The converse is also true. If you have two single-row VCF files with different positions, merging them gives two rows, and each sample would show no variant in one of them.
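One way to check whether positions without variants explain the missing lines is to look at what a subset file actually contains. The sketch below counts the records in which at least one sample carries a non-reference allele; `count_variant_records` is a hypothetical helper, and it assumes GT is the first FORMAT key, as the VCF specification requires when GT is present. If the result equals the file's total record count, the file holds only sites that vary in that variety, which would be consistent with the explanation above. If it is lower, something else is likely responsible.

```shell
# Count records where at least one sample genotype has a non-reference allele.
# Assumption: GT is the first colon-separated value in each sample column.
count_variant_records() {
    gzip -dc "$1" | awk -F'\t' '
        /^#/ { next }                      # skip header lines
        {
            for (i = 10; i <= NF; i++) {   # sample columns start at field 10
                split($i, fmt, ":")
                # any allele index 1 or higher marks a non-reference call;
                # 0 (reference) and . (missing) do not match
                if (fmt[1] ~ /[1-9]/) { n++; break }
            }
        }
        END { print n + 0 }'
}

# Hypothetical usage on one output file from the question:
# count_variant_records "${id}.vcf.gz"
```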

I also want to note that an output you don't fully understand is not necessarily an "error". Thinking of it as an error works against understanding it later.
