Bcftools unsorted positions?
4.1 years ago
vctrm67 ▴ 50

I am trying to sort a VCF file with bcftools so that I can use it with another software tool (MuSE). I ran bcftools sort file.vcf > newfile.vcf, which finished without errors, but the software tool then gives me this message: [E::hts_idx_push] Unsorted positions on sequence #24: 13474350 followed by 9950583. I'm not sure where these numbers come from: I ran grep 13474350 file.vcf and grep 9950583 file.vcf, and neither returns anything. Am I missing something?

bcftools • 4.7k views

Which version of bcftools are you using? Is there a dictionary ('##contig' lines in the header)?
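For anyone wanting to check the dictionary, here is a minimal sketch that lists the '##contig' entries in header order, which can help work out which contig "sequence #24" refers to. The function name list_contigs is made up for this example, it assumes ID= is the first key on each '##contig' line, and the numbering starts at 1; whether the number in the error message counts from 0 or 1 is not confirmed here, so look at the neighbouring entries too.

```shell
# List the ##contig header entries with a running index (starting at 1).
# Usage: list_contigs FILE   (or pipe a VCF in on stdin)
# Assumption: each line looks like ##contig=<ID=name,...> with ID first.
list_contigs() {
    awk '
        /^##contig=/ {
            id = $0
            sub(/^##contig=<ID=/, "", id)   # drop the prefix before the name
            sub(/[,>].*$/, "", id)          # drop everything after the name
            n++
            print n "\t" id
        }
        /^#CHROM/ { exit }                  # column header: end of the header block
    ' "$@"
}

# Example call, using the file name from the post above:
# list_contigs file.vcf
```

For a compressed file, something like gzip -dc file.vcf.gz | list_contigs should work as well.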

Try grepping for 13474351 and 9950584: VCF coordinates are 1-based while bcftools uses 0-based coordinates internally.
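A plain grep matches the digits anywhere on a line, so a stricter way to run this check is to compare against the POS column only. The sketch below does that for any number of candidate positions, so the reported values and their off-by-one neighbours can be tested in one pass. The function name find_pos is made up for this example, and it assumes an uncompressed, tab-separated VCF.

```shell
# Print data records whose POS column (field 2) equals one of the given positions.
# Usage: find_pos FILE POS [POS...]
find_pos() {
    vcf=$1
    shift
    # The remaining arguments are passed to awk as one space-separated list.
    awk -F'\t' -v wanted="$*" '
        BEGIN {
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) want[w[i]] = 1
        }
        /^#/ { next }                          # skip header lines
        ($2 in want) { print "line " NR ": " $0 }
    ' "$vcf"
}

# Example call, using the file name and positions from this thread:
# find_pos file.vcf 13474350 13474351 9950583 9950584
```

If this prints nothing for both the input and the sorted output, the positions in the message most likely come from a different file than the one being searched.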

@Pierre Yes there is a dictionary, and it's version 1.9.
@chrchang Tried it, but that didn't output anything either.

Ok, just took a quick look at the relevant htslib source code (hts.c line 1851 in the current develop branch) and it does convert back to 1-based coordinates when printing an error message.

You may need to post an example file and command which can be used to reproduce the error to get useful help at this point.

This is really odd. Why would bcftools sort throw an error that says that the input file is unsorted? Is it not the job of bcftools sort to do the sorting? I have a feeling that the file you're trying to sort and the file you're grep-ing are not the same file.

Are you piping stuff to bcftools sort, perhaps?

Sorry, I can see how the post is confusing. Edited for clarification.

Do you see any results when grep-ing for those positions in newfile.vcf? Does MuSE say anything else?
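One way to check newfile.vcf without relying on grep is to scan it for the first place where POS goes down within a contig, which is roughly the condition the error message describes. This is only a sketch with awk, not what htslib itself does, and the function name first_unsorted is made up for this example.

```shell
# Report the first record whose POS is lower than an earlier POS on the same contig.
# Usage: first_unsorted FILE   (or pipe a VCF in on stdin)
first_unsorted() {
    awk -F'\t' '
        /^#/ { next }                               # skip header lines
        ($1 in last) && ($2 + 0 < last[$1]) {
            print "line " NR ": " $1 ": " last[$1] " followed by " $2
            exit                                    # only the first problem is reported
        }
        { last[$1] = $2 + 0 }                       # remember the latest POS per contig
    ' "$@"
}

# Example calls, using the file names from this thread:
# first_unsorted newfile.vcf
# gzip -dc newfile.vcf.gz | first_unsorted
```

No output means no decreasing position was found. If the sorted file comes back clean, the tool is probably reading some other file.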

Did you get file.vcf by subsetting a different VCF file using a BED file, perhaps? If so, overlapping regions in the BED file could cause duplicate entries to appear at different places in the resulting VCF file.
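A quick way to test the duplicate theory is to look for records that share the same CHROM, POS, REF and ALT. The sketch below reports each repeat together with the line where the record first appeared. The function name dup_records and the choice of those four columns as the key are assumptions made for this example.

```shell
# Report data records that repeat an earlier record with the same CHROM, POS, REF and ALT.
# Usage: dup_records FILE   (or pipe a VCF in on stdin)
dup_records() {
    awk -F'\t' '
        /^#/ { next }                                # skip header lines
        { key = $1 FS $2 FS $4 FS $5 }               # CHROM, POS, REF, ALT
        (key in first) {
            print "line " NR " repeats line " first[key] ": " $1 " " $2 " " $4 " " $5
            next
        }
        { first[key] = NR }                          # remember where the record first appeared
    ' "$@"
}

# Example call, using the file name from this thread:
# dup_records file.vcf
```

If repeats show up far apart in the file, that would fit the overlapping-regions explanation.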
