10.5 years ago
Zev.Kronenberg
I love Tabix and have rolled it into a software library. Most of the data we are working with are de-novo assemblies that have more than 1000 seqids...
I found something that is worrying me:
$tabix->getnames() returns only 1000 seqids?
I wrote to the mailing list, but thought I would ask here as well.
http://sourceforge.net/p/samtools/mailman/message/31560407/
Thanks!
EDIT:
grep -v '^#' my.vcf | cut -f1 | sort -u | wc -l
1103
So I know I have more than 1000 seqids in the file...
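One way to narrow this down is to get the count from the file in a form that is easy to compare with what the index reports. A minimal shell sketch, assuming an uncompressed, tab-delimited VCF; count_seqids is a hypothetical helper name and not part of Tabix:

```shell
# Hypothetical helper: count the distinct seqids (column 1) in the data lines
# of an uncompressed, tab-delimited VCF. Lines starting with '#' are skipped.
count_seqids() {
    awk -F'\t' '!/^#/ && !seen[$1]++ { n++ } END { print n + 0 }' "$1"
}

# Usage (my.vcf is the file from the post):
#   count_seqids my.vcf
```

For comparison, running `tabix -l my.vcf.gz | wc -l` on the bgzipped, indexed copy should print the number of sequence names stored in the index, assuming the installed tabix supports `-l`. If that number matches the file but getnames() does not, the limit is more likely in the binding than in the index.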
Are you sure it's not an issue with line endings or format? It sounds simple, but that always throws off line/sequence counts.
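A quick way to rule out the line-ending question is to count lines that end in a carriage return. This is only a sketch, assuming an uncompressed text file; count_crlf_lines is a hypothetical helper name:

```shell
# Hypothetical helper: print how many lines of a text file end in a carriage
# return (Windows-style CRLF line endings). Prints 0 when there are none.
count_crlf_lines() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$1"
}

# Usage (my.vcf is the file from the question):
#   count_crlf_lines my.vcf
```

A result of 0 suggests plain Unix line endings, so the discrepancy would have to come from somewhere else.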