Question: tabix indexing on a non vcf/bed/sam txt file
18 days ago, PedroBarbosa (Braga, Portugal) wrote:

Hello,

I'm struggling to build a tabix index for a simple three-column, bgzipped text file:

#chr    pos     score
1       1       0.061011
1       2       0.061011
1       3       0.061011
...

Oddly, the indexing step is really fast (about 2 seconds) considering the file size (9Gb), and when I query a position I get no result and no warning. Has anyone faced a similar issue?

tabix -s1 -b2 file.txt.bgz

tabix file.txt.bgz 1:2-3 -> empty result

Thanks in advance,

Pedro


This works for me. Some troubleshooting questions:

  • Are there any messages during the index creation?
  • Is the file tab delimited?
  • Is the file sorted by the first and second column?
  • Is the file compressed by bgzip?
  • Have you tried your little example as well, or just your large data file?
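
A rough sketch of how some of these checks might be scripted. The function names are hypothetical, and the checks rest on assumptions: a three-column table with `#` comment lines (as in the question), GNU `sort` for the `-V` option, and the same version-sort chromosome order used later in the thread. The BGZF test only inspects the first block header and assumes the `BC` extra subfield comes first, which is how bgzip normally writes it.

```shell
# Hypothetical helpers; each takes the path of a gzip/bgzip-compressed table.

# Succeeds if every non-comment line has exactly three tab-separated fields.
is_three_col_tsv() {
    gzip -dc "$1" | awk -F'\t' '!/^#/ && NF != 3 { bad = 1; exit } END { exit bad }'
}

# Succeeds if data lines are ordered by column 1 (version sort) and then
# numerically by column 2. -s disables the whole-line tie-breaker so that
# lines with equal keys never count as out of order.
is_sorted() {
    gzip -dc "$1" | grep -v '^#' |
        LC_ALL=C sort -c -s -V -t "$(printf '\t')" -k1,1 -k2,2n 2>/dev/null
}

# Succeeds if the file starts with what looks like a BGZF block header:
# gzip magic with the FEXTRA flag set, and "BC" as the first extra subfield.
looks_like_bgzf() {
    bgzf_magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    bgzf_extra=$(dd if="$1" bs=1 skip=12 count=2 2>/dev/null)
    [ "$bgzf_magic" = "1f8b0804" ] && [ "$bgzf_extra" = "BC" ]
}
```

For example, `is_sorted file.txt.bgz && echo sorted` prints `sorted` only when the order check passes. On a 9Gb file each content check decompresses the whole file, so expect them to take a while.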

fin swimmer

written 18 days ago by finswimmer

Indeed, it worked for my little example. I'm now running a full sort on the large file (sort -V -k1,1 -k2,2) to see whether sort order was the problem. I wasn't expecting that, though, since I concatenated all the chromosome files with zcat in the proper order and, in theory, they were already sorted when I downloaded them.

Thanks for the suggestions, I'll let you know how it goes.

written 17 days ago by PedroBarbosa

@finswimmer, it didn't work, unfortunately.

These are my full commands; if you see any possible source of error, let me know. This "wrong" index takes 2 seconds to create, which has never happened to me before.

header_file=$(head -n1 $files) 
zcat $header_file | head -1 | cut -f1,2,3 | bgzip > fitcons_v1.01_header.txt.bgz
srun cat $files | xargs zcat | grep -v "^#" | sort -V -k1,1 -k2,2 | awk -v OFS='\t' '{print $1,$2,$3}' |  bgzip > fitcons_v1.01.txt.gz
srun cat fitcons_v1.01_header.txt.bgz fitcons_v1.01.txt.gz > fitcons_v1.01.txt.bgz
tabix -s 1 -b 2 fitcons_v1.01.txt.bgz
written 17 days ago by PedroBarbosa
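
One thing that may be worth ruling out, although the thread does not confirm it as the cause: the commands above compress the header and the body in two separate bgzip runs and then join the results with cat. Each bgzip run ends its output with an empty end-of-file block, so the joined file has such a block right after the header, and it is an assumption here that some tools stop reading at that point. A minimal sketch that avoids the question by sending the header and the sorted body through a single bgzip stream follows. The function name `build_table` is hypothetical, `$files` is assumed to be a text file listing the per-chromosome .gz paths as in the commands above, and `-e 2` simply sets the end column to the start column.

```shell
# Hypothetical helper: writes the header line followed by the sorted
# three-column body to stdout as plain, uncompressed text.
build_table() {
    list=$1
    first=$(head -n 1 "$list")
    # Header: first line of the first listed file, first three columns only.
    gzip -dc "$first" | head -n 1 | cut -f1-3
    # Body: all listed files, comment lines dropped, reduced to three
    # tab-separated columns, then sorted by chromosome and position.
    xargs gzip -dc < "$list" | grep -v '^#' |
        awk -v OFS='\t' '{ print $1, $2, $3 }' |
        LC_ALL=C sort -V -t "$(printf '\t')" -k1,1 -k2,2n
}

# Compress everything in one bgzip run, then index.
# Skipped unless bgzip and tabix are installed and $files points to a file.
if command -v bgzip >/dev/null && command -v tabix >/dev/null \
        && [ -n "${files:-}" ] && [ -f "$files" ]; then
    build_table "$files" | bgzip > fitcons_v1.01.txt.bgz
    tabix -s 1 -b 2 -e 2 fitcons_v1.01.txt.bgz
fi
```

If a query such as `tabix fitcons_v1.01.txt.bgz 1:2-3` still returns nothing after this, the concatenation was probably not the problem and the checks suggested earlier in the thread are the next place to look.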