Using tabix to read a tsv.bgz with genomic coordinates in a single column (<chromosome>:<loc>:<ref>:<alt>)
4.8 years ago

I'm trying to read some of the data for the UK Biobank Imputed GWAS seen here: https://docs.google.com/spreadsheets/d/1kvPoupSzsSFBNSztMzl04xMoSC3Kcx3CrjVf4yBmESU/edit?ts=5b5f17db#gid=227859291

The data comes as a TSV whose first column holds contig:location:ref:alt. Is there a simple way for tabix to consume and search this data? It seems like it should be relatively simple, or do I need to pipe the data through something like awk?

tabix • 4.2k views
4.8 years ago

According to the description, you have a CHROM name in the second column and a POS value in the third. If the file is sorted by chromosome and position, you can index it with tabix like this:

$ tabix -s 2 -b 3 -e 3 variants.tsv.bgz

If this succeeds, you can query the file with tabix in the usual way.
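
As a rough illustration of the whole workflow, here is a minimal sketch that builds a tiny demo file, indexes it, and runs a region query. The data, the file name demo.tsv.bgz, the function name tabix_demo, and the column layout (variant, chr, pos, ref, alt) are all made up for the example. It also assumes the file starts with a single header line, which is why -S 1 is added to skip it; drop that option if your file has no header.

```shell
# Build, index, and query a tiny demo file (hypothetical data and file names).
tabix_demo() {
    workdir=$(mktemp -d) || return 1
    file="$workdir/demo.tsv.bgz"

    # One header line plus three records, already sorted by chromosome and position.
    printf 'variant\tchr\tpos\tref\talt\n1:15000:A:G\t1\t15000\tA\tG\n1:25000:C:T\t1\t25000\tC\tT\n2:15000:G:A\t2\t15000\tG\tA\n' \
        | bgzip -c > "$file"

    # -s: sequence-name column, -b/-e: start/end column, -S 1: skip the header line.
    tabix -S 1 -s 2 -b 3 -e 3 "$file"

    # Region query: chromosome 1, positions 10000 to 20000 (1-based, inclusive).
    tabix "$file" 1:10000-20000

    rm -rf "$workdir"
}

# Run the demo only when both tools are installed.
if command -v tabix >/dev/null 2>&1 && command -v bgzip >/dev/null 2>&1; then
    tabix_demo
fi
```

With this demo data, the query should return only the record at position 15000 on chromosome 1. Against the real file, the equivalent query would look like tabix variants.tsv.bgz 1:10000-20000, assuming the chromosome column uses plain names such as 1 rather than chr1.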

fin swimmer
