How to extract postions from gnomad using tabix?
1
2
Entering edit mode
3.1 years ago
DareDevil ★ 4.3k

Hi I have list of positions around 2 lakhs and I want to map those regions in gnomad. I tested for one postion uisng tabix. But it extracts other positions as well.

tabix af-only-gnomad.hg38.vcf.gz chr1:10217-10217

output

chr1    10114   .       TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA CAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA,T       36729   PASS    AC=9,1;AF=0.0002246,2.496e-05
chr1    10211   .       AACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTAAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCAACCCCAACCCCAACCCCAACCCCAACCCCAACCCTAACCCCTAACCCTAACCCTAACCCT A       6206.06 PASS    AC=2;AF=5.319e-05
chr1    10217   .       A       T       20051.80        PASS    AC=2;AF=5.288e-05

but I need only following as ouput. Where am I doing wrong?

chr1    10217   .   A   T   20051.80    PASS    AC=2;AF=5.288e-05
tabix gnomad vcf • 1.7k views
ADD COMMENT
0
Entering edit mode

Can you check if tabix af-only-gnomad.hg38.vcf.gz "chr1:10217-10217" works.

ADD REPLY
0
Entering edit mode

I tried the same, but does not work

ADD REPLY
0
Entering edit mode

I just cross checked and this works smoothly. Can you again perform bgzip and tabix to check if indexing was performed accordingly.

ADD REPLY
0
Entering edit mode

I have done bgzip af-only-gnomad.hg38.vcf followed by tabix -p vcf af-only-gnomad.hg38.vcf.gz. Then,

tabix af-only-gnomad.hg38.vcf.gz "chr1:10217-10217"

output:

chr1    10114   .   TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA  CAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA,T    36729   PASS    AC=9,1;AF=0.0002246,2.496e-05
chr1    10211   .   AACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTAAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCAACCCCAACCCCAACCCCAACCCCAACCCCAACCCTAACCCCTAACCCTAACCCTAACCCT  A   6206.06 PASS    AC=2;AF=5.319e-05
chr1    10217   .   A   T   20051.80    PASS    AC=2;AF=5.288e-05
ADD REPLY
5
Entering edit mode
3.1 years ago

tabix is right

for example the 1st variant.

chr1    10114   .       TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA CAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAACCCTA,T       36729   PASS    AC=9,1;AF=0.0002246,2.496e-05

would overlap "chr1:10217-10217" because the first variant starts at 10114 and ends at 10114+length(REF) = 10398. So chr1:10114-10398 overlaps your query chr1:10217-10217

you could use bcftools view --types snps gnomad.vcf.gz chr1:10217-10217

ADD COMMENT
0
Entering edit mode

perfect !!!!... thanks :)

ADD REPLY
0
Entering edit mode

Dear Sir, Thankyou so much.

ADD REPLY

Login before adding your answer.

Traffic: 2512 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6