Annotating file using bcftools
14 days ago
kl ▴ 10

Hi all,

I am trying to annotate my imputed genotype file using bcftools and then convert it to PLINK format.

bcftools index -t ro_imputed_hrcgrch37.R2_0.3.vcf.gz
bcftools annotate \
-a $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf.gz \
-c ID $REF/All_20180423.vcf.gz \
--output-type z \
-o $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf_dbSNP151.vcf.gz

This seems to work but my samples are removed. Consequently, when I try to convert to binary plink files, it doesn't work because it says I have no samples. Can anyone give advice on what I've done wrong?

Many thanks

annotation plink bcftools • 311 views


14 days ago

I think you're annotating $REF/All_20180423.vcf.gz (that's dbSNP, isn't it? It has no genotypes) using your own VCF, ro_imputed_hrcgrch37.R2_0.3.vcf.gz, as the annotation database. You want the reverse: annotate your VCF with dbSNP.

bcftools annotate \
-a $REF/All_20180423.vcf.gz  \
-c ID $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf.gz \
--output-type z \
-o $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf_dbSNP151.vcf.gz
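
A quick way to confirm which file carries genotypes is to count the sample columns in the VCF header: columns 1 to 9 are fixed, and anything after them is a sample. The sketch below is only an illustration; the `count_samples` helper and the two tiny demo files are hypothetical, and it reads uncompressed VCF text.

```shell
# Count the sample columns in a VCF header line.
# Columns 1-9 are fixed (CHROM..FORMAT); every column after that is a sample.
count_samples() {
  awk -F'\t' '/^#CHROM/ { n = NF - 9; if (n < 0) n = 0; print n; exit }' "$1"
}

# Two tiny hypothetical VCF headers: one with genotypes, one sites-only.
demo_dir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' > "$demo_dir/with_samples.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$demo_dir/sites_only.vcf"

with_samples=$(count_samples "$demo_dir/with_samples.vcf")
sites_only=$(count_samples "$demo_dir/sites_only.vcf")
echo "with_samples.vcf: $with_samples sample(s)"
echo "sites_only.vcf: $sites_only sample(s)"
```

For a compressed file, the same function can read from `gzip -dc file.vcf.gz` written to a temporary file, or `bcftools query -l file.vcf.gz` will list the sample names directly. If the output VCF reports zero samples, the input and annotation files were probably swapped.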

Thanks, I corrected it, but it doesn't seem to annotate. I converted the result to PLINK binary format afterwards, and the output is shown below. It is not what I want: the second column should be the rsID extracted from the All_20180423.vcf.gz file, matched on chromosome, position and allele. I would appreciate any suggestions. I haven't used bcftools before.

22 22:51218224:C:A 0 51218224 A C
22 22:51218377:G:C 0 51218377 C G
22 22:51219006:G:A 0 51219006 A G
22 22:51219387:T:C 0 51219387 C T
22 22:51221190:G:A 0 51221190 A G
22 22:51221731:T:C 0 51221731 C T
22 22:51222100:G:T 0 51222100 T G
22 22:51223637:G:A 0 51223637 A G
22 22:51229805:T:C 0 51229805 C T
22 22:51237063:T:C 0 51237063 C T

Thanks
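
To check at a glance whether the annotation took effect, one option is to count how many IDs in the second column of the .bim file look like rsIDs. This is a minimal sketch: the `summarise_ids` helper is hypothetical, and the three input rows are copied from the output above.

```shell
# Summarise how many variant IDs (column 2 of a .bim file) are rsIDs.
summarise_ids() {
  awk '{ total++; if ($2 ~ /^rs[0-9]+$/) rs++ }
       END { printf "%d of %d IDs are rsIDs\n", rs, total }' "$1"
}

# First three rows of the .bim output quoted in this thread.
bim=$(mktemp)
cat > "$bim" <<'EOF'
22 22:51218224:C:A 0 51218224 A C
22 22:51218377:G:C 0 51218377 C G
22 22:51219006:G:A 0 51219006 A G
EOF

id_summary=$(summarise_ids "$bim")
echo "$id_summary"
```

A result of zero rsIDs, as here, suggests the ID column was never overwritten by the dbSNP annotation.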


It worked with this:

bcftools index -t $DATADIR/cpro_imputed_hrcgrch37.R2_0.3.vcf.gz

bcftools annotate  \
-a $RefGenomes/All_20180423.vcf.gz  \
-c CHROM,FROM,TO,ID $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf.gz \
--output-type z \
-o $DATADIR/ro_imputed_hrcgrch37.R2_0.3.vcf_dbSNP151.vcf.gz

Do you know if there is a way to keep chr:pos in the ID column when there is no matching rsID based on chromosome and position? The number of variants has been reduced by half.
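
One possible approach, sketched here with plain awk on uncompressed VCF text, is to fill in the ID only where it is missing (`.`) and leave existing rsIDs untouched. The `fill_missing_ids` helper, the demo file and the rsID `rs0000001` are all hypothetical. The bcftools manual also documents `bcftools annotate --set-id` with a leading `+` in the format string (for example `+'%CHROM:%POS'`), which is described as setting only missing IDs; that may be the more direct route on a compressed file.

```shell
# Fill in the ID column (column 3) with CHROM:POS where it is missing (".").
# Header lines and records that already have an ID are passed through unchanged.
fill_missing_ids() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/      { print; next }
       $3 == "." { $3 = $1 ":" $2 }
                 { print }' "$1"
}

# Hypothetical sites-only demo: one record with an (invented) rsID, one without.
vcf=$(mktemp)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '22\t51218224\trs0000001\tC\tA\t.\t.\t.\n'
  printf '22\t51218377\t.\tG\tC\t.\t.\t.\n'
} > "$vcf"

fill_missing_ids "$vcf"
```

It may also be worth comparing the number of records before and after each step to see where the variants are being lost, since filling in IDs does not by itself restore dropped records.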
