Query a single sample VCF for a list of rsid, do calculations and generate a personal genomics report
3.0 years ago
nico77 • 0

Hi, I need to generate a personal genomics report based on a list of rsIDs (mainly SNPs), starting from single-sample VCFs. Until now, I have manually searched for the genotype (0/0, for example) of every rsID, copied the value into an Excel file, derived the corresponding allele genotype (AA, for example), and applied some rules to produce results (for example: if the genotype is AA, low risk; if the genotype is AC, increased risk...). Now, I want to create some scripts to query the VCF for a list of rsid, have the corresponding genotypes, do some calculations like in Excel, then generate an output file that I can turn into a readable report.

I see there are solutions like scikit + numpy or pandas, or similar, or converting the VCF to Parquet and then using cloud solutions in Google Cloud, Amazon, or Azure... but I do not know if this is the best approach.

Do you have some ideas? Thanks!

personal query genomics VCF • 1.2k views
3.0 years ago

Now, I want to create some scripts to query the VCF for a list of rsid

https://samtools.github.io/bcftools/bcftools.html#expressions

bcftools view -i 'ID=@filelistofrsid.txt' input.vcf

have the corresponding genotypes

https://samtools.github.io/bcftools/bcftools.html#query

bcftools query -f '[%ID %TGT\n]' input.vcf
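The query above prints plain text, one line per variant and sample, with the ID followed by the translated genotype (for example A/C, or A|C when phased). As a minimal sketch of how that text could be read into a script, assuming exactly that two-column layout and SNPs only: the function name parse_tgt_lines and the rsIDs in the example are made up for illustration.

```python
def parse_tgt_lines(lines):
    """Map rsID -> unordered genotype string.

    'A/C' and 'C|A' both become 'AC'; a missing call ('./.' or '.') becomes None.
    Assumes single-base alleles (SNPs); indels would need a different encoding.
    """
    genotypes = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rsid, tgt = fields
        # Treat phased (|) and unphased (/) separators the same way.
        alleles = tgt.replace("|", "/").split("/")
        if "." in alleles:
            genotypes[rsid] = None  # no call at this site
            continue
        # Sort so that allele order does not matter (AC == CA).
        genotypes[rsid] = "".join(sorted(alleles))
    return genotypes


# Hypothetical lines in the shape produced by '[%ID %TGT\n]'.
example = ["rs0000001 A/C", "rs0000002 G|G", "rs0000003 ./."]
print(parse_tgt_lines(example))
```

In practice the lines would come from a file the bcftools output was redirected to, or from a pipe, rather than from a hard-coded list.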

do some calculations like in Excel, then generate an output file that I can turn into a readable report.

awk https://www.google.com/search?q=awk+tutorial
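awk is enough for simple lookups; the same rule step can also be sketched in Python, which is swapped in here only because the question mentions Python tooling. This is a rough illustration, not a validated interpretation pipeline: it assumes genotypes are already available as unordered two-letter strings such as AC, and the RULES table, the rsID, and the function names are hypothetical, with the risk labels taken from the example in the question.

```python
import csv
import io

# Hypothetical rule table: rsID -> genotype -> interpretation.
RULES = {
    "rs0000001": {"AA": "low risk", "AC": "increased risk"},
}


def build_report(genotypes, rules):
    """Return (rsid, genotype, result) rows for every rsID that has rules."""
    rows = []
    for rsid in sorted(rules):
        genotype = genotypes.get(rsid)
        if genotype is None:
            # rsID absent from the VCF output, or present with a missing call.
            result = "no call"
        else:
            result = rules[rsid].get(genotype, "no rule for this genotype")
        rows.append((rsid, genotype or "NA", result))
    return rows


def write_report(rows, handle):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["rsid", "genotype", "result"])
    writer.writerows(rows)


# Example with a made-up genotype; a real run would write to an open file.
buffer = io.StringIO()
write_report(build_report({"rs0000001": "AC"}, RULES), buffer)
print(buffer.getvalue())
```

A tab-separated file like this opens directly in Excel, so it can serve as the intermediate step before formatting a readable report.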


Thank you Pierre.

So, I have to create a text file with the list of rsIDs I want to query (filelistofrsid.txt), right? Then bcftools will search for the corresponding genotypes in my VCF (input.vcf), right? And what kind of file would the output be? Thanks!
