VCF to 23andMe format and changing the reference assembly: help needed understanding VCF
2.7 years ago

Hello, I am trying to convert my Nebula Genomics report to 23andMe format. I have two problems: Nebula uses human reference assembly 38 and 23andMe uses 37. I was thinking of writing a Python script, but I have some doubts:

My plan was to write the genotype in 23andMe format (the two copies of each allele) using the genotype given in the VCF (0/1, etc.). 0 means the reference allele and i means the i-th alternate allele, right?

But I don't understand this record, for example:

chr4 169311085 rs199775492 ATTT A,AT 1/2

The reference is ATTT, and the sample genotype would be A and AT, right?
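To make that reading concrete, here is a minimal sketch of how the GT indices can be resolved against REF and ALT. The function name `decode_genotype` is hypothetical, and the sketch assumes the REF, ALT and GT values have already been pulled out of the record as strings.

```python
def decode_genotype(ref, alt, gt):
    """Return the allele sequences named by a VCF GT field."""
    # Allele 0 is REF; alleles 1..n are the comma-separated ALT entries.
    alleles = [ref] + [a.strip() for a in alt.split(",")]
    # "/" separates unphased alleles, "|" separates phased ones.
    indices = gt.replace("|", "/").split("/")
    # "." marks a missing call; it is returned as None.
    return [None if i == "." else alleles[int(i)] for i in indices]


print(decode_genotype("ATTT", "A,AT", "1/2"))  # ['A', 'AT']
print(decode_genotype("ATTT", "A,AT", "0/1"))  # ['ATTT', 'A']
```

Under this reading, 1/2 for the record above gives A and AT, with neither copy matching the reference ATTT.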

But the 23andMe format does not allow that. As far as I understand, the 23andMe genotype would be DD, since both alleles are deletions. Am I understanding this correctly?
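One possible way to collapse allele sequences into a 23andMe-style call is sketched below. The function name `to_23andme_call` is hypothetical, and the indel rules are an assumption rather than a documented 23andMe specification: an allele shorter than REF is written as D, a longer one as I, the reference allele takes the opposite letter, and sites that mix insertions and deletions are written as a no-call (`--`).

```python
def to_23andme_call(ref, alts, called):
    """Collapse called allele sequences into a 23andMe-style genotype string."""
    if any(a is None for a in called):
        return "--"  # missing allele: emit a no-call
    bases = ("A", "C", "G", "T")
    if ref in bases and all(a in bases for a in alts):
        return "".join(called)  # plain SNP: write the bases as they are
    # Assumption: classify the site by comparing every ALT length to REF.
    shorter = all(len(a) < len(ref) for a in alts)
    longer = all(len(a) > len(ref) for a in alts)
    if not (shorter or longer):
        return "--"  # mixed or unsupported site: no-call (assumption)
    # Assumption: at a deletion site REF is "I" and ALT is "D"; reversed for insertions.
    ref_code, alt_code = ("I", "D") if shorter else ("D", "I")
    return "".join(ref_code if a == ref else alt_code for a in called)


print(to_23andme_call("ATTT", ["A", "AT"], ["A", "AT"]))  # DD
print(to_23andme_call("A", ["G"], ["A", "G"]))            # AG
```

With these rules the chr4 record above comes out as DD, because both called alleles are shorter than ATTT.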

Once that is done, to change the assembly I need to change all the IDs and physical coordinates, right? Where can I get such a map? I am thinking of using MongoDB or sqlite3; I don't know which database would be better.
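For the database part, a rough sketch of the sqlite3 option is below. It assumes a table mapping each rsID to its assembly 37 chromosome and position has already been obtained from some source. The table name `grch37`, its columns, and the function `format_line` are hypothetical, and the inserted row uses a placeholder position, not the real coordinate of that variant. The output line follows the tab-separated rsid, chromosome, position, genotype layout of 23andMe raw data files.

```python
import sqlite3

# In-memory database for illustration; a file path would be used for real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grch37 (rsid TEXT PRIMARY KEY, chrom TEXT, pos INTEGER)"
)
# Placeholder row: 12345 is NOT the real assembly 37 position of this variant.
conn.execute("INSERT INTO grch37 VALUES (?, ?, ?)", ("rs199775492", "4", 12345))


def format_line(conn, rsid, call):
    """Look up an rsID and return one tab-separated 23andMe-style line."""
    row = conn.execute(
        "SELECT chrom, pos FROM grch37 WHERE rsid = ?", (rsid,)
    ).fetchone()
    if row is None:
        return None  # rsID not in the map: skip this variant
    chrom, pos = row
    return f"{rsid}\t{chrom}\t{pos}\t{call}"


print(format_line(conn, "rs199775492", "DD"))
```

The primary key on `rsid` gives an indexed lookup per variant, which is all this step needs; sqlite3 also ships with Python, so nothing extra has to be installed.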

If there is software that does all of this for me I would be very happy, but I haven't found any. I did find this script:

https://github.com/2sh/vcf-to-23andme/blob/master/data_to_db.py

but I don't know why it does not work. It's pretty old, I guess; maybe it assumes the input is already on human assembly 37, but my data was aligned to 38 (I think, since my VCF file says ##reference=file:///mnt/ssd/MegaBOLT_scheduler/reference/hg38.fa).

Am I understanding things correctly?

23andMe VCF • 753 views