Hello, I am trying to convert my Nebula Genomics report to 23andMe format. I have two problems. Nebula uses human genome assembly build 38 and 23andMe uses build 37. I was thinking of writing a Python script, but I have some doubts:
My plan was to write the genotype in 23andMe format (the two copies of each allele) using the genotype given in the VCF (0/1, etc.). 0 means the reference allele and i means the i-th alternate allele, right?
But I don't understand, for example, this record:
chr4 169311085 rs199775492 ATTT A,AT 1/2
The reference is ATTT, and the sample genotype would be A and AT, right?
But the 23andMe format does not allow that. As far as I understand, the genotype in 23andMe format would be DD, right, since both alleles are deletions? Am I understanding this correctly?
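Here is a minimal sketch of what I have in mind for this step. The function names are my own, and the D/I rule is only my assumption: an allele shorter than REF is written as D, a longer one as I, and anything I am not sure how to code (a missing call, a symbolic allele, or the reference allele at an indel site) becomes the no-call "--".

```python
def gt_to_alleles(ref, alt, gt):
    """Return the allele sequences that a VCF GT string such as '1/2' points at."""
    alleles = [ref] + alt.split(",")  # index 0 = REF, index i = i-th ALT
    result = []
    for index in gt.replace("|", "/").split("/"):
        # '.' marks a missing call
        result.append(None if index == "." else alleles[int(index)])
    return result


def to_23andme_call(ref, alt, gt):
    """Hypothetical helper: write the genotype as bases, or as D/I for indels.

    Assumption (mine, not confirmed by 23andMe): shorter than REF -> 'D',
    longer than REF -> 'I', anything else at an indel site -> '--'.
    """
    alleles = gt_to_alleles(ref, alt, gt)
    # Missing calls and symbolic alleles such as '*' or '<DEL>' become no-calls.
    if any(a is None or not set(a) <= set("ACGT") for a in alleles):
        return "--"
    # Plain SNV site: the bases are written as they are.
    if len(ref) == 1 and all(len(a) == 1 for a in alleles):
        return "".join(alleles)
    codes = []
    for allele in alleles:
        if len(allele) < len(ref):
            codes.append("D")
        elif len(allele) > len(ref):
            codes.append("I")
        else:
            # Same length as REF at an indel site: I do not know how to code it.
            return "--"
    return "".join(codes)


print(to_23andme_call("ATTT", "A,AT", "1/2"))
```

For the record above this gives DD, which is what I expected, but I am not sure this is how 23andMe really codes such sites.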
After that is done, to change the assembly I need to change all the IDs and physical coordinates, right? Where can I get such a map? I am thinking of using MongoDB or sqlite3, but I don't know which database would be better.
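If I go with sqlite3, I imagine the lookup would be something like the sketch below. The table name, its columns, and the row in it are placeholders I made up (the position is not a real coordinate); the real rows would have to come from whatever map I end up finding.

```python
import sqlite3


def build_lookup(rows):
    """Create an in-memory table mapping an rsID to a build 37 chromosome and position."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE build37 ("
        "rsid TEXT PRIMARY KEY, chrom TEXT NOT NULL, pos INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO build37 VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def lookup_build37(conn, rsid):
    """Return (chrom, pos) for the rsID, or None if it is not in the table."""
    cursor = conn.execute(
        "SELECT chrom, pos FROM build37 WHERE rsid = ?", (rsid,)
    )
    return cursor.fetchone()


# Placeholder row: 'rs_example' and 12345 are invented for illustration only.
conn = build_lookup([("rs_example", "4", 12345)])
print(lookup_build37(conn, "rs_example"))
print(lookup_build37(conn, "rs_missing"))
```

With the rsID as primary key each lookup is a single indexed query, so I guess sqlite3 would be enough for one file, but I don't know if that holds for a whole genome.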
If there is some software that does all of this for me I would be very happy, but I haven't found any. I only found this script:
https://github.com/2sh/vcf-to-23andme/blob/master/data_to_db.py
but I don't know why it does not work. It is pretty old, so I guess it may assume that my data is already on assembly build 37, whereas mine was aligned to build 38 (I think, since my VCF file says ##reference=file:///mnt/ssd/MegaBOLT_scheduler/reference/hg38.fa).
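This is roughly how I checked which assembly my file uses. It is a small sketch with function names of my own, and guessing the build from the file name in the header is only a heuristic on my part.

```python
def find_reference(lines):
    """Return the value of the ##reference= header line, or None if there is none."""
    for line in lines:
        if not line.startswith("##"):
            break  # the '##' meta lines are over
        if line.startswith("##reference="):
            return line.strip()[len("##reference="):]
    return None


def looks_like_build38(reference):
    """Heuristic (my assumption): the reference name mentions hg38 or GRCh38."""
    if reference is None:
        return False
    name = reference.lower()
    return "hg38" in name or "grch38" in name


header = [
    "##fileformat=VCFv4.2\n",
    "##reference=file:///mnt/ssd/MegaBOLT_scheduler/reference/hg38.fa\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n",
]
print(looks_like_build38(find_reference(header)))
```

On a real file I would pass the open file object instead of the list (using gzip.open in "rt" mode if the VCF is compressed).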
Am I understanding things correctly?