Question

Quicker way to know if an SNV is involved in a mismatch alignment

3.5 years ago

MarVi ▴ 30

Dear all,

I would like to ask for some advice. I have a collection of alignments, and I want to know whether the mismatches between the reads and the reference sequence (genome) are due to SNVs. I have all the SNVs recorded and stored per chromosome in Python dictionaries. However, loading the dictionary (cPickle) for the chromosome currently being examined takes a long time. Do you have any suggestions on how to make it faster, in Python, to look up whether there is an SNV at a given position on a chromosome?

Thanks in advance! Hope everyone is fine!

python cPickle Dictionary mismatch alignment • 694 views

updated 3.5 years ago by JC 13k • written 3.5 years ago by MarVi ▴ 30

score 0 · Answer 1 · 2020-10-14

There are multiple options:

Convert your SNVs to a VCF file, sort it and index it with Tabix; you can then query it from Python with pyvcf or similar packages.
Save your data in a real database (postgres, mysql, mongodb, ...) and use a db connector in Python.
Reorganize your data: one table per chromosome is not optimal, as many chromosomes are large.
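
To illustrate the database option, here is a minimal sketch. It uses SQLite through Python's standard `sqlite3` module rather than one of the servers named above, simply because it needs no installation; the same idea should carry over to postgres or mysql with their own connectors. The table layout, the function names (`build_snv_db`, `snvs_at`, `mismatch_is_snv`) and the example records are all assumptions made up for this illustration, not something taken from the question. The point is that an indexed lookup reads only the rows for one position, so nothing like a whole-chromosome dictionary has to be loaded first.

```python
import sqlite3


def build_snv_db(snvs, path=":memory:"):
    """Store SNVs in SQLite. `snvs` yields (chrom, pos, ref, alt) tuples.

    The table layout is hypothetical. The primary key on (chrom, pos, alt)
    doubles as the index used by the position lookups below.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snv ("
        "chrom TEXT NOT NULL, pos INTEGER NOT NULL, "
        "ref TEXT NOT NULL, alt TEXT NOT NULL, "
        "PRIMARY KEY (chrom, pos, alt))"
    )
    # INSERT OR IGNORE silently skips records that are already stored.
    conn.executemany("INSERT OR IGNORE INTO snv VALUES (?, ?, ?, ?)", snvs)
    conn.commit()
    return conn


def snvs_at(conn, chrom, pos):
    """Return the (ref, alt) pairs recorded at chrom:pos (empty list if none)."""
    cur = conn.execute(
        "SELECT ref, alt FROM snv WHERE chrom = ? AND pos = ? ORDER BY alt",
        (chrom, pos),
    )
    return cur.fetchall()


def mismatch_is_snv(conn, chrom, pos, read_base):
    """True if the mismatching read base equals a known alt allele at chrom:pos."""
    return any(alt == read_base for _ref, alt in snvs_at(conn, chrom, pos))


if __name__ == "__main__":
    # Made-up records, for illustration only.
    example = [
        ("chr1", 12345, "A", "G"),
        ("chr1", 12345, "A", "T"),
        ("chr2", 500, "C", "T"),
    ]
    conn = build_snv_db(example)
    print(mismatch_is_snv(conn, "chr1", 12345, "G"))
    print(mismatch_is_snv(conn, "chr1", 12346, "G"))
```

Passing a file name instead of `":memory:"` keeps the database on disk, so it is built once and later runs only open it and query. Whether positions are 0-based or 1-based is up to you, as long as the stored SNVs and the alignment coordinates agree.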