Determine unique SNPs

8.7 years ago

slimane.khayi ▴ 80

Dear colleagues, I have a tab-delimited file in this format:

#CHROM          POS   REF  a1   a2   b1   b2   c1   c2
NW_008246507.1  16    T    C/C  C/C  T/T  C/C  C/C  T/C
NW_008246507.1  1624  A    C/C  C/C  C/C  C/C  C/C  C/C
NW_008246507.1  1656  C    T/T  T/T  T/T  T/T  T/T  T/T
NW_008246507.1  1666  C    T/T  T/T  T/T  T/T  T/T  T/T
NW_008246507.1  1679  C    T/T  T/T  T/T  T/T  T/T  T/T
NW_008246507.1  1681  G    A/A  A/A  A/A  A/A  A/A  A/A
NW_008246507.1  1682  T    A/A  A/A  A/A  A/A  A/A  A/A
NW_008246507.1  1695  T    C/C  C/C  C/C  C/C  C/C  C/C

I want to identify the SNPs that are unique to each species a, b, and c (not to each strain a1, a2, b1, ...). Do you have a Python script or any idea how to do this? I am not familiar with scripting languages. Thank you in advance for your help. Sincerely.

SNP python samtools vcftools mapping • 2.0k views

ADD COMMENT • link updated 8.7 years ago by Pierre Lindenbaum 166k • written 8.7 years ago by slimane.khayi ▴ 80

Could you please clarify what a unique SNP is in your example data? I find it difficult to see (and I think you are missing a couple of line breaks, as there are currently two positions per line). Would T/C for species c at NW_008246507.1:16 be what you are looking for?

ADD REPLY • link 8.7 years ago by cschu181 ★ 2.8k

Here is a capture from my file

ADD REPLY • link 8.7 years ago by slimane.khayi ▴ 80
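Since the thread never settles what "unique" means, here is only a rough Python sketch under one possible reading: a genotype counts as unique to a species when all strains of that species carry it and no strain of any other species does. It also assumes that the species name is the sample name with trailing digits removed (a1, a2 -> a), that the first three columns are #CHROM, POS and REF, and that T/C and C/T are the same genotype. The function names and the demo rows are made up for illustration; they are not from the original file.

```python
import io
from collections import defaultdict


def species_of(sample):
    """Assumed naming scheme: strain 'a1' belongs to species 'a'."""
    return sample.rstrip("0123456789")


def normalise(genotype):
    """Treat 'T/C' and 'C/T' as the same genotype (an assumption)."""
    return "/".join(sorted(genotype.split("/")))


def unique_snps(handle):
    """Yield (chrom, pos, species, genotype) for species-specific genotypes."""
    header = next(handle).split()
    samples = header[3:]  # columns after #CHROM, POS, REF
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, pos = fields[0], int(fields[1])
        # Collect the set of genotypes seen in each species.
        by_species = defaultdict(set)
        for sample, genotype in zip(samples, fields[3:]):
            by_species[species_of(sample)].add(normalise(genotype))
        for species, genotypes in by_species.items():
            if len(genotypes) != 1:
                continue  # strains of this species disagree
            (genotype,) = genotypes
            # Genotypes seen in every other species at this position.
            others = set()
            for other, other_genotypes in by_species.items():
                if other != species:
                    others |= other_genotypes
            if genotype not in others:
                yield chrom, pos, species, genotype


# Invented demo rows (not taken from the original file).
DEMO = """#CHROM POS REF a1 a2 b1 b2 c1 c2
chr1 16 T C/C C/C T/T C/C C/C T/C
chr1 20 A G/G G/G A/A A/A A/A A/A
chr1 30 C T/T T/T T/T T/T G/G G/G
"""

for hit in unique_snps(io.StringIO(DEMO)):
    print(*hit, sep="\t")
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` demo. With this definition, none of the positions shown in the question would be reported, because at position 16 no species has a genotype that the others lack, and at the remaining positions all strains are identical. If "unique" should mean something else (for example, comparing alleles rather than whole genotypes), the test inside the loop would need to change.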

