                    10.0 years ago
        bingnas
        
    
    Hi all
I called SNPs for six individuals separately, so I have six SNP files. I want to merge them by chromosome and position, so that I can then convert the genotypes to the integers 0, 1, 2.
The question is:
Could anyone please help me merge them as follows?
REF is hg19, ALT1 is the first patient, ALT2 the second patient, and so on up to ALT6, the sixth patient.
#CHROM   POS     REF   ALT1   ALT2   ALT3   ALT4   ALT5   ALT6
chrM     3       T     C      G      A      C      T      C
chrM     4       C     A      C      T      A      G      C
chrM     150     T     C      T      C      C      G      A
chrM     195     C     T      C      T      C      A      T
chrM     410     A     T      T      C      C      T      C
chrM     711     G     A      C      T      T      G      G
chrM     1890    G     .      C      T      C      A      C
chrM     2354    C     T      T      C      A      G      C
chrM     2485    C     T      A      G      G      A      C,T
chrM     3457    T     C      G      A      G      A      C
chrM     4162    C     T      T      A      T      C,A    A
chrM     4217    T     C      G      T      A      G      T
chrM     4918    A     G      C      .      G      A      A
chrM     5581    C     T      G      A      A      G      .
chrM     8698    G     A      G      A      A      C      A
chrM     8702    G     A      G      C      G      C      A
chrM     9378    G     A      C      T      G      A      C
chrM     9541    C     T      C      T      C      T      C
chrM     10284   A     G      G      A      A      C      C
chrM     10399   G     A      G      A      A      G      T
chrM     10464   T     C      C      G      T      C      G
chrM     10820   G     A      G      T      .      C      A
chrM     10874   C     T      G      T      G      C,T    G
chrM     11018   C     T      C      T      A      C      C
chrM     11252   A     G      .      C      G      A      T
chrM     11723   C     T      .      A      C      T      T
chrM     11813   A     G      G      A      C      A      C
Is that possible? I used periods because someone told me they should be there when the position is present.
Thank you in advance
Bing
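A minimal sketch of the merge described above, in Python. It assumes each patient's calls have already been read into a dict keyed by (chromosome, position) with a (REF, ALT) value; the function name `merge_calls` and that input layout are hypothetical, not something taken from the files in this thread. A patient with no call at a site gets a period, as in the table.

```python
def merge_calls(per_patient):
    """Merge per-patient calls into rows of (CHROM, POS, REF, ALT1, ALT2, ...).

    per_patient: list of dicts, one per patient, mapping
    (chrom, pos) -> (ref, alt). This layout is an assumption.
    """
    # Collect the union of all sites, remembering the REF allele of each.
    ref_by_site = {}
    for calls in per_patient:
        for site, (ref, _alt) in calls.items():
            ref_by_site.setdefault(site, ref)

    rows = []
    for site in sorted(ref_by_site):
        chrom, pos = site
        # "." marks a patient that has no call at this site.
        alts = [calls[site][1] if site in calls else "." for calls in per_patient]
        rows.append((chrom, pos, ref_by_site[site], *alts))
    return rows


if __name__ == "__main__":
    patient1 = {("chrM", 3): ("T", "C")}
    patient2 = {("chrM", 3): ("T", "G"), ("chrM", 1890): ("G", "C")}
    for row in merge_calls([patient1, patient2]):
        print("\t".join(str(field) for field in row))
```

Sorting on (chromosome, position) tuples keeps positions in numeric order as long as they are stored as integers rather than strings.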
If I understand correctly, you want to recode SNPs from ACTG to 0,1,2?
You can use plink. First convert the VCF into plink format, then run plink --recode12. If you are more comfortable working with VCF, you can convert it back to VCF again.
Thank you stolarek for your answer; yes, you got what I want. I will try it.
Bing
Hi ebrown1955,
Thank you very much for your great answer. I would like to show you what I got from the first command (CombineVariants):
and from the second command (VariantsToTable):
Could you please tell me what I should do now? I would like to code dominant homozygous as 2, recessive homozygous as 0, and heterozygous as 1.
Thank you
Bing
You could write a Python program to do this for you. You'll have to parse each line one by one, split each genotype on "/", and check whether it's homozygous or heterozygous. I have a script that tells whether a genotype includes the alternative allele, and it can be modified to do what you'd like.
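The parsing step could look roughly like the sketch below. It is not the script mentioned above; it assumes genotypes are written as two alleles separated by "/" (for example "T/C"), and the name `encode_genotype` and the `missing` parameter are hypothetical. Which allele counts as the "dominant" one is left to the caller: the function simply counts copies of the allele you pass in.

```python
def encode_genotype(genotype, counted_allele, missing=None):
    """Return how many copies of counted_allele the genotype carries.

    2 = homozygous for counted_allele, 1 = heterozygous,
    0 = homozygous for another allele. Genotypes that are not
    two called alleles (e.g. "./.") return the `missing` value.
    """
    # Treat phased ("|") and unphased ("/") separators the same way.
    alleles = genotype.strip().replace("|", "/").split("/")
    if len(alleles) != 2 or any(a in (".", "") for a in alleles):
        return missing
    return sum(allele == counted_allele for allele in alleles)


if __name__ == "__main__":
    for gt in ["C/C", "T/C", "T/T", "./."]:
        print(gt, encode_genotype(gt, counted_allele="C"))
```

Pass a different `missing` value if your regression software expects a specific code for uncalled genotypes.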
Thank you ebrown1955 for your help.
Yes please, I would like to see that code if you do not mind!
To be honest, I am not familiar with bioinformatics. This is my first time dealing with SNP data, and I would like to convert the data to 0, 1, 2 and 5 so that I can use regression analysis.
Bing