Question: Convert SNP data to 0,1,2 and 5
bingnas (United States) wrote, 2.6 years ago:

Hi there,

I am looking to hire someone at a reasonable price.

I have BAM files for 22 human subjects, mapped with Bowtie2 against hg19.

1- I want SNP data vs. the reference genome (i.e. hg19) for these samples.
2- Convert the SNP genotypes to 0, 1, 2 and 5, where 0 is recessive homozygous, 2 is dominant homozygous, 1 is heterozygous, and 5 is missing.
3- Merge these 22 subjects into a matrix as follows:

Chromosome  Position  Reference  Subject1  Subject2  ...  Subject22
Chr1        335453    A          0         2         ...  0
Chr1        336565    G          1         5         ...  2
...
Chr22       3546372   C          1         0         ...  1


Thanks


I assume you mean 0: homozygous reference, 1: heterozygous variant, and 2: homozygous variant. Dominant and recessive don't make sense at the variant level. A variant can have a dominant or recessive effect on a phenotype, but that is not a variant state.

The job you are asking for is quite easy.
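
To make the encoding above concrete, here is a minimal Python sketch. It assumes genotypes arrive as VCF-style GT strings such as `0/1`, and it assumes that any non-reference allele counts as "variant"; the function name `encode_genotype` and the constant `MISSING` are made up for illustration, and the code 5 for missing comes from the question.

```python
MISSING = 5  # code requested in the question for missing genotypes


def encode_genotype(gt: str) -> int:
    """Map a VCF GT string to 0, 1, 2, or MISSING.

    Counts non-reference alleles in a diploid call, so any ALT allele
    (1, 2, ...) counts as "variant". This convention is an assumption.
    """
    # VCF separates alleles with "/" (unphased) or "|" (phased).
    alleles = gt.replace("|", "/").split("/")
    # Treat no-calls (".") and non-diploid calls as missing.
    if len(alleles) != 2 or any(not a.isdigit() for a in alleles):
        return MISSING
    return sum(1 for a in alleles if a != "0")


if __name__ == "__main__":
    for gt in ["0/0", "0/1", "1|1", "./.", "1/2"]:
        print(gt, encode_genotype(gt))
```

Note that a multi-allelic call such as `1/2` is coded as 2 here; whether that is what you want depends on the downstream analysis.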

written 2.6 years ago by WouterDeCoster

Thank you, WouterDeCoster, for your answer! Could you please help me with how to do it, or could I send you the data?

Bing

written 2.6 years ago by bingnas

I assume this is whole-exome or whole-genome sequencing data. The GATK Best Practices are a well-documented and commonly accepted way of doing data processing and variant calling. After variant calling you will have VCF files, which can be converted to the numerical output you ask for (PLINK format, right?) using vcftools:

    ./vcftools --vcf input_data.vcf --plink --chr 1 --out output_in_plink
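
For reference, the vcftools manual also lists a `--012` output option that writes genotypes as 0/1/2 (with -1 for missing), whereas `--plink` writes PED/MAP files with allele letters. If you would rather build the exact table from the question yourself, the sketch below shows one way to do it in Python. It assumes a single multi-sample, tab-delimited VCF with diploid GT fields; the names `vcf_to_matrix`, `alt_count` and `EXAMPLE_VCF` are hypothetical, and the example records are invented purely to show the layout.

```python
import io

MISSING = 5  # code for missing genotypes, as requested in the question


def alt_count(sample_field: str) -> int:
    """Return the number of non-reference alleles in a diploid GT, or MISSING."""
    gt = sample_field.split(":")[0]  # GT is the first FORMAT key when present
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or not all(a.isdigit() for a in alleles):
        return MISSING
    return sum(a != "0" for a in alleles)


def vcf_to_matrix(handle):
    """Yield a header row, then one row per variant:
    [chrom, pos, ref, code_sample1, code_sample2, ...]."""
    for line in handle:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed; sample names start at column 9.
            yield ["Chromosome", "Position", "Reference"] + fields[9:]
            continue
        chrom, pos, ref = fields[0], fields[1], fields[3]
        yield [chrom, pos, ref] + [alt_count(s) for s in fields[9:]]


# Invented two-sample example, only to illustrate the input format.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSubject1\tSubject2\n"
    "chr1\t335453\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:12\t1/1:9\n"
    "chr1\t336565\t.\tG\tT\t50\tPASS\t.\tGT:DP\t0/1:15\t./.:0\n"
)

if __name__ == "__main__":
    for row in vcf_to_matrix(io.StringIO(EXAMPLE_VCF)):
        print("\t".join(str(x) for x in row))
```

One caveat with this approach: if the 22 samples are called separately and merged afterwards, a site that is absent from one sample's VCF cannot be told apart from a homozygous-reference call, so joint calling (or emitting all sites) is the safer route.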

written 2.6 years ago by WouterDeCoster

Yes, I want it in something like PLINK format. I see you put --chr 1; do you mean I should convert the data chromosome by chromosome? In other words, can I convert all chromosomes at once?

I will try it and let you know how it goes!

Thank you for your help

written 2.6 years ago by bingnas

According to https://vcftools.github.io/man_latest.html (see SITE FILTERING OPTIONS), --chr is just a filter that includes or excludes sites on a given chromosome. The command I posted is only an example I copy-pasted from the documentation, so it's probably not an essential argument; leaving it out should process all chromosomes.

written 2.6 years ago by WouterDeCoster

Jorge Amigo (Santiago de Compostela, Spain) wrote, 2.5 years ago:

You are trying to create an input file for PLINK, but all you need to do is perform variant calling on your samples and give the resulting VCF files directly to PLINK, since recent PLINK versions accept VCF files natively.
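
As a rough illustration of that route: PLINK 1.9 can read a VCF with `--vcf` and write an additive `.raw` file with `--recode A`, which holds 0/1/2 allele counts per subject and `NA` for missing calls. The Python sketch below turns text in that layout into a variants-by-subjects table with 5 for missing. The function name `raw_to_matrix` and the example content (subject and variant IDs) are invented for illustration. Keep in mind that, as far as I know, `.raw` counts the allele named in the column suffix, which by default may be the minor allele instead of the non-reference one.

```python
MISSING = 5  # code for missing genotypes, as requested in the question

# Invented example in the layout of a PLINK additive .raw file:
# six fixed columns, then one column per variant.
RAW_EXAMPLE = """\
FID IID PAT MAT SEX PHENOTYPE chr1:335453_G chr1:336565_T
S1 S1 0 0 0 -9 0 1
S2 S2 0 0 0 -9 2 NA
"""


def raw_to_matrix(text):
    """Transpose .raw-style text into rows of [variant, code per subject]."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, samples = lines[0], lines[1:]
    variants = header[6:]  # skip FID IID PAT MAT SEX PHENOTYPE
    rows = [["Variant"] + [s[1] for s in samples]]  # use IID as subject label
    for j, name in enumerate(variants):
        codes = [MISSING if s[6 + j] == "NA" else int(s[6 + j]) for s in samples]
        rows.append([name] + codes)
    return rows


if __name__ == "__main__":
    for row in raw_to_matrix(RAW_EXAMPLE):
        print("\t".join(str(x) for x in row))
```

The chromosome, position and reference columns from the question are not in the `.raw` file itself; they would have to be joined in from the accompanying variant information.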
