Question: Tool to check what alternate allele is dominant across samples per line of the VCF file.
halo22 (Indianapolis, IN) wrote, 5 months ago:

Hello All,

I am very new to WGS analysis. I have a multi-sample VCF file that I have annotated using SnpEff. I want to find out which alternate alleles are shared between samples at each genomic location. For example, at Chr1 position 1001 the reference allele is A, the alternate alleles seen are T, AAT, TTT and AA, and there are 10 samples. I want to count how many samples have TTT, how many have AA, and so on. This way I can see which allele is dominant across samples at each position.

All help is appreciated

Thanks.

next-gen wgs
Carambakaracho (Germany/Cologne) wrote, 5 months ago:

I have had very good experiences with the R package vcfR (on CRAN).

In case you're new to R too, and this is a one-off project, there's nothing wrong to just use Excel and multiple text-to-column operations to split the data (provided your machine is powerful enough to handle it). It's a bit tedious, but the learning curve is less steep.


"there's nothing wrong to just use Excel"

[image not preserved]

written 5 months ago by Pierre Lindenbaum

:-D damn it, I got the Excel shame AND didn't realise the thread was 20 days old.

@halo22 you'd better not use my Excel advice and try to hire with Pierre, I guess.

@pierre or anyone, as this is most likely irrelevant for the OP anyway: does Biostars support strikethrough markdown?

Cheers

written 5 months ago by Carambakaracho

Thank you, guys! I wrote my own to get this done.

written 5 months ago by halo22
chrchang523 (United States) wrote, 5 months ago:

With plink 2.0,

plink2 --vcf <VCF path> --freq counts

gets this information for you. (Remove 'counts' if you want proportions instead.)
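
For anyone who would rather see what such a count amounts to, here is a minimal Python sketch. It is not how plink computes its output, only an illustration. It assumes a tab-separated VCF data line with a FORMAT column that contains GT, and the function name `count_alleles` and the toy genotypes are made up for the example (they reuse the Chr1:1001 site from the question, with 10 samples). It returns two tallies: how many times each allele was observed, and how many samples carry each allele at least once.

```python
from collections import Counter

def count_alleles(vcf_line):
    """Tally alleles for one VCF data line (hypothetical helper).

    Returns (allele_counts, carrier_counts):
      allele_counts  - number of times each allele is observed across samples
      carrier_counts - number of samples carrying each allele at least once
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    alleles = [ref] + alt.split(",")          # index 0 = REF, 1.. = ALT alleles
    gt_index = fields[8].split(":").index("GT")  # assumes GT is in FORMAT

    allele_counts = Counter()
    carrier_counts = Counter()
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        # Treat phased (|) and unphased (/) calls alike; skip missing calls (.)
        calls = [c for c in gt.replace("|", "/").split("/") if c != "."]
        seen = set()
        for c in calls:
            allele = alleles[int(c)]
            allele_counts[allele] += 1
            seen.add(allele)
        carrier_counts.update(seen)           # each sample counted once per allele
    return allele_counts, carrier_counts

# Toy data line: genotypes are invented for illustration only.
line = "\t".join([
    "Chr1", "1001", ".", "A", "T,AAT,TTT,AA", ".", "PASS", ".", "GT",
    "0/1", "1/1", "0/3", "3/3", "./.", "0/4", "2/3", "0/0", "3/4", "1/3",
])
allele_counts, carrier_counts = count_alleles(line)

# Most frequent ALT allele at this site (REF excluded).
alt_alleles = line.split("\t")[4].split(",")
dominant_alt = max(alt_alleles, key=lambda a: allele_counts[a])
print(dominant_alt, allele_counts[dominant_alt], carrier_counts[dominant_alt])
```

On the toy line the most frequent alternate allele is TTT, observed 6 times in 5 samples. To run this over a whole file, call the function on every line that does not start with `#`.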
