Question: How to get consensus sequences from vcf file for hetero SNP?
3.8 years ago, agata88 (Poland) wrote:

Hi again,

How can I split a VCF file into two files, or call two consensus sequences from a VCF file, for heterozygous SNPs?

Best,

Agata

vcf consensus • 1.4k views
modified 3.7 years ago by Biostar • written 3.8 years ago by agata88

How do you plan to phase your variants per allele?

written 3.8 years ago by WouterDeCoster

For example, I have two SNPs in my VCF: one is hom TT and the other is het GA. I would like to get two consensus sequences, one with TG and the second with TA. Is there a tool that can do something like this?

written 3.7 years ago by agata88

Per @WouterDeCoster's question, what if the first is het AT and the second is het GA - how would you want to report it?

written 3.7 years ago by harold.smith.tarheel

With hom TT, het AT and another het GA, I would like to report TAG and TTA, assuming that TAG comes from one read and TTA from another.

written 3.7 years ago by agata88
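
The output described above can be sketched in a few lines of Python, assuming the phase is already known, i.e. the genotypes are phased (`0|1` rather than `0/1`). The function name `phased_consensus`, the tuple layout, and the three-base reference `CAG` are hypothetical and chosen only to reproduce the TAG / TTA example; only biallelic SNPs are handled.

```python
def phased_consensus(reference, variants):
    """Return two haplotype sequences built from phased, biallelic SNPs.

    reference: reference sequence as a string.
    variants: iterable of (pos, ref, alt, gt), with 1-based pos and a
    phased genotype string such as "0|1", "1|0" or "1|1".
    """
    haplotypes = [list(reference), list(reference)]
    for pos, ref, alt, gt in variants:
        # Unphased genotypes ("0/1") cannot be assigned to a haplotype.
        if "|" not in gt:
            raise ValueError(f"genotype at position {pos} is not phased: {gt}")
        if reference[pos - 1] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        # First allele goes to haplotype 1, second allele to haplotype 2.
        for hap, allele in zip(haplotypes, gt.split("|")):
            if allele == "1":
                hap[pos - 1] = alt
    return "".join(haplotypes[0]), "".join(haplotypes[1])


# Assumed toy data: hom TT, het A/T, het G/A on a made-up reference "CAG".
example_variants = [
    (1, "C", "T", "1|1"),
    (2, "A", "T", "0|1"),
    (3, "G", "A", "0|1"),
]
hap1, hap2 = phased_consensus("CAG", example_variants)
print(hap1, hap2)
```

With unphased genotypes the same two het sites would be equally compatible with TAG/TTA and TAA/TTG, which is why the question of phasing comes first.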

So the tool should also look into the BAM file to find out which reads support which variant calls? That makes the story more difficult, as you can imagine. What if there is no evidence in the BAM to derive phase (no reads spanning both sites) between position 1 (GA) and position 2 (TC)? How would you report that?

written 3.7 years ago by WouterDeCoster
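
To make the spanning-reads problem concrete, here is a rough sketch of read-backed phasing for one pair of het SNPs. It assumes each read has already been reduced to a dict of position to observed base (in practice these would be extracted from the BAM); `phase_pair`, the positions 10 and 250, and the toy reads are all hypothetical. It returns `None` when no read covers both sites or when the evidence is tied.

```python
from collections import Counter


def phase_pair(reads, site1, site2):
    """Phase two heterozygous SNPs using reads that cover both positions.

    reads: list of dicts mapping 1-based position -> observed base.
    site1, site2: (position, allele_a, allele_b) tuples.
    Returns the two haplotypes as two-base strings, or None if unresolved.
    """
    pos1, a1, b1 = site1
    pos2, a2, b2 = site2
    counts = Counter()
    for read in reads:
        # Only reads spanning both sites carry phase information.
        if pos1 in read and pos2 in read:
            counts[(read[pos1], read[pos2])] += 1
    cis = counts[(a1, a2)] + counts[(b1, b2)]
    trans = counts[(a1, b2)] + counts[(b1, a2)]
    if cis == trans:
        # Covers both "no spanning reads" (0 == 0) and conflicting evidence.
        return None
    if cis > trans:
        return (a1 + a2, b1 + b2)
    return (a1 + b2, b1 + a2)


site_ga = (10, "G", "A")   # position 1 in the comment above (het G/A)
site_tc = (250, "T", "C")  # position 2 in the comment above (het T/C)

spanning = [{10: "G", 250: "T"}, {10: "A", 250: "C"}, {10: "G"}]
print(phase_pair(spanning, site_ga, site_tc))

not_spanning = [{10: "G"}, {250: "C"}]
print(phase_pair(not_spanning, site_ga, site_tc))
```

The `None` case is exactly the situation raised above: without a read (or read pair) covering both sites, the VCF alone cannot say which alleles belong together.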

I would discard those reads from analysis ...

written 3.7 years ago by agata88

This is getting weird. What do you really want to do, I mean, what's the goal of the analysis? Why do you want to split a vcf file without biological meaning?

written 3.7 years ago by WouterDeCoster

I would like to perform HLA typing ... and wanted to follow this article:

https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-14-355

I might be wrong; I am getting confused here. They are using two "original own perl scripts", and I am trying to write something equivalent. I have amplicons for whole genes, not only for HLA but also for KIR. I tested a few "ready to use" pipelines and obtained different results from each :/ So I am trying to figure out the best solution.

written 3.7 years ago by agata88

Right, so you performed long-range PCR of your targets of interest, followed by NGS library prep. It would have made things much clearer if you had stated that in your original post. Have you tried asking the authors for the Perl scripts? In my opinion it is unacceptable not to publish such an important part of their work.

If this project is something you want to continue, you might want to consider getting a MinION and sequencing the long-range PCR products directly, without shearing.

written 3.7 years ago by WouterDeCoster