Question: Dendrogram on SNPs from VCF file?
popescuiofelia (London, UK) wrote 8 months ago:

Hi! I tried to make a dendrogram from a VCF file that contains SNP data for 20 samples. First, I tried SNPRelate, but it excluded all of the SNPs:

    SNP pruning based on LD:
    Excluding 0 SNP on non-autosomes
    Excluding 78,187 SNPs (monomorphic: TRUE, MAF: 0.1, missing rate: 0.1)
    Working space: 20 samples, 0 SNP
        using 1 (CPU) core
        sliding window: 500,000 basepairs, Inf SNPs
        |LD| threshold: 1
        method: composite
    0 markers are selected in total.

Then I tried the SNPhylo package, but it is also based on SNPRelate and I get the same result. Does anyone know a quick way to get a dendrogram from a VCF file?

I have also tried converting it to a FASTA file with GATK and running FastTree, but I get this error:

    FastTree Version 2.1.10 Double precision (No SSE3)
    Alignment: trial.fasta
    Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
    Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
    TopHits: 1.00*sqrtN close=default refresh=0.80
    ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
    Wrong number of characters for 1: expected 8477918 but have 55489 instead.
    This sequence may be truncated, or another sequence may be too long.

Also, I don't mind using FASTA alignment files, but can anyone explain how I can get a tree from whole-genome sequence files (I have BAM files)?

Thanks


With a VCF file, you could load it into PLINK and perform IBS clustering. The resulting output can then be read into R and used to draw a tree. https://www.cog-genomics.org/plink/1.9/strat

written 7 months ago by Kevin Blighe
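
To make the IBS-clustering idea in the comment above concrete, here is a minimal, standard-library Python sketch. It is not PLINK and does not reproduce PLINK's output: it only illustrates the same general approach of computing a pairwise identity-by-state distance between samples and clustering on it. The function names (`read_genotypes`, `ibs_distances`, `upgma_newick`) and the tiny demo VCF are made up for illustration. The sketch assumes diploid, biallelic SNPs with `GT` as the first FORMAT field, skips multi-allelic sites, and uses UPGMA (average linkage) as the clustering method, which is one choice among several.

```python
import io
import itertools

def read_genotypes(handle):
    """Return (samples, rows); each row holds ALT-allele counts (0, 1, 2) or None if missing."""
    samples, rows = [], []
    for line in handle:
        if line.startswith("##") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]          # sample columns start after FORMAT
            continue
        if "," in fields[4]:              # assumption: ignore multi-allelic sites
            continue
        row = []
        for entry in fields[9:]:
            gt = entry.split(":")[0].replace("|", "/").split("/")
            row.append(None if "." in gt else sum(int(a) for a in gt))
        rows.append(row)
    return samples, rows

def ibs_distances(rows, n):
    """Distance = mean(|g_i - g_j| / 2) over sites genotyped in both samples."""
    dist = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        diffs = [abs(r[i] - r[j]) / 2 for r in rows
                 if r[i] is not None and r[j] is not None]
        dist[i][j] = dist[j][i] = sum(diffs) / len(diffs) if diffs else 0.0
    return dist

def upgma_newick(names, dist):
    """UPGMA clustering; returns the topology only (no branch lengths) as Newick."""
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (newick, size)
    d = {(i, j): dist[i][j]
         for i, j in itertools.combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)          # closest pair of clusters
        (na, sa), (nb, sb) = clusters.pop(a), clusters.pop(b)
        del d[(a, b)]
        for c in clusters:                # size-weighted average distance to the merged cluster
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (sa * dac + sb * dbc) / (sa + sb)
        clusters[next_id] = (f"({na},{nb})", sa + sb)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"

# Hypothetical three-sample VCF, used only to exercise the functions.
demo_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/0\t1/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/1\t0/1\t1/1",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t0/0\t./.\t1/1",
])
samples, rows = read_genotypes(io.StringIO(demo_vcf))
dist = ibs_distances(rows, len(samples))
tree = upgma_newick(samples, dist)
print(tree)  # prints (S3,(S1,S2));
```

For a real file, the same functions could be called on `open("your.vcf")` (a placeholder path), and the Newick string written to a file for any tree viewer. With tens of thousands of SNPs and 20 samples this pure-Python version should still be workable, but a dedicated tool is likely the better option for larger data sets.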
sankar2004 wrote 5 months ago:

You can use the simple one-click software VCF2POPTREE directly on a VCF file containing genotypes from multiple individuals or populations.


Please host your software on GitHub or a similar Git website. Custom domains with just skeletal download links and cloud storage services are not advisable as there is no way we can check out your software without risking our machines.

Also, you can edit your posts. Do not add new answers when the right thing to do is editing your existing answer.

written 5 months ago by RamRS