Make FASTA sequences of a specific length from a VCF file
6.3 years ago

Hi all,

I have a multi-sample VCF file covering 350 loci spread across the genome, from 85 individuals. I ultimately want to use it to build phylogenetic trees. For this, I first want to make FASTA sequences of 100 kb length for each of the 350 loci for each of the 85 individuals, similar to what GATK FastaAlternateReferenceMaker does. However, I don't know how to specify the sequence length in this tool. Is it even possible to do this with this tool, or with any other tool? I would appreciate suggestions on how to handle this.

Thank you.

GATK FastaReference VCF SNP Phylogenetic tree • 1.9k views

You could make the FASTA files and then trim them to the length you want. 100 kb sounds long if you are going to build phylogenetic trees.
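As a rough illustration of the trimming idea, here is a minimal Python sketch. The function name `trim_fasta` and the example records are made up for illustration. It simply keeps the first `max_len` bases of each record, which only makes sense if each record already starts where you want the fragment to start.

```python
def trim_fasta(lines, max_len):
    """Yield (header, sequence) pairs, each sequence cut to at most max_len bases.

    `lines` is any iterable of FASTA lines (e.g. an open file handle).
    Hypothetical helper, not part of GATK or any other tool.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, trimmed.
            if header is not None:
                yield header, "".join(chunks)[:max_len]
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    # Emit the last record.
    if header is not None:
        yield header, "".join(chunks)[:max_len]


# Toy input; real sequences would be far longer.
example = [">locus1", "ACGTAC", "GTAC", ">locus2", "TTGA"]
trimmed = list(trim_fasta(example, 8))
```

Records shorter than `max_len` are passed through unchanged, so it is worth checking that all sequences end up the same length before aligning them.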

6.3 years ago

I don't know if your approach for phylogenetic trees with fragments like that makes sense, but I'll leave that to others.

How I would handle your project:

  1. Use the VCF and GATK FastaAlternateReferenceMaker to make a FASTA file per individual
  2. Make a BED file with one 100 kb interval per locus
  3. Use bedtools getfasta to slice your 100 kb fragments out of the FASTA files
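
Step 2 above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names, the idea of centring each window on one representative position per locus, and the example coordinates and chromosome sizes are all made up for illustration. VCF positions are 1-based, while BED intervals are 0-based and half-open, so the code converts between the two.

```python
def locus_windows(loci, chrom_sizes, window=100_000):
    """Return BED rows (chrom, start, end, name), one window per locus.

    loci: iterable of (chrom, pos) with 1-based positions, as in a VCF.
    chrom_sizes: dict mapping chromosome name to its length in bases.
    Windows are centred on pos where possible and shifted inward at
    chromosome ends so they stay `window` bases long.
    Hypothetical helper for illustration only.
    """
    half = window // 2
    rows = []
    for chrom, pos in loci:
        size = chrom_sizes[chrom]
        start = max(0, pos - 1 - half)   # 0-based start, clipped at 0
        end = min(size, start + window)  # clipped at the chromosome end
        start = max(0, end - window)     # shift left if the end was clipped
        rows.append((chrom, start, end, f"{chrom}_{pos}"))
    return rows


def bed_text(rows):
    """Format rows as tab-separated BED lines."""
    return "".join("\t".join(str(field) for field in row) + "\n" for row in rows)


# Made-up loci and chromosome sizes.
sizes = {"chr1": 1_000_000, "chr2": 500_000}
rows = locus_windows([("chr1", 250_000), ("chr1", 10_000), ("chr2", 490_000)], sizes)
bed = bed_text(rows)
```

The resulting text can be written to a file (say `loci.bed`) and used for step 3, for example `bedtools getfasta -fi sample.fasta -bed loci.bed -fo sample_loci.fasta`, where the file names are placeholders. One caveat: if the per-individual FASTA includes indels from the VCF, its coordinates may no longer match the reference, and the contig names in the generated FASTA may differ from those in the VCF, so check both before slicing.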