Question: Make fasta sequences of a specific length from a multi-sample VCF file
21 months ago by
United States
shreyasibiswas88 wrote:

Hi all,

I have a multi-sample VCF file of 350 loci spread across the genome from 85 individuals. I want to use this to ultimately build phylogenetic trees. For this I first want to make fasta sequences of 100 kb length for each of the 350 loci for each of the 85 individuals, similar to what GATK FastaAlternateReferenceMaker does. I don't know how to specify the sequence length in this tool, though. Is it even possible to do this with this tool or any other tool? I would appreciate suggestions on how to handle this.

Thank you.


You could make the fasta files and then trim them to be of the length you want. 100kb sounds long if you are going to try and build phylogenetic trees.
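To illustrate the trimming idea, here is a minimal Python sketch. It assumes plain FASTA text and that "trimming" means keeping the first N bases of each record, which may not be what you want if your locus sits in the middle of the sequence. The function names `read_fasta` and `trim_records` are made up for this example and do not come from any library.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_records(records, length):
    """Truncate each sequence to at most `length` bases, keeping the start.

    Assumption: trimming from the 5' end onward is acceptable.
    """
    return [(header, seq[:length]) for header, seq in records]


if __name__ == "__main__":
    example = [">sample1", "ACGTACGT", "ACGT", ">sample2", "AC"]
    for header, seq in trim_records(read_fasta(example), 10):
        print(f">{header}\n{seq}")
```

Sequences shorter than the requested length are left as they are, so you would still need to check that all records end up the same length before building an alignment.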

written 21 months ago by genomax
21 months ago by
Belgium
WouterDeCoster wrote:

I don't know if your approach for phylogenetic trees with fragments like that makes sense, but I'll leave that to others.

How I would handle your project:

  1. Use the VCF and GATK FastaAlternateReferenceMaker to make a fasta file per individual
  2. Make a bed file with one 100 kb interval per locus
  3. Use bedtools getfasta to slice out your 100 kb fragments from the fasta files
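
Step 2 could be sketched roughly like this in Python. This is only an illustration under some assumptions: each locus is described by a name, a chromosome and a single 1-based position, the 100 kb window is centered on that position, and the chromosome lengths are known. The names `window_around` and `loci_to_bed_lines`, and the example coordinates, are hypothetical. BED coordinates are 0-based and half-open, so the position is converted accordingly.

```python
def window_around(pos, chrom_len, length=100_000):
    """Return a 0-based, half-open (start, end) window around a 1-based position.

    The window is shifted (not shortened) when it would run past either
    end of the chromosome; it only shrinks if the chromosome is shorter
    than `length`.
    """
    length = min(length, chrom_len)
    start = (pos - 1) - length // 2
    start = max(0, min(start, chrom_len - length))
    return start, start + length


def loci_to_bed_lines(loci, chrom_sizes, length=100_000):
    """Build tab-separated BED lines (chrom, start, end, name) for each locus."""
    lines = []
    for name, chrom, pos in loci:
        start, end = window_around(pos, chrom_sizes[chrom], length)
        lines.append(f"{chrom}\t{start}\t{end}\t{name}")
    return lines


if __name__ == "__main__":
    # Hypothetical chromosome size and loci, for illustration only.
    chrom_sizes = {"chr1": 1_000_000}
    loci = [("locus1", "chr1", 500_000), ("locus2", "chr1", 10)]
    print("\n".join(loci_to_bed_lines(loci, chrom_sizes)))
```

The resulting lines can be written to a file and given to bedtools getfasta through its `-bed` option, with the per-individual fasta from step 1 as `-fi`.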