Could someone suggest a tool for manipulating FASTA files, calculating distances, and building phylogenetic trees in R?
My specific task is the following. I have a VCF recoded as a multi-FASTA file: each header corresponds to an individual, and each SNP is represented by one nucleotide. If a position was not read it is coded as N, and a heterozygous site is coded with an IUPAC ambiguity code (R, M, S, etc.). All sequences have the same length, so the FASTA is effectively already "aligned".
I would like to load the FASTA as a data frame, with individuals as row names and one nucleotide per column cell, so that I can operate on the positions: for example, remove all heterozygous SNPs, or all positions containing N. After that, I would like to calculate distances between the samples (trying different methods) and build an NJ tree with bootstrap support.
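To make the intended workflow concrete, here is a rough sketch of what I have in mind using ape. It is not a working solution I have validated on my data. The toy sequences, the temporary file, the choice of the raw p-distance (`model = "raw"`), and `B = 100` bootstrap replicates are all assumptions made for illustration.

```r
library(ape)

# Toy multi-FASTA written to a temporary file (hypothetical data;
# replace `fasta_file` with the path to the real file).
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">ind1", "ACGTACGTRA",
             ">ind2", "ACGTTCGTAA",
             ">ind3", "ANGATCGAAA",
             ">ind4", "TCGATCCAAM"), fasta_file)

# Read as a character matrix: rows = individuals, columns = SNP positions.
# A matrix is returned only when all sequences have the same length.
aln <- read.dna(fasta_file, format = "fasta", as.character = TRUE)
aln <- tolower(aln)  # make sure the codes are lower case before comparing

# Optional: a data frame view, individuals as row names.
snp_df <- as.data.frame(aln, stringsAsFactors = FALSE)

# Keep only positions where every individual has a, c, g or t,
# i.e. drop columns containing N or any heterozygous (ambiguity) code.
keep <- apply(aln, 2, function(site) all(site %in% c("a", "c", "g", "t")))
aln_clean <- aln[, keep, drop = FALSE]

# Back to DNAbin, then distance matrix and NJ tree.
dna <- as.DNAbin(aln_clean)
build_tree <- function(x) nj(dist.dna(x, model = "raw"))
tree <- build_tree(dna)

# Bootstrap: resample sites and rebuild the tree B times.
bs <- boot.phylo(tree, dna, build_tree, B = 100)

plot(tree, type = "unrooted")
nodelabels(bs)
```

Other distance methods could be tried by changing the `model` argument of `dist.dna`. If positions with N were kept instead of removed, `pairwise.deletion = TRUE` would presumably be needed there.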
I tried to do this with ape/phangorn but have had no success so far (I tried to load the FASTA as a data frame to operate on it, but failed). Maybe my idea is totally wrong and I should choose another tool or approach. If somebody could suggest some tutorials, I would be grateful.