My question concerns the processing of quite a few files in FASTA or BED format in order to do some conservation analysis.
Basically, I have about 50 human sequence files which span the entire genome, grouped by some features. My plan is to do some comparative genomics against other vertebrates (for now I have dog and fugu in mind).
My general idea was to map my intervals to those species using the UCSC liftOver tool. I was warned that this may cause problems, since it was designed to transfer between human assemblies, but for now this is my first idea and I will take the results with a grain of salt. After using liftOver I planned to use Clustal Omega to do multiple alignments between the sequences.
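To make the plan concrete, here is a rough sketch of how I would batch the two steps from Python. It only builds the command lines, and runs them when dry_run is turned off. The directory layout, the output file names and the helper names (liftover_cmd, clustalo_cmd, run_all) are my own placeholders. The chain file has to match the actual source and target assemblies, and the lowered -minMatch value is an assumption for cross-species mapping, not something I have validated. Extracting the sequences for the lifted intervals before running Clustal Omega is a separate step that is not shown.

```python
from pathlib import Path
import subprocess


def liftover_cmd(bed, chain, out_dir, min_match=0.1):
    """Build the liftOver command line for one BED file (does not run it)."""
    bed, out_dir = Path(bed), Path(out_dir)
    lifted = out_dir / f"{bed.stem}.lifted.bed"      # placeholder naming scheme
    unmapped = out_dir / f"{bed.stem}.unmapped.bed"  # intervals that failed to map
    # Usage: liftOver [options] oldFile map.chain newFile unMapped
    # min_match=0.1 is an assumed, relaxed threshold (the tool's default is 0.95).
    return ["liftOver", f"-minMatch={min_match}",
            str(bed), str(chain), str(lifted), str(unmapped)]


def clustalo_cmd(fasta, out_dir):
    """Build the Clustal Omega command line for one multi-sequence FASTA file."""
    fasta, out_dir = Path(fasta), Path(out_dir)
    aligned = out_dir / f"{fasta.stem}.aln"
    # --force overwrites an existing output file
    return ["clustalo", "-i", str(fasta), "-o", str(aligned),
            "--outfmt=clu", "--force"]


def run_all(bed_dir, chain, out_dir, dry_run=True):
    """Apply liftOver to every *.bed file in bed_dir; print only if dry_run."""
    out_dir = Path(out_dir)
    cmds = [liftover_cmd(bed, chain, out_dir)
            for bed in sorted(Path(bed_dir).glob("*.bed"))]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)  # raises if liftOver exits non-zero
    return cmds
```

For example, run_all("beds", "human_to_dog.over.chain.gz", "lifted_dog") would print one liftOver command per BED file without executing anything. The chain file name there is made up.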
The UCSC Genome Browser actually provides the information I want for all my BED intervals / FASTA files when I enter an interval "by hand", but since I have quite a lot of data this is of course not practical.
My question therefore is: is my liftOver approach actually useful, or is there a pipeline or hands-on tutorial somebody could point me to for the analysis of this kind of data? I am somewhat stuck at this point. Any help would be greatly appreciated.