Question

Batch conservation analysis

7.2 years ago

chrys ▴ 60

Hello Biostars,

My question concerns processing a fairly large number of files in FASTA or BED format for conservation analysis.

Basically, I have about 50 human sequence files that span the entire genome, grouped by certain features. My plan is to run a comparative genomics analysis against other vertebrates (for now I have dog and fugu in mind).

My general idea was to lift my intervals over to those species with the UCSC LiftOver tool, but I was warned that this may cause problems, since the tool was designed to transfer coordinates between human assemblies. For now, while taking the results with a grain of salt, this is my first idea. After running liftOver, I planned to use Clustal Omega to build multiple alignments of the sequences.
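For reference, the batch liftOver step could be scripted roughly like this. It is only a sketch: it assumes the UCSC `liftOver` binary is on the PATH and that a human-to-target chain file has already been downloaded. The helper names, the output file naming and the directory layout are made up for illustration. A lowered `-minMatch` (0.1 here) is the setting usually suggested for cross-species lifts, but the right value for this data is an open question.

```python
import subprocess
from pathlib import Path


def liftover_command(bed_path, chain_path, out_dir, min_match=0.1):
    """Build the liftOver argument list for one BED file.

    Output names (<stem>.lifted.bed / <stem>.unmapped.bed) are an
    arbitrary convention chosen for this sketch.
    """
    bed = Path(bed_path)
    out = Path(out_dir)
    mapped = out / f"{bed.stem}.lifted.bed"
    unmapped = out / f"{bed.stem}.unmapped.bed"
    return [
        "liftOver",
        f"-minMatch={min_match}",  # fraction of bases that must remap
        str(bed),
        str(chain_path),
        str(mapped),
        str(unmapped),
    ]


def run_batch(bed_dir, chain_path, out_dir, min_match=0.1):
    """Run liftOver on every *.bed file in bed_dir (hypothetical layout)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for bed in sorted(Path(bed_dir).glob("*.bed")):
        cmd = liftover_command(bed, chain_path, out_dir, min_match)
        subprocess.run(cmd, check=True)  # raises if liftOver fails
```

Intervals that fail to map end up in the `.unmapped.bed` files, so counting lines there gives a quick impression of how much is lost per species.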

The UCSC Genome Browser actually provides the information I want for all my BED intervals / FASTA files when I enter an interval "by hand", but since I have quite a lot of data, this is of course not practical.

My question is therefore: is my LiftOver approach actually useful, or is there a pipeline or hands-on tutorial somebody could point me to for analysing this kind of data? I am somewhat stuck at this point. Any help would be greatly appreciated.

Thank you,

conservation batch fasta genome • 1.6k views

Have you tried downloading the relevant conservation tracks from the UCSC Table Browser? You can download the track for the entire genome and intersect it with your regions of interest.
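A minimal sketch of what the intersection could look like, assuming the conservation track has been exported or converted to bedGraph-like rows (chrom, start, end, score) that do not overlap each other. The function name and the tuple layout are illustrative and not part of any UCSC tool. It scans every scored interval for every region, which is fine for a quick check; for genome-wide data a dedicated tool such as bedtools would be the more practical choice.

```python
from collections import defaultdict


def mean_scores(regions, scored):
    """Length-weighted mean conservation score per region.

    regions: iterable of (chrom, start, end)
    scored:  iterable of (chrom, start, end, score)
    Coordinates are 0-based, half-open, as in BED.
    Returns a list of (chrom, start, end, covered_bases, mean_or_None).
    """
    by_chrom = defaultdict(list)
    for chrom, s, e, score in scored:
        by_chrom[chrom].append((s, e, score))

    results = []
    for chrom, start, end in regions:
        total = 0.0
        covered = 0
        for s, e, score in by_chrom.get(chrom, []):
            overlap = min(end, e) - max(start, s)
            if overlap > 0:
                total += overlap * score
                covered += overlap
        # None marks regions with no scored bases at all
        mean = total / covered if covered else None
        results.append((chrom, start, end, covered, mean))
    return results
```

Reporting `covered_bases` next to the mean makes it visible when a region is only partly covered by the track.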

7.2 years ago by Asaf 10k

I just downloaded them in *.maf format and will try your proposed approach. I am just not quite sure how to handle the *.maf format together with BED/FASTA files. Maybe you have another tip? I already have the entire genome downloaded for hg19.
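To make the MAF/BED relationship concrete, here is a rough sketch, assuming the UCSC multiple alignment format: each block starts with an `a` line, followed by `s` lines with the fields src, start, size, strand, srcSize and text, and hg19 is the first `s` row of every block, on the + strand. The function names and the `ref_assembly` parameter are made up for illustration. It lists which other assemblies have aligned sequence in blocks overlapping one BED interval.

```python
def read_maf_blocks(lines):
    """Yield each alignment block as a list of (src, start, size, strand)."""
    block = []
    for line in lines:
        line = line.strip()
        if line == "a" or line.startswith("a "):
            if block:
                yield block
            block = []
        elif line.startswith("s "):
            _, src, start, size, strand, _src_size, _text = line.split()
            block.append((src, int(start), int(size), strand))
    if block:
        yield block


def species_in_region(lines, chrom, start, end, ref_assembly="hg19"):
    """Assemblies aligned in blocks overlapping chrom:start-end.

    Coordinates are 0-based, half-open (BED style). Assumes the
    reference row comes first in each block and is on the + strand.
    """
    found = set()
    ref_src = f"{ref_assembly}.{chrom}"
    for block in read_maf_blocks(lines):
        src, b_start, size, _strand = block[0]
        if src != ref_src:
            continue
        if min(end, b_start + size) - max(start, b_start) > 0:
            for other_src, *_ in block[1:]:
                found.add(other_src.split(".", 1)[0])
    return sorted(found)
```

With an open file handle this would be called as `species_in_region(handle, "chr1", 1000, 2000)`. Whole-genome MAF files are large, so in practice one would stream them once and test all intervals per block rather than re-reading the file for each interval.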

Thanks for your help!

7.2 years ago by chrys ▴ 60

The easiest way would probably be to convert the MAF to VCF and then use vcftools to intersect it with your BED files. See this thread for conversion options: converting maf to vcf.

7.2 years ago by Asaf 10k