Sequence Conservation/Multiple alignment
10.2 years ago
Diana ▴ 910

Hi all!!

I have histone ChIP-seq data for chicken (assembly galGal4), from which I'm trying to identify active/repressed enhancers of genes. Right now the criterion for selecting enhancers is to take regions within 50 kb of a gene that are flanked by H3K27ac peaks. I'm also interested in sequence conservation, so that we can prioritize regions that are conserved across different species. UCSC has no sequence conservation information for galGal4, although it does for older assemblies. Is it possible to get such information and load it as tracks, so it can be viewed in a browser like UCSC or IGB alongside the ChIP-seq peaks?

Thanks a lot!!!

alignment • 4.5k views
10.2 years ago

A shot in the dark: the conservation track tells you how well a specific gene segment or stretch of DNA is conserved among the species in question. So, if galGal3 has a conservation track, you should simply be able to liftOver that and use it for your purposes until a new homology/conservation analysis is done.

http://moma.ki.au.dk/genome-mirror/cgi-bin/hgTrackUi?db=anoCar2&g=cons7way

EDIT

Liftover of the conservation track (multiz7way) from galGal3 to galGal4

multiz7way: This track shows a measure of evolutionary conservation in 7 vertebrates, including bird, mammalian, amphibian, and fish species, based on a phylogenetic hidden Markov model, phastCons (Siepel et al., 2005). Multiz alignments of the following assemblies were used to generate this track:

chicken (May 2006 (WUGSC 2.1/galGal3), galGal3)
human (Mar 2006, hg18)
mouse (Feb 2006, mm8)
rat (Nov 2004, rn4)
opossum (Jan 2006, monDom4)
X. tropicalis (Aug 2005, xenTro2)
zebrafish (Mar 2006, danRer4)

Download the multiz7way table for galGal3 from the UCSC download server:

http://hgdownload.cse.ucsc.edu/goldenPath/galGal3/database/multiz7way.txt.gz

Gunzip it and keep columns 2-7 (dropping the leading bin column) to get a BED-like file:

gunzip -c multiz7way.txt.gz | cut -f2-7 > multiz7way.bed

Download the liftOver chain file,

http://hgdownload.cse.ucsc.edu/goldenPath/galGal3/liftOver/galGal3ToGalGal4.over.chain.gz

Download the liftOver utility for your system architecture and make it executable (chmod +x liftOver): http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/liftOver

Then execute,

liftOver multiz7way.bed galGal3ToGalGal4.over.chain.gz multiz7way_mapped.bed multiz7way_unmapped.bed

multiz7way_mapped.bed is your file. Convert it to bedGraph (coverage file) or BAM/wig/bigWig, whichever you like, and upload it as a custom track to UCSC.
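
For the bedGraph route, here is a minimal sketch. It assumes the six columns left by cut -f2-7 are chrom, chromStart, chromEnd, extFile, offset and score (please check this against the table schema) and that the score column came through liftOver intact. The function name bed_to_bedgraph is made up for illustration.

```shell
# Hypothetical helper: reduce a lifted-over 6-column file to a sorted
# 4-column bedGraph (chrom, start, end, score).
# Assumption: column 6 holds the score; rows with fewer than 6 fields are skipped.
bed_to_bedgraph() {
    awk 'BEGIN { FS = OFS = "\t" } NF >= 6 { print $1, $2, $3, $6 }' "$1" |
        LC_ALL=C sort -k1,1 -k2,2n
}
```

Usage would be something like bed_to_bedgraph multiz7way_mapped.bed > multiz7way_galGal4.bedGraph. Lifted intervals can overlap, which bedGraph does not allow, so overlapping rows may need to be merged or dropped before uploading.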

Please confirm that this approach gives sensible results before using it in downstream research.
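
One quick sanity check is to count how many records mapped and how many did not. This is only a sketch: liftover_summary is a made-up name, and it relies on liftOver writing a "#" reason line (for example "#Deleted in new") above each record in the unmapped file.

```shell
# Hypothetical helper: report mapped vs. unmapped record counts.
# $1 = mapped output of liftOver, $2 = unmapped output of liftOver.
liftover_summary() {
    # Every line of the mapped file is one record.
    mapped=$(awk 'END { print NR }' "$1")
    # In the unmapped file, lines starting with "#" are reasons, not records.
    unmapped=$(awk '!/^#/ { n++ } END { print n + 0 }' "$2")
    printf 'mapped=%s unmapped=%s\n' "$mapped" "$unmapped"
}
```

For example: liftover_summary multiz7way_mapped.bed multiz7way_unmapped.bed. A large unmapped fraction would suggest the lifted track should be treated with caution.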

Cheers

But how do I get the conserved coordinates and lift them over to galGal4? Is it possible to use multiz to align the genomes you want yourself?

Thanks a lot!!

I updated my answer.

Thanks a lot Sukhdeep for writing step-by-step instructions. It was very useful!!

Hi Sukhdeep, thanks a lot for your answer. I'm confused about what the different columns are. The 2nd, 3rd and 4th columns in multiz7way.txt.gz are coordinates. What are the 5th, 6th and 7th columns? Which column is the conservation score, and how do I retain it after liftOver to galGal4? Is that possible? I ask because I notice that after liftOver the last column becomes zeroes throughout. Thanks a lot!!!

http://moma.ki.au.dk/genome-mirror/cgi-bin/hgTables?db=galGal3&hgta_track=multiz7way&hgta_table=multiz7way&hgta_doSchema=describe+table+schema

For the second question, tell liftOver to treat only the first 5 columns as BED fields and pass the rest through unchanged:

liftOver -bedPlus=5 multiz7way.bed galGal3ToGalGal4.over.chain.gz multiz7way_mapped.bed multiz7way_unmapped.bed
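
To check whether the scores actually survived, something like the sketch below could be run on the mapped file. The name scores_all_zero is made up, and it assumes the score sits in column 6 of the tab-separated output (pass a different column number as the second argument if not).

```shell
# Hypothetical helper: succeed (exit 0) only if every value in the score
# column is zero, which is the symptom described above.
# $1 = file to check, $2 = score column (default 6, an assumption).
# Note: an empty file also counts as "all zero".
scores_all_zero() {
    awk -v col="${2:-6}" '
        BEGIN { FS = "\t"; nonzero = 0 }
        ($col + 0) != 0 { nonzero = 1; exit }   # stop at the first real score
        END { exit nonzero }
    ' "$1"
}
```

For example: if scores_all_zero multiz7way_mapped.bed; then echo "scores were lost"; fi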

Cheers

Thanks a lot Emily!! I'm interested in getting the PECAN analysis files in chicken coordinates, so that I know which regions are conserved and can upload them as tracks in any browser. Is it possible to download these from Ensembl?

Thanks a lot Emily!!!

Thank you so much for your prompt reply Emily. I could not find any information on the PECAN 21-way alignment. For a region to count as conserved in chicken (galGal4), what is the minimum number of other genomes it has to match? Is it 2, or more? Thanks a lot!

Things are classed as conserved if they match in more than half of the alignments.

Hi Emily!!! I was wondering whether, along with the conserved elements for galGal4, it is also possible to get the corresponding conservation scores. I want to plot conservation around my enhancer regions, so it would be good to have actual scores or some other kind of quantification. Is that possible with the PECAN 21-way alignment, and where can I find such files? Thanks a lot!!!

Scores are available, but there isn't one file of them. You can view the scores in the browser, or you can get them via the Perl APIs.
