What Is The Easiest Way To Get The 5'-CpG-3' Island Percentage In All Exons Of A Specific Gene?
12.7 years ago
Omid ▴ 580

What is the easiest way to get the 5'-CpG-3' island percentage in all exons of a specific gene?

exon • 2.9k views

Could you rephrase the question?

12.7 years ago
brentp 24k

I put a script here that uses the UCSC MySQL server to fetch the exons for your gene and then calculates how much of each exon is covered by CpG islands from the cpgIslandExt table.

Usage is

$ python exon-cpg.py [genome] [gene]

e.g.

$ python exon-cpg.py hg19 gata3

gives:

exon-start  exon-end  coverage        cpg:pct,...
8096666     8096854   1.0             CpG::60.1
8097249     8097859   1.0             CpG::60.1
8100267     8100804   0.715083798883  CpG::71.4
8105955     8106101   0
8111435     8111561   0
8115701     8117164   0

where the final column is a comma-delimited list of cpg-name:pct-gc entries, one for each CpG island that overlaps the exon.

You can verify that this is correct by looking at GATA3 in the genome browser.
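To make the coverage column concrete, here is a minimal sketch of how the covered fraction of one exon could be computed once the overlapping islands are known. It is not taken from the script itself: the function name `exon_coverage` and the toy coordinates are assumptions for illustration, and intervals are treated as 0-based, half-open, as in UCSC tables.

```python
def exon_coverage(exon_start, exon_end, islands):
    """Fraction of the exon [exon_start, exon_end) covered by CpG islands.

    islands is an iterable of (start, end) half-open intervals; they may
    overlap each other, so they are merged before counting bases.
    """
    # Clip every island to the exon and drop the ones that miss it.
    clipped = sorted(
        (max(s, exon_start), min(e, exon_end))
        for s, e in islands
        if s < exon_end and e > exon_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Start of a new merged block: bank the previous one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlaps the current block: extend it.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (exon_end - exon_start)


# Toy coordinates, not real annotation.
print(exon_coverage(100, 200, [(50, 150)]))             # island covers the first half
print(exon_coverage(100, 200, [(100, 150), (140, 160)]))  # two overlapping islands
print(exon_coverage(100, 200, []))                      # no island
```

A value of 1.0 means the exon lies entirely inside one or more islands, and 0 means no island touches it, which matches how the coverage column above reads.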

Note that it makes one SQL query per exon, so it is very inefficient if you intend to use it in batch.
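One way around the per-exon queries, sketched here under assumptions, is to fetch every island overlapping the whole gene span in a single query and then assign islands to exons in memory. The example uses an in-memory sqlite3 table as a stand-in for the UCSC cpgIslandExt table (only a few of its columns, with toy rows), and the helper names `islands_for_gene` and `islands_per_exon` are hypothetical. With a MySQL driver the placeholder style would typically be `%s` rather than `?`.

```python
import sqlite3


def islands_for_gene(conn, chrom, tx_start, tx_end):
    """Fetch all CpG islands overlapping [tx_start, tx_end) in one query."""
    cur = conn.execute(
        "SELECT chromStart, chromEnd, name, perGc FROM cpgIslandExt "
        "WHERE chrom = ? AND chromStart < ? AND chromEnd > ? "
        "ORDER BY chromStart",
        (chrom, tx_end, tx_start),
    )
    return cur.fetchall()


def islands_per_exon(exons, islands):
    """Map each (start, end) exon to the pre-fetched islands that overlap it."""
    return {
        (es, ee): [row for row in islands if row[0] < ee and row[1] > es]
        for es, ee in exons
    }


# Toy stand-in for the remote table; rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpgIslandExt "
    "(chrom TEXT, chromStart INT, chromEnd INT, name TEXT, perGc REAL)"
)
conn.executemany(
    "INSERT INTO cpgIslandExt VALUES (?, ?, ?, ?, ?)",
    [
        ("chr1", 90, 160, "CpG: 10", 60.1),
        ("chr1", 500, 600, "CpG: 12", 71.4),
        ("chr2", 100, 200, "CpG: 9", 55.0),
    ],
)

exons = [(100, 200), (300, 400)]
islands = islands_for_gene(conn, "chr1", 100, 400)  # one query for the gene
hits = islands_per_exon(exons, islands)             # no further queries
for exon, rows in hits.items():
    print(exon, rows)
```

For a batch of genes the same idea extends to one query per chromosome, at the cost of holding more islands in memory.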

12.7 years ago
Alex ★ 1.5k
  1. Find your gene in NCBI's Gene database (e.g. GNAS complex locus) and open its gene description page.
  2. In the NCBI Reference Sequence (RefSeq) section, find the Genomic subsection and open the gene's DNA sequence in GenBank format from the Nucleotide database (GNAS example).
  3. Find the GI in the Version section (281182534 for GNAS).
  4. Find the Features section and save the coordinates of all exons.
  5. Get all exon sequences with Biopython and my code example.
  6. Compute the percentage of CG pairs with this Python code:

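# Placeholder sequences; replace them with the exon sequences from step 5.
exons = ["ACGTCGGA", "TTCGCGAA", "ATATATAT"]
for exon in exons:
    exon = exon.lower()
    # Count CG dinucleotides (a C immediately followed by a G).
    CpG_number = exon.count("cg")
    # Each CG spans 2 bases, so this is the percentage of bases inside a CG pair.
    CpG_proc = CpG_number * 2 * 100.0 / len(exon)
    print(CpG_proc)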
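Steps 4 to 6 can also be sketched without any library, assuming the genomic sequence is already available as a plain string and the exon coordinates were copied from the GenBank Features section, where they are 1-based and inclusive. The helper names `slice_exons` and `cpg_percent` and the toy sequence are hypothetical, not part of the linked code example.

```python
def slice_exons(genomic_seq, exon_coords):
    """Return exon sequences for 1-based, inclusive (start, end) coordinates."""
    # GenBank position 1 is Python index 0, and the end is inclusive,
    # so the slice is [start - 1:end].
    return [genomic_seq[start - 1:end] for start, end in exon_coords]


def cpg_percent(seq):
    """Percentage of bases that sit inside a CG dinucleotide."""
    seq = seq.lower()
    if not seq:
        return 0.0
    return seq.count("cg") * 2 * 100.0 / len(seq)


# Toy data for illustration only.
genomic_seq = "ATCGCGTTAACGTTTTGGCC"
exon_coords = [(3, 6), (9, 12), (15, 20)]

for (start, end), exon in zip(exon_coords, slice_exons(genomic_seq, exon_coords)):
    print(start, end, exon, cpg_percent(exon))
```

Because CG is its own reverse complement, the count is the same whichever strand the exon is on, so minus-strand exons do not need to be reverse-complemented for this calculation.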