Question: Total CpG genome count
24 days ago by
rebekah_3210 wrote:

Hiya, I'm trying to get a total count of all CpGs present in the genome of a non-model organism. I have used bam2nuc in Bismark to count dinucleotides. Am I correct in thinking that the CG count will be the total CpG count of the genome?

Best wishes,

Rebekah

cpg bismark rrbs methylation • 135 views
ADD COMMENTlink modified 23 days ago by maxime.policarpo20 • written 24 days ago by rebekah_3210

Couldn't you also count GC dinucleotides, since each would be a CpG on the reverse strand?

ADD REPLYlink written 23 days ago by maxime.policarpo20

I meant more: is the program suitable for this?

ADD REPLYlink written 23 days ago by rebekah_3210

A GC is a GC on the other strand, not a CpG.

ADD REPLYlink written 23 days ago by Devon Ryan81k
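
The point in the reply above can be checked with a few lines of Python. This is only a minimal sketch: `reverse_complement` is a helper name made up for this illustration, and it assumes the sequence contains only A, C, G and T.

```python
# Translation table mapping each base to its complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Both dinucleotides are their own reverse complement.
print(reverse_complement("GC"))  # GC
print(reverse_complement("CG"))  # CG
```

Because CG reads as CG on the opposite strand too, counting CG on the forward strand already finds every CpG site; counting GC as well would add sites that are not CpGs.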

A: How to find out total # of CpGs sites from a fasta file?

ADD REPLYlink written 23 days ago by Vijay Lakhujani2.7k

Cheers :) I'll give the C program a go!

ADD REPLYlink written 23 days ago by rebekah_3210

Hello rebekah_321,

Don't forget to follow up on your threads.

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

ADD REPLYlink modified 23 days ago • written 23 days ago by Vijay Lakhujani2.7k

Refer to the EMBOSS programs: http://emboss.sourceforge.net/apps/cvs/emboss/apps/newcpgreport.html, http://emboss.sourceforge.net/apps/cvs/emboss/apps/cpgreport.html and http://emboss.sourceforge.net/apps/cvs/emboss/apps/newcpgseek.html

ADD REPLYlink modified 23 days ago • written 23 days ago by cpad01127.5k

I tried EMBOSS, but doesn't it just report CpG islands?

ADD REPLYlink written 23 days ago by rebekah_3210

Yes, I have used EMBOSS to predict CpG islands. Now I want all CpGs, not just the islands.

ADD REPLYlink written 23 days ago by rebekah_3210

If you only want to count CGs from a FASTA (nucleotide) sequence in R:

library(Biostrings)
data(yeastSEQCHR1)                     # example sequence shipped with Biostrings
yeast1 <- DNAString(yeastSEQCHR1)
dinucleotideFrequency(yeast1)['CG']    # keep only the CG count
#   CG
# 7089

In Python:

s = 'ATATTGCGAAAGAACGTAATTTTATCGAAAAATCGATGTcgcgcg'
# upper() makes the count case-insensitive, so lowercase bases are included
print(s.upper().count("CG"))

7
ADD REPLYlink modified 23 days ago • written 23 days ago by cpad01127.5k
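
For a whole genome, the same string-counting idea can be applied line by line to a FASTA file. The following is a rough sketch rather than a tested pipeline: `count_cpg` is a hypothetical helper, the toy records are invented, and it assumes a standard FASTA layout in which sequences may wrap across lines. The one extra step is catching a CG that is split across a line break (C at the end of one line, G at the start of the next).

```python
import io

def count_cpg(handle):
    """Count CG dinucleotides on the forward strand of all records in a FASTA stream."""
    total = 0
    prev = ""  # last base of the previous sequence line in the current record
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            prev = ""  # never join bases across two records
            continue
        seq = line.upper()  # count lowercase bases too
        if prev == "C" and seq[0] == "G":
            total += 1  # CG split across a line break
        total += seq.count("CG")
        prev = seq[-1]
    return total

# Toy input standing in for a real file (made-up sequences).
demo = io.StringIO(">chr1\nATCG\nACGC\nGT\n>chr2\nGCGC\n")
print(count_cpg(demo))  # 4

# With a real genome (the file name here is only a placeholder):
# with open("genome.fa") as fh:
#     print(count_cpg(fh))
```

The result is the number of CpG sites. Since CG is its own reverse complement, each site carries one cytosine on each strand, so the number of cytosines in CpG context over both strands is twice this count.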