Question: CpG islands density calculation
Asked 2.0 years ago by Lila M (UK):

Hi everybody, I downloaded the promoter sequences for two gene lists from UCSC, so I have all the FASTA sequences stored in two text files (file_a and file_b). I would like to know whether the CpG content differs between the two files. To check this, I wrote a short R script:

library(Biostrings)  # provides readDNAStringSet() and vcountPattern()

# Total number of CG dinucleotides across all promoter sequences in file_a
fastaFile_a <- readDNAStringSet("file_a")
CG_file_a <- sum(vcountPattern("CG", fastaFile_a))

# Same total for file_b
fastaFile_b <- readDNAStringSet("file_b")
CG_file_b <- sum(vcountPattern("CG", fastaFile_b))

I don't feel very confident about it, because I'm not sure that simply counting "CG" matches is an accurate way to measure CpG density. Any ideas or suggestions?

Thanks!
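
One measure that is often used alongside raw counts when describing CpG islands is the observed/expected CpG ratio, (number of CG x sequence length) / (number of C x number of G), usually reported together with GC content. The following is only a minimal base R sketch of that calculation; the function names `cpg_obs_exp` and `gc_content` are made up for illustration and are not part of Biostrings, and the test sequences are toy examples.

```r
# Observed/expected CpG ratio for a single sequence (hypothetical helper).
# O/E = (count of "CG" * sequence length) / (count of C * count of G)
cpg_obs_exp <- function(seq) {
  s <- toupper(seq)
  bases <- strsplit(s, "")[[1]]
  n_c <- sum(bases == "C")
  n_g <- sum(bases == "G")
  # "CG" cannot overlap itself, so non-overlapping matching counts every CpG
  hits <- gregexpr("CG", s, fixed = TRUE)[[1]]
  n_cg <- if (hits[1] == -1) 0L else length(hits)
  # The ratio is undefined when the sequence has no C or no G
  if (n_c == 0 || n_g == 0) return(NA_real_)
  (n_cg * nchar(s)) / (n_c * n_g)
}

# Fraction of bases that are G or C (hypothetical helper).
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  mean(bases %in% c("C", "G"))
}

# Toy sequences, not real promoters
cpg_obs_exp("ACGT")
gc_content("ACGT")
```

Applied to each promoter, for example with `vapply()` over the sequences converted to character strings, this would give one ratio per sequence rather than a single total per file.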


Two quick notes:

You should probably normalise for lengths.

Are these sequences directional? Should you also include the inverse pattern "GC", given that DNA is double-stranded?

Reply written 2.0 years ago by jotan
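
Both notes above can be checked with a small experiment. The sketch below, in base R with hypothetical helper names (`count_pattern`, `cpg_density`, `reverse_complement`) and a made-up toy sequence, divides the CG count by the sequence length and compares the count on a sequence with the count on its reverse complement. Because the reverse complement of "CG" is again "CG", the CpG count should come out the same on either strand, whereas "GC" is a different dinucleotide and not a CpG.

```r
# Count occurrences of a fixed pattern in one sequence (hypothetical helper).
count_pattern <- function(seq, pattern = "CG") {
  hits <- gregexpr(pattern, toupper(seq), fixed = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# CpG density: CG dinucleotides per base, so sequences of
# different lengths become comparable.
cpg_density <- function(seq) {
  count_pattern(seq, "CG") / nchar(seq)
}

# Reverse complement of a DNA string containing only A, C, G, T.
reverse_complement <- function(seq) {
  comp <- chartr("ACGT", "TGCA", toupper(seq))
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

toy <- "ACGTCGGGCATCG"  # toy sequence, not a real promoter

count_pattern(toy, "CG")                      # CpG count on the given strand
count_pattern(reverse_complement(toy), "CG")  # CpG count on the opposite strand
cpg_density(toy)                              # CpGs per base
```

With Biostrings, the same length normalisation could presumably be done by dividing `vcountPattern("CG", x)` by `width(x)`.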

Hi, since all the sequences have the same length (1,000 nt), I don't think I need to normalise for length. I downloaded the sequences from UCSC; how can I find out whether they are directional? Thanks for the tips!

Reply written 2.0 years ago by Lila M
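
Even with equal-length sequences, a single summed count per file still depends on how many sequences each file contains and hides the spread between promoters. One possible approach, sketched below in base R, is to keep one count per sequence and compare the two distributions, here with a Wilcoxon rank-sum test. The helper `cpg_counts` and both toy sequence sets are hypothetical; with Biostrings, calling `vcountPattern("CG", x)` without `sum()` already returns the per-sequence counts.

```r
# Per-sequence CpG counts for a character vector of sequences
# (hypothetical helper, base R only).
cpg_counts <- function(seqs) {
  vapply(seqs, function(s) {
    hits <- gregexpr("CG", toupper(s), fixed = TRUE)[[1]]
    if (hits[1] == -1) 0L else length(hits)
  }, integer(1), USE.NAMES = FALSE)
}

# Toy sequence sets standing in for file_a and file_b (not real data)
set_a <- c("ACGCGCGTTA", "CCGGCGATCG", "TTCGCGAACG")
set_b <- c("ATATTTAACG", "GGGTTTAAAC", "TTTTACGAAA")

counts_a <- cpg_counts(set_a)
counts_b <- cpg_counts(set_b)

# Compare the per-sequence distributions instead of two totals.
# exact = FALSE uses the normal approximation, which tolerates tied counts.
result <- wilcox.test(counts_a, counts_b, exact = FALSE)
result$p.value
```

Three sequences per group are far too few for a meaningful test; the toy data only show the mechanics.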