CpG islands density calculation
4.2 years ago
Lila M ▴ 920

Hi everybody, I downloaded the promoter sequences for two gene lists from UCSC, so I have all the FASTA sequences for each list stored in a text file (file_a and file_b). I would like to know whether CpG content differs between the two files. To do that, I wrote a short script in R:

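library(Biostrings)  # provides readDNAStringSet() and vcountPattern()

# Total number of CG dinucleotides across all promoters in file_a
fastaFile_a = readDNAStringSet("file_a")
#seq_name_a = names(fastaFile_a)
#sequence_a = paste(fastaFile_a)
CG_file_a = sum(vcountPattern("CG", fastaFile_a))

# Same total for file_b
fastaFile_b = readDNAStringSet("file_b")
CG_file_b = sum(vcountPattern("CG", fastaFile_b))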

I don't feel very confident about it, because I'm not sure this is an accurate way to measure CpG density. Any ideas or suggestions?

Thanks!

RNA-Seq CpG promoters • 1.9k views

Two quick notes:

You should probably normalise for lengths.

Are these sequences directional? Should you also count the reverse pattern, "GC", given that DNA is double-stranded?
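To make both notes concrete, here is a minimal sketch in base R (so it runs without Biostrings). The toy sequences and the helper names `rev_comp`, `count_pattern` and `cpg_obs_exp` are made up for illustration. It shows a per-sequence CG count divided by sequence length, the commonly used observed/expected CpG ratio (CG count × length / (C count × G count)), and a check of the strand question: the reverse complement of "CG" is again "CG", so the CG count of a sequence equals the CG count of its opposite strand, whereas "GC" is a different dinucleotide.

```r
# Reverse complement of a DNA string (upper-case A/C/G/T only).
rev_comp <- function(s) {
  comp <- chartr("ACGT", "TGCA", s)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Count occurrences of a fixed pattern in one string.
# gregexpr() finds non-overlapping matches; "CG" cannot overlap itself.
count_pattern <- function(pattern, s) {
  hits <- gregexpr(pattern, s, fixed = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# Observed/expected CpG ratio: CG count * length / (C count * G count).
cpg_obs_exp <- function(s) {
  c_n <- count_pattern("C", s)
  g_n <- count_pattern("G", s)
  if (c_n == 0 || g_n == 0) return(NA_real_)
  count_pattern("CG", s) * nchar(s) / (c_n * g_n)
}

# Toy sequences of different lengths (hypothetical, not real promoters).
seqs <- c(short = "ACGCGA", long = "ACGTTTTTTTTT")

cg_counts  <- vapply(seqs, function(s) count_pattern("CG", s), integer(1))
cg_density <- cg_counts / nchar(seqs)          # CG per base, length-normalised
cg_oe      <- vapply(seqs, cpg_obs_exp, numeric(1))

# Strand check: counting CG on the opposite strand gives the same number.
cg_counts_rc <- vapply(seqs, function(s) count_pattern("CG", rev_comp(s)),
                       integer(1))

print(data.frame(cg_counts, cg_density, cg_oe, cg_counts_rc))
```

With Biostrings, the per-sequence counts would come from `vcountPattern("CG", fastaFile_a)` without the `sum()`, and the lengths from `width(fastaFile_a)`.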


Hi, since all the sequences have the same length (1,000 nt), I don't have to normalize for length. I downloaded the sequences from UCSC; how can I tell whether they are directional? Thanks for the tips!
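With equal-length promoters, raw counts and densities differ only by a constant factor, so the comparison comes down to the per-promoter counts. A single summed total per file cannot show whether the two lists differ beyond chance, or beyond a difference in the number of promoters per file. The sketch below keeps one count per promoter and compares the two distributions with a Wilcoxon rank-sum test. The counts are invented for illustration, and the choice of test is an assumption, not something prescribed in this thread.

```r
# Hypothetical per-promoter CG counts for the two gene lists
# (with Biostrings these would be vcountPattern("CG", fastaFile_a), etc.).
counts_a <- c(12, 30, 45, 18, 52, 40, 33, 27)
counts_b <- c(8, 10, 15, 9, 22, 11, 14, 7)

promoter_length <- 1000  # all promoters are 1,000 nt, as stated above

# CG per base; with equal lengths this only rescales the counts.
density_a <- counts_a / promoter_length
density_b <- counts_b / promoter_length

# Per-list summaries instead of one summed total per file.
summary_table <- data.frame(
  list           = c("a", "b"),
  n_promoters    = c(length(counts_a), length(counts_b)),
  median_density = c(median(density_a), median(density_b))
)
print(summary_table)

# Non-parametric comparison of the two distributions.
test <- wilcox.test(density_a, density_b)
print(test)
```

Because the test works on per-promoter values, it also remains valid when the two files contain different numbers of promoters, which a comparison of summed totals does not account for.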
