Wrong output for human gene in R
7.7 years ago

Hi

I'm trying to get the sequence of a human gene. In R, I loaded the library BSgenome.Hsapiens.UCSC.hg19, but when I run the command

BSgenome.Hsapiens.UCSC.hg19$chr1

I just get something like this:

NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

Where is the problem? Sorry for my poor English.

Thanks,
mansoor

Thank you Devon, but I have another question: what is the difference between your code and mine?

seq=BSgenome.Hsapiens.UCSC.hg19$chr1[1e7:10000100]


They may produce the same results; mine just uses the accessors rather than coercing data types. Using the accessors has the advantage of some error checking and of handling strands.

Edit: Using the accessors also has the benefit that you don't rely on the underlying class maintaining its representation. An accessor is going to get updated with the package, even if the class structure provided in the package changes radically.
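
To illustrate the strand point, here is a minimal sketch comparing direct subsetting with getSeq() on the same coordinates. It assumes the BSgenome.Hsapiens.UCSC.hg19 package is installed, and the variable names are made up for illustration. Direct subsetting always returns the plus-strand bases, while getSeq() takes the strand of the GRanges into account.

```r
library(BSgenome.Hsapiens.UCSC.hg19)  # assumed installed; attaches BSgenome and its dependencies

genome <- BSgenome.Hsapiens.UCSC.hg19

# Direct subsetting: no notion of strand, always the plus-strand bases
sub_seq <- genome$chr1[1e7:10000100]

# Accessor: the strand is part of the request
gr_plus  <- GRanges("chr1", IRanges(start = 1e7, end = 10000100), strand = "+")
gr_minus <- GRanges("chr1", IRanges(start = 1e7, end = 10000100), strand = "-")
plus_seq  <- getSeq(genome, gr_plus)[[1]]
minus_seq <- getSeq(genome, gr_minus)[[1]]

# The plus-strand result should match the direct subset
identical(as.character(sub_seq), as.character(plus_seq))

# The minus-strand result should be the reverse complement of the plus-strand one
identical(as.character(minus_seq), as.character(reverseComplement(plus_seq)))
```

With plain subsetting you would have to call reverseComplement() yourself for a minus-strand gene, which is easy to forget.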

7.7 years ago

chr1 in hg19 starts with a long run of Ns (placeholders for bases that are not in the assembly), so it's unlikely that you did anything wrong. BTW, you may find it simpler to make a GRanges object and then use getSeq().

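library(BSgenome.Hsapiens.UCSC.hg19)  # attaches BSgenome and its dependencies, which provide GRanges() and getSeq()
# Region of interest: chr1, positions 10,000,000 to 10,000,100
gr <- GRanges(seqnames="chr1", ranges=IRanges(start=1e7, end=10000100))
getSeq(BSgenome.Hsapiens.UCSC.hg19, gr)

  A DNAStringSet instance of length 1
    width seq
[1]   101 AACCCCGTCTCTACAATAAATTAAAATATTAGCT...TTGGCGGGCTGAGGTGGGAGAATCATCCAAGCC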
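
If you want to check for yourself that the Ns are confined to certain stretches rather than the whole chromosome, a rough sketch along these lines may help. It assumes the BSgenome.Hsapiens.UCSC.hg19 package is installed; the 10,000-base window and the variable names are arbitrary choices for illustration.

```r
library(BSgenome.Hsapiens.UCSC.hg19)  # assumed installed

chr1 <- BSgenome.Hsapiens.UCSC.hg19$chr1

# Base composition of the first 10,000 positions (arbitrary window);
# with baseOnly = TRUE, anything that is not A, C, G or T (such as N) is counted under "other"
head_freq <- alphabetFrequency(subseq(chr1, start = 1, end = 10000), baseOnly = TRUE)
head_freq

# Total number of Ns on the whole chromosome, to compare with its length
n_total <- letterFrequency(chr1, letters = "N")
n_total
length(chr1)
```

If n_total is well below length(chr1), most of the chromosome is real sequence and only the printed preview happened to show Ns.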