Question: BSgenome format error in subseq()
6.1 years ago by
United States
catherine12243 wrote:

Sorry if I'm asking a stupid question...

I'm trying to extract sequences using a data frame, "DINU", which contains the chromosome name, start, and end positions. I ran this in R on Windows before and it worked perfectly, but on a Mac it fails, apparently because of the format of the chromosome name. Here is my script.

> subseq(DINU$Chr[i],start=200,end=400)
Error in .Call2("solve_user_SEW", refwidths, start, end, width, translate.negative.coord,  : 
  solving row 1: 'allow.nonnarrowing' is FALSE and the supplied start (200) is > refwidth + 1
> subseq(chr2L,start=200,end=400)
  201-letter "DNAString" instance
> DINU$Chr[i]
[1] "chr2L"

As shown, the function does not recognize the chromosome name taken from the data frame, even though it is a character, as the function's description requires.
> class(DINU$Chr[i])
[1] "character"

Thanks in advance for any ideas.

R • 2.2k views
6.1 years ago by
Devon Ryan96k
Freiburg, Germany
Devon Ryan96k wrote:

DINU$Chr[i] needs to be an XVector object, not a string. That's why you're getting the error.

Edit: I guess I should note that you can use a string, but then it needs to be the sequence itself. As is, you're passing in the 5-character string "chr2L", which, as the error indicates, is shorter than the start coordinate. You might try subseq(DINU$Chr[i], start=1, end=5) if this is unclear. BTW, I'm guessing from your syntax that you're iterating over the data frame in a for loop. You generally don't want to do that in R, where vectorized operations are more idiomatic and usually faster. Try an apply-family function or restructure things so that a single vectorized call handles every row.
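To make the difference between the name and the sequence concrete, here is a minimal sketch. It assumes Biostrings is installed from Bioconductor; the toy `genome` object, the `random_dna()` helper, and the `Start`/`End` column names are made up for illustration, since the question doesn't show them.

```r
library(Biostrings)  # assumed to be installed from Bioconductor

# Toy stand-in for a genome: a named DNAStringSet of random sequence.
# random_dna() is a hypothetical helper, not part of any package.
set.seed(1)
random_dna <- function(n) {
  paste(sample(c("A", "C", "G", "T"), n, replace = TRUE), collapse = "")
}
genome <- DNAStringSet(c(chr2L = random_dna(500), chr2R = random_dna(500)))

# Hypothetical data frame shaped like DINU in the question.
DINU <- data.frame(Chr   = c("chr2L", "chr2R"),
                   Start = c(200, 100),
                   End   = c(400, 150),
                   stringsAsFactors = FALSE)

# The name itself is only 5 characters long, which is why start = 200 fails.
nchar(DINU$Chr[1])

# Use the name to look up the sequence object, then take the subsequence.
one <- subseq(genome[[DINU$Chr[1]]], start = 200, end = 400)

# Vectorized version: pick sequences by name and cut them all in one call,
# so no for loop over the rows is needed.
all_seqs <- subseq(genome[DINU$Chr], start = DINU$Start, end = DINU$End)
width(all_seqs)
```

With a real BSgenome object the same idea should apply: look the chromosome up by name (for example with `[[`) rather than passing the name to subseq(), or let getSeq() take the names, starts, and ends directly.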

