Use R to find overlapping positions from transcription start sites
5.1 years ago
igperez ▴ 10

I am trying to use R to find Pdx1/NeuroD transcription factor overlapping positions that lie at variable distances from transcription start sites. The SELECT/WHERE query needs to pull data from two different Excel files. This is what I've attempted:

promoter2 <- sqldf("SELECT pdx_neuroD6.Chr, pdx_neuroD6.neuroD.pos-1000, pdx_neuroD6.neuroD.pos+1000, promoter1.known_gene chrom, promoter1.tx_start from pdx_neuroD6 AND promoter1 WHERE Chr = known_gene chrom AND tx_start BETWEEN NeuroD.pos-1000 AND neuroD.pos+1000")

This is the error:

Error in result_create(conn@ptr, statement) : near "AND": syntax error

R sql • 876 views
5.1 years ago
zx8754 11k

If you format your SQL code, it is easy to spot why this is happening: you are using AND instead of a comma (,) between the table names in the FROM clause. You also have a space in one of the column names, so that name needs to be wrapped in square brackets ([...]):

sqldf("
SELECT pdx_neuroD6.Chr,
       pdx_neuroD6.neuroD.pos-1000,
       pdx_neuroD6.neuroD.pos+1000,
       promoter1.[known_gene chrom],
       promoter1.tx_start 
FROM pdx_neuroD6, promoter1 
WHERE Chr = [known_gene chrom] AND 
      tx_start BETWEEN NeuroD.pos-1000 AND neuroD.pos+1000
")