Question: Making customized DB using customProDB (R package)
23 months ago, d88020 wrote:

I'm trying to build a customized database with customProDB. Following the tutorial, I ran the splice junction analysis and generated the annotation files (splicemax, ids, etc.). However, the OutputNovelJun function throws an error. I have copied and pasted the code and messages below.

> library(customProDB)

Loading required package: IRanges

Loading required package: BiocGenerics

Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’: clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’: IQR, mad, xtabs

The following objects are masked from ‘package:base’: anyDuplicated, append, as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors

Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’: colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: AnnotationDbi

Loading required package: Biobase

Welcome to Bioconductor

Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: biomaRt

Gene2RefSeq <- read.csv("C:/Users/Admin/Desktop/CustomizedDB/Gene2RefSeq_Parsed.txt", sep = "\t")
pepfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/pepfasta.fasta"
CDSfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/CDSfasta.fasta"
annotation_path <- getwd()
transcript_ids <- as.matrix(Gene2RefSeq[ , 2])
transcript_ids <- transcript_ids[1:1000]
PrepareAnnotationRefseq(genome='hg19', CDSfasta, pepfasta, annotation_path, transcript_ids = transcript_ids, splice_matrix = TRUE)

Build TranscriptDB object (txdb.sqlite) ...

Download the refGene table ... OK

Download the hgFixed.refLink table ... OK

Extract the 'transcripts' data frame ... OK

Extract the 'splicings' data frame ... OK

Download and preprocess the 'chrominfo' data frame ... OK

Prepare the 'metadata' data frame ... OK

Make the TxDb object ... OK


Prepare gene/transcript/protein id mapping information (ids.RData) ... done

Prepare exon annotation information (exon_anno.RData) ... done

Prepare protein sequence (proseq.RData) ... done

Prepare protein coding sequence (procodingseq.RData)... done

Prepare exon splice information (splicemax.RData) ... done

There were 16 warnings (use warnings() to see them)


Loading required package: GenomicFeatures

Loading required package: GenomeInfoDb

Loading required package: GenomicRanges

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
bedfile <- "C:/Users/Admin/Desktop/CorrectedSpliceJunctions_Colon.bed"
jun <- Bed2Range(bedfile,skip=1,covfilter=5)
junction_type <- JunctionType(jun, splicemax, txdb, ids)
outf_junc <- paste(getwd(), '/test_junc.fasta',sep='')

Loading required package: BSgenome

Loading required package: Biostrings

Loading required package: XVector

Loading required package: rtracklayer

proteinseq <- read.csv2("C:/Users/Admin/Desktop/CustomizedDB/proteinseq.txt", sep = "\t")
OutputNovelJun <- OutputNovelJun(junction_type, Hsapiens, outf_junc, proteinseq)

Error in loadFUN(x, seqname, ranges) :

trying to load regions beyond the boundaries of non-circular sequence "chr17"

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

[1] LC_COLLATE=Korean_Korea.949  LC_CTYPE=Korean_Korea.949    LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C                
[5] LC_TIME=Korean_Korea.949    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0       BSgenome_1.42.0                        
 [3] rtracklayer_1.34.2                      Biostrings_2.42.1                      
 [5] XVector_0.14.1                          TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [7] GenomicFeatures_1.26.4                  GenomicRanges_1.26.4                   
 [9] GenomeInfoDb_1.10.3                     customProDB_1.14.1                     
[11] biomaRt_2.30.0                          AnnotationDbi_1.36.2                   
[13] Biobase_2.34.0                          IRanges_2.8.2                          
[15] S4Vectors_0.12.2                        BiocGenerics_0.20.0                    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10               magrittr_1.5               zlibbioc_1.20.0            GenomicAlignments_1.10.1  
 [5] BiocParallel_1.8.2         lattice_0.20-35            plyr_1.8.4                 stringr_1.2.0             
 [9] tools_3.3.3                grid_3.3.3                 SummarizedExperiment_1.4.0 DBI_0.6-1                 
[13] digest_0.6.12              Matrix_1.2-8               bitops_1.0-6               RCurl_1.95-4.8            
[17] memoise_1.1.0              RSQLite_1.1-2              stringi_1.1.5              Rsamtools_1.26.2          
[21] XML_3.98-1.6               VariantAnnotation_1.20.3
23 months ago, Matt Chambers wrote:

You should be loading the proteinseq from your annotation directory, i.e.


That will load the proteinseq annotations into your environment. Then you can pass that to OutputNovelJun. It's one of the trickier functions to get working so I wouldn't be surprised if you run into more issues with it after this one is fixed.
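
The load step described above might look roughly like the sketch below. It assumes that annotation_path is the same directory that was passed to PrepareAnnotationRefseq in the question, and that the protein sequences are stored in the proseq.RData file named in that function's log output. The object name proteinseq is also an assumption taken from the wording of this answer, so the sketch prints the names that load() restores so they can be checked. This would replace reading proteinseq.txt with read.csv2.

```r
# Assumption: annotation_path is the directory that was passed to
# PrepareAnnotationRefseq() in the question (there, the working directory).
annotation_path <- getwd()

# File name taken from the PrepareAnnotationRefseq() log in the question.
proseq_file <- file.path(annotation_path, "proseq.RData")

if (file.exists(proseq_file)) {
  # load() restores the saved objects into the current environment and
  # returns their names, so we can check what was actually loaded.
  loaded_names <- load(proseq_file)
  print(loaded_names)  # assumption: "proteinseq" should be among these
} else {
  message("proseq.RData not found in ", annotation_path)
}

# With proteinseq loaded, the call from the question would then be
# (left commented out because it needs objects built earlier in the session):
# OutputNovelJun(junction_type, Hsapiens, outf_junc, proteinseq)
```

If the printed names do not include proteinseq, the object has a different name in that file and that name should be passed to OutputNovelJun instead.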
