Question: chromosome name in summarizeOverlaps for RNASeq data
3.7 years ago, BioProg wrote:

For a while I used quantile normalization followed by a t-test or cuffdiff for analysis of RNA-Seq data. I would like to try DESeq2 given its better normalization method based on the readings. I am using summarizeOverlaps in R to create count tables, as I found it to be the easiest option. However, I am encountering a couple of issues. First, the Ensembl-based GTF file that I used to align all my mouse RNA-Seq data has the chromosomes named "1, 2, 3, ..." rather than "chr1, chr2, ...". My test R code for summarizeOverlaps is as follows:

 

library(TxDb.Mmusculus.UCSC.mm10.ensGene)
library(GenomicAlignments) # provides summarizeOverlaps(); also attaches Rsamtools for BamFileList()

TxDb_mm10 <- TxDb.Mmusculus.UCSC.mm10.ensGene
saveDb(TxDb_mm10, file = "mm10.sqlite")
mm10 <- loadDb("mm10.sqlite")

exbyGene <- exonsBy(mm10, by = "gene") # I assume this is assigning things by GENE IDs (ENSG...)
exbyTx <- exonsBy(mm10, by = "tx") # I assume this is assigning things by transcript IDs (ENST...)

fls <- list.files("Path/To/BamFiles", pattern = ".accepted_hits.bam", full.names = TRUE)
grp1 <- fls[1]
grp2 <- fls[c(2, 3)]

bamLst_grp1 <- BamFileList(grp1, yieldSize = 100000)
bamLst_grp2 <- BamFileList(grp2, yieldSize = 100000)

head(seqlevels(TxDb_mm10)) # this shows that the mm10 sqlite I am using has the chr1, chr2, ... format
seqinfo(bamLst_grp1) # this shows that the chromosome names for my test BAM files are 1, 2, 3, ...

se_grp1 <- summarizeOverlaps(exbyGene, bamLst_grp1, mode = "Union", singleEnd = FALSE, ignore.strand = TRUE, fragments = TRUE)
se_grp2 <- summarizeOverlaps(exbyGene, bamLst_grp2, mode = "Union", singleEnd = FALSE, ignore.strand = TRUE, fragments = TRUE)

## Saving environment
saveRDS(se_grp1, file = "se_grp1.Rdata")
saveRDS(se_grp2, file = "se_grp2.Rdata")

 

While the script runs nicely (and please advise me if I should optimize anything, as I have only a basic understanding of each of the commands), when I tried to load the saved objects I got the following error:

 

Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘se_grp2.Rdata’ has magic number 'X'
  Use of save versions prior to 2 is deprecated

 

Also, I am not sure whether the error occurs because the chromosome names are incompatible between the reference annotation and the BAM files. If that's the case, how can I correct this?

3.7 years ago, Devon Ryan (Freiburg, Germany) wrote:

The error is unrelated to any chromosome names and has to do with R not being able to load the file it saved. Presumably there was a (hopefully transient) storage problem.
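Although the load error is a separate matter, the naming mismatch described in the question is still worth fixing, because overlaps are only found between ranges whose sequence names match: a TxDb with "chr1" and BAM files with "1" share no sequence levels. The sketch below uses plain character vectors to show the idea. The helper ucsc_to_ensembl is hypothetical (not part of any package), and the "chrM" to "MT" mapping is an assumption about how the mitochondrial chromosome is named in the BAM files.

```r
# Hypothetical helper: strip the UCSC "chr" prefix so that names look like
# Ensembl/NCBI-style names. Not part of any package.
ucsc_to_ensembl <- function(x) {
  x <- sub("^chr", "", x)   # "chr1" -> "1", "chrX" -> "X"
  x[x == "M"] <- "MT"       # assumption: mitochondrial chromosome is "MT" in the BAM
  x
}

# Toy stand-ins for seqlevels(TxDb_mm10) and the names reported by seqinfo(bamLst_grp1)
txdb_names <- c("chr1", "chr2", "chrX", "chrM")
bam_names  <- c("1", "2", "X", "MT")

# Before conversion the two naming schemes have nothing in common
shared_before <- intersect(txdb_names, bam_names)

# After conversion every name lines up
converted <- ucsc_to_ensembl(txdb_names)
shared_after <- intersect(converted, bam_names)

# On the real Bioconductor objects, something along these lines should do the
# same job using GenomeInfoDb's style tables (untested against this data):
#   seqlevelsStyle(exbyGene) <- "NCBI"
```

Unplaced scaffolds and alternate contigs do not follow the simple prefix rule, so it is worth comparing the two sets of names after converting and before counting.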


You are right, I should have used readRDS, as I saved the file as RDS.

Thank you

Reply written 3.7 years ago by BioProg
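The save/load mismatch mentioned in the reply can be reproduced with base R alone. This is a minimal sketch that uses a small data frame as a stand-in for the SummarizedExperiment objects; the temporary file and the toy data are assumptions for illustration only. A file written by saveRDS() holds a single unnamed object and has to be read back with readRDS(), whereas load() expects the format written by save().

```r
# Toy object standing in for se_grp1 / se_grp2
counts <- data.frame(gene = c("g1", "g2"), n = c(10L, 0L))

# Write with saveRDS(); the ".Rdata" extension does not change the file format
path <- tempfile(fileext = ".Rdata")
saveRDS(counts, file = path)

# load() cannot parse an RDS file, so capture the error message instead of stopping
load_error <- tryCatch(
  suppressWarnings(load(path)),
  error = function(e) conditionMessage(e)
)

# readRDS() is the matching reader and returns the object for assignment
restored <- readRDS(path)
same_object <- identical(restored, counts)

unlink(path)
```

Naming such files with an ".rds" extension rather than ".Rdata" makes it harder to mix up the two formats later.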