Create custom annotation for Chinese hamster (Cricetulus griseus)
2.5 years ago
chansik ▴ 10

Hi, I'm new to both learning and running NGS data analysis tools.

Most Bioconductor packages expect annotation in the same format.

But looking at the AnnotationData packages, there is no public annotation package for Cricetulus griseus.

I want to create a new annotation for Cricetulus griseus (CriGri-PICR; this assembly is available in Ensembl but not in the UCSC Genome Browser). If I have to create a custom annotation for use with Bioconductor, could you give me some advice on how to do it?

Thanks

ChipQC R ChIPseeker Bioconductor • 1.5k views
2.5 years ago
Papyrus ★ 2.9k

The GenomicFeatures package can help you here. Going to the link you provided and downloading the GTF annotation file, you can do the following to get a TxDb object:

library(GenomicFeatures)
txdb <- makeTxDbFromGFF(file = "Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz")
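
Since parsing the GTF takes a while, it may be worth saving the TxDb once and reloading it in later sessions. The following is a minimal sketch, not a tested recipe: the `dataSource` label and the `.sqlite` file name are placeholders chosen for illustration, and `organism` is optional metadata that can be dropped if the species lookup fails on your installation.

```r
library(GenomicFeatures)
library(AnnotationDbi)

gtf_file <- "Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz"

# Build the TxDb once, recording where the annotation came from.
txdb <- makeTxDbFromGFF(
  file       = gtf_file,
  format     = "gtf",
  dataSource = "Ensembl CriGri-PICR GTF",  # free-text label (placeholder)
  organism   = "Cricetulus griseus"        # optional metadata
)

# Save to disk (hypothetical file name) and reload it in later sessions
# instead of re-parsing the GTF every time.
saveDb(txdb, file = "CriGri-PICR.104.sqlite")
txdb <- loadDb("CriGri-PICR.104.sqlite")

# Quick sanity checks on what was imported.
txdb                    # prints the metadata stored in the object
head(seqlevels(txdb))   # contig / scaffold names as used in the GTF
transcripts(txdb)       # GRanges with one range per transcript
```

The sequence names printed by `seqlevels()` are the ones downstream tools will compare against your BAM files, so it is worth checking that both use the same naming.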

Hello Papyrus.

Thank you so much for your response and detailed explanation! It is really helpful for me to learn about that!

I'd like to ask one more question if it doesn't bother you.

If I got this message:

txdb <- makeTxDbFromGFF(file = "Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
In .get_cds_IDX(mcols0$type, mcols0$phase) :
  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.
'select()' returned 1:many mapping between keys and columns

Is it okay to just ignore it, or should I use GFF3 instead of GTF? (Running the same code with the gff3.gz file seems to work well!)


The warning messages aren't usually a problem, and the select() message just indicates that some annotations may have multiple matches. I personally use GTF files. The two files are usually similar or equivalent. You can check how well they agree by looking at the GRanges generated from each, for example with transcripts(gtf). (It is possible that the files will not agree 100%.)
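
A rough way to run that comparison is sketched below. It assumes the GFF3 file sits next to the GTF under the name shown (the exact file name is an assumption), and it compares transcripts by coordinates and strand rather than by ID, since IDs may be formatted differently in the two files.

```r
library(GenomicFeatures)

# Build one TxDb per file (the GFF3 file name is assumed).
txdb_gtf  <- makeTxDbFromGFF("Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz",
                             format = "gtf")
txdb_gff3 <- makeTxDbFromGFF("Cricetulus_griseus_picr.CriGri-PICR.104.gff3.gz",
                             format = "gff3")

tx_gtf  <- transcripts(txdb_gtf)
tx_gff3 <- transcripts(txdb_gff3)

# How many transcripts does each file yield?
length(tx_gtf)
length(tx_gff3)

# Sequence names present in one file but not in the other.
setdiff(seqlevels(tx_gtf), seqlevels(tx_gff3))
setdiff(seqlevels(tx_gff3), seqlevels(tx_gtf))

# Transcript ranges (sequence, start, end, strand) found in only one file.
r_gtf  <- unique(granges(tx_gtf))
r_gff3 <- unique(granges(tx_gff3))
sum(!(r_gtf  %in% r_gff3))   # ranges only in the GTF
sum(!(r_gff3 %in% r_gtf))    # ranges only in the GFF3
```

If both counts at the end are zero or close to it, either file should be fine for building the annotation.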


OK, Papyrus. Thanks a lot for your comments.

But another annotation-related problem occurred when running ChIPQC.

Here is the code that I ran:

## Load libraries
library(ChIPQC)
library(GenomicRanges)
## load CriGri-PICR annotation
library(GenomicFeatures)
txdb <- makeTxDbFromGFF(file = "Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz")

## Load sample data
samples <- read.csv('meta/CHO_chipseq.csv')
View(samples)

## Create ChIPQC object
chipObj <- ChIPQC(samples, annotation=txdb) 
plotRegi(chipObj)

Output:

Checking chromosomes:
[1] "RAZU01000001.1"
Compiling annotation...
Error in GeneAnnotation == "hg19" :
  comparison (1) is possible only for atomic and list types

Can you please help me fix the error?

Thanks,


Well, I have never used that package in particular. It is possible that you will have to transform the input annotation to the format specified by the package, or change some default arguments. I think these posts may be of help: one, two, but maybe others can chime in.
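
One possible workaround, untested here, is sketched below. The error suggests that ChIPQC compares the `annotation` argument against genome strings such as "hg19", which cannot work for a TxDb object. The sketch assumes, based on a reading of the ChIPQC documentation, that `annotation` also accepts a list whose first element is a version label and whose remaining elements are GRanges of feature classes. The element names and the "CriGri-PICR.104" label are illustrative assumptions, not confirmed requirements of the package.

```r
library(GenomicFeatures)
library(ChIPQC)

txdb <- makeTxDbFromGFF(file = "Cricetulus_griseus_picr.CriGri-PICR.104.gtf.gz")
tx   <- transcripts(txdb)

# Custom annotation as a plain list: a version label followed by GRanges,
# one per feature class. Names and label are assumptions for illustration.
customAnnotation <- list(
  version        = "CriGri-PICR.104",
  Promoters500   = reduce(trim(promoters(tx, upstream = 500, downstream = 0))),
  All5utrs       = reduce(unlist(fiveUTRsByTranscript(txdb))),
  Alltranscripts = reduce(tx),
  Allcds         = reduce(unlist(cdsBy(txdb, by = "tx"))),
  Allintrons     = reduce(unlist(intronsByTranscript(txdb))),
  All3utrs       = reduce(unlist(threeUTRsByTranscript(txdb)))
)

samples <- read.csv("meta/CHO_chipseq.csv")

# Pass the list instead of the TxDb object.
chipObj <- ChIPQC(samples, annotation = customAnnotation)
plotRegi(chipObj)
```

If this route is taken, the sequence names in the BAM files need to match those in the GTF (for example "RAZU01000001.1"), otherwise no reads will overlap the annotated features.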
