Question: Mapping between gene IDs and transcript IDs in C. elegans
2.9 years ago, cristian230 wrote:

Dear Community,

I am looking to build a mapping file between the gene names of C. elegans and its transcript names. To do so, I use the Bioconductor package biomaRt, which I freshly reinstalled. I have also freshly downloaded the latest transcriptome of C. elegans from Ensembl here: ftp://ftp.ensembl.org/pub/release-86/fasta/caenorhabditis_elegans/cdna/

Here is the code:

library(biomaRt)

# Ensembl release to download (86, as in the FTP link above)
ensemblRelease <- 86

# Download C. elegans cDNA file from www.ensembl.org
download.file(paste0('ftp://ftp.ensembl.org/pub/release-', ensemblRelease, '/fasta/caenorhabditis_elegans/cdna/Caenorhabditis_elegans.WBcel235.cdna.all.fa.gz'), 'output/transcriptome/sequence/celegans.fa.gz')
system('gunzip output/transcriptome/sequence/celegans.fa.gz')

# Create a mapping file containing gene names in the first
# column and the associated transcript name in the second
# column. There should be only one name in each cell. Gene
# names can occur more than once and be associated with more
# than one associated transcript name but only one transcript
# name per line.
martWorm <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = "celegans_gene_ensembl", host = 'ensembl.org')
g2t <- biomaRt::getBM(attributes = c('ensembl_gene_id', 'ensembl_transcript_id'), mart = martWorm)
write.table(g2t, 'output/counts/rsem/ref/geneToTxMapping.txt', quote = FALSE, row.names = FALSE)

However, there is a problem. My FASTA transcriptome (cDNA) file contains the transcript ID F52H2.2, which is not found in my mapping table, although F52H2.2a and F52H2.2b are. Conversely, F52H2.2a is not found in the FASTA file. This causes problems in my downstream analysis. Does anybody know what causes this? Is there a way to download my transcriptome from within R using the biomaRt package, so that it is compatible with the biomaRt database?
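One way to make the mismatch concrete, and possibly to sidestep it, is to build the mapping from the FASTA headers themselves and compare it with the biomaRt IDs. The following is only a minimal sketch. It assumes the Ensembl cDNA headers have the form ">TRANSCRIPT_ID cdna ... gene:GENE_ID ...", which should be checked against the downloaded file. The function name headersToMapping, the example headers, the ID T00X0.1a, and the gene IDs in them are hypothetical placeholders, not real annotation.

```r
# Hypothetical helper: build a gene-to-transcript table from FASTA header lines.
# Assumption: headers look like ">TRANSCRIPT_ID cdna ... gene:GENE_ID ...".
headersToMapping <- function(headers) {
  # Keep only header lines (those starting with '>')
  headers <- headers[grepl("^>", headers)]
  # Transcript ID: first whitespace-delimited token after '>'
  tx <- sub("^>(\\S+).*$", "\\1", headers)
  # Gene ID: value of the 'gene:' field, or NA when the field is absent
  hasGene <- grepl("\\sgene:\\S+", headers)
  gene <- ifelse(hasGene,
                 sub("^.*\\sgene:(\\S+).*$", "\\1", headers),
                 NA_character_)
  data.frame(gene_id = gene, transcript_id = tx, stringsAsFactors = FALSE)
}

# Placeholder headers (coordinates and gene IDs are made up for illustration)
exampleHeaders <- c(
  ">F52H2.2 cdna chromosome:WBcel235:I:1:100:1 gene:WBGeneEXAMPLE1 gene_biotype:protein_coding",
  ">T00X0.1a cdna chromosome:WBcel235:I:1:100:1 gene:WBGeneEXAMPLE2 gene_biotype:protein_coding"
)
fastaMap <- headersToMapping(exampleHeaders)

# Stand-in for g2t$ensembl_transcript_id from the biomaRt query
biomartTx <- c("F52H2.2a", "F52H2.2b", "T00X0.1a")

# Transcript IDs present in the FASTA file but absent from the biomaRt table
onlyInFasta <- setdiff(fastaMap$transcript_id, biomartTx)
# Transcript IDs present in the biomaRt table but absent from the FASTA file
onlyInBiomart <- setdiff(biomartTx, fastaMap$transcript_id)
print(onlyInFasta)
print(onlyInBiomart)
```

With the real data, the headers could be read with readLines('output/transcriptome/sequence/celegans.fa') and the resulting table written out with write.table, as in the code above. Because the table is derived from the same file that is quantified, its transcript IDs match the FASTA by construction. This does not explain why the biomaRt table differs.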

Thank you.
