GENCODE GTF-derived gene IDs can't be annotated to gene symbols following the DESeq2 manual
3.1 years ago
Kai_Qi ▴ 130

Hi:

I used the GENCODE GRCm38 GTF for annotation when counting reads with Rsubread.

I can't get the gene symbols by following the DESeq2 manual:

resTC$symbol <- mapIds(org.Mm.eg.db, keys = row.names(resTC), column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first")

I looked at it again and found that the gene IDs I got are in this format:

ENSMUSG00000029848.11

So I manually entered this gene ID into NCBI Gene, and it does not match anything. However, if I use ENSMUSG00000029848, it tells me it is "Stra8".

Can anyone tell me how to solve this problem?

Thank you very much,

R RNA-Seq gene • 1.0k views

I found an answer in a previous post (https://www.biostars.org/p/301116/#496172).

But when I tried to get rid of the trailing numbers using:

row.names(res) <- gsub(".*$" , "", row.names(res))

it turned out that every row name was replaced with "".

3.1 years ago
ATpoint 81k
gsub("\\..*", "", rownames(res))

The double backslash escapes the first dot so that it matches a literal "." rather than any character. The second dot followed by the * wildcard then matches everything after that literal dot, so the whole version suffix is removed.
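
To make the difference between the two patterns concrete, here is a minimal sketch on a small hand-written vector. The `ids` vector is illustrative only (it uses the ID quoted in the question plus one made-up companion in the same style); it is not the actual `res` object.

```r
# Illustrative GENCODE-style IDs: gene ID plus a ".version" suffix.
ids <- c("ENSMUSG00000029848.11", "ENSMUSG00000051951.5")

# ".*$" matches the entire string, so every ID is replaced with "".
all_gone <- gsub(".*$", "", ids)

# "\\..*" matches a literal dot and everything after it.
stripped <- gsub("\\..*", "", ids)

# A stricter alternative: only strip a trailing ".digits" suffix.
stripped_strict <- sub("\\.[0-9]+$", "", ids)

print(all_gone)
print(stripped)
print(stripped_strict)
```

In practice the result would be assigned back, for example `rownames(res) <- gsub("\\..*", "", rownames(res))`.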


Thank you very much. It indeed removed the trailing numbers:

> head(row.names(res))
[1] "ENSMUSG000000519515" "ENSMUSG000001028511" "ENSMUSG000001033771" "ENSMUSG000001031471" "ENSMUSG000001023311"
[6] "ENSMUSG000001023481"

But when I typed:

> res$symbol <- mapIds(org.Mm.eg.db, keys = row.names(res), column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first")

I got

Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.
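
One possible check before calling `mapIds`: the IDs printed above have 12 digits after "ENSMUSG", whereas the versionless ID from the question (ENSMUSG00000029848) has 11. That may mean only the dot was removed and the version digit is still attached. A small sketch, using the first two IDs from the output above; the variable names are illustrative:

```r
# First two IDs from the head() output above, plus the ID from the question.
ids <- c("ENSMUSG000000519515", "ENSMUSG000001028511", "ENSMUSG00000029848")

# A versionless Ensembl mouse gene ID is "ENSMUSG" followed by 11 digits.
looks_valid <- grepl("^ENSMUSG[0-9]{11}$", ids)

# Name the result by ID so the failures are easy to spot.
print(setNames(looks_valid, ids))
```

If the row names fail this check, it may help to start again from the original versioned IDs and strip the suffix with the escaped-dot pattern. Comparing against `head(keys(org.Mm.eg.db, keytype = "ENSEMBL"))` should also show what valid keys look like.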

My second question is about the gsub command.

In an earlier command, I used it this way:

head(colnames(countdata))
[1] "DIV0.1.bam" "DIV0.2.bam" "DIV0.3.bam" "DIV7.1.bam" "DIV7.2.bam" "DIV7.3.bam"
colnames(countdata) <- gsub(".bam" , "", colnames(countdata))
head(colnames(countdata))
[1] "DIV0.1" "DIV0.2" "DIV0.3" "DIV7.1" "DIV7.2" "DIV7.3"

It worked well. The difference between the two cases left me a little confused.

Thank you very much.
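
A likely explanation for the difference: in a regular expression, an unescaped `.` matches any single character. In `".bam"` it is followed by fixed text, so the pattern happens to match the literal `.bam` suffix. In `".*$"` it is followed by `*`, so it matches the whole string. The sketch below uses two of the column names shown above plus one made-up name (`"my_bam_DIV0.1.bam"`, hypothetical) where the unescaped form removes more than intended:

```r
# Two real column names from above and one hypothetical one.
cols <- c("DIV0.1.bam", "DIV7.3.bam", "my_bam_DIV0.1.bam")

# Unescaped dot: "." matches any character, so "_bam" is removed too.
loose <- gsub(".bam", "", cols)

# Escaped dot, anchored to the end of the string.
anchored <- sub("\\.bam$", "", cols)

# fixed = TRUE treats the pattern as plain text, not as a regex.
literal <- gsub(".bam", "", cols, fixed = TRUE)

print(loose)
print(anchored)
print(literal)
```

Escaping the dot (or using `fixed = TRUE`) is the safer habit whenever a literal period is meant.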
