Trying to extract gene_type from GeneCode GTF file
1
1
Entering edit mode
4.4 years ago

Hi, I'm trying to filter my analysis to only specific gene types ( protein coding, link, ribosomal, rRNA). I'm starting from RNA seq counts. The GTF file is downloaded from: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_24/gencode.v24.annotation.gtf.gz Here is the code I have until now, and I'd like to add "gene_type" column to the output file:

getbiocGet("GenomicRanges")
biocGet("rtracklayer")
biocGet("genoset")

library(GenomicRanges)
library(rtracklayer)
library(genoset)

GTFfile = "./gencode.v24.annotation.gtf" 
GTF <- import.gff(con=GTFfile , format="gtf", genome="GRCh38.p5", feature.type="exon")

#Load the annotation and reduce it to get gene length
#GTF <- import.gff(GTFfile, format="gtf", genome="GRCm38.71", asRangedData=F, feature.type="exon")
grl <- reduce(split(GTF, elementMetadata(GTF)$gene_id))
reducedGTF <- unlist(grl, use.names=T)
elementMetadata(reducedGTF)$gene_id <- rep(names(grl), elementNROWS(grl))
elementMetadata(reducedGTF)$widths <- width(reducedGTF)
calc_length <- function(x) {
  sum(elementMetadata(x)$widths)
}
output <- sapply(split(reducedGTF, elementMetadata(reducedGTF)$gene_id), calc_length)
colnames(output) <- c("Length")

Thanks!

rna-seq R • 2.1k views
ADD COMMENT
0
Entering edit mode
4.3 years ago

I think you are making it hard for yourself by trying to process the GTF file in R. I would either download the table of biotypes from the ensembl website https://www.ensembl.org/biomart/martview and then load it into R.

or use the biomaRt Bioconductor package to fetch the table directly into R.

or if you REALLY wanted to process it from the GTF file, you can get it from the "gene_biiotype" column of the elementMetadata.

So the following code would fetch a table that had one column of gene_ids and another containing gene_biotype:

biotype_table <- elementMetadata(GTF)[ , c("gene_id", "gene_biotype")]
biotype_table <- biotype_table[!duplicated(biotype_table)]
ADD COMMENT

Login before adding your answer.

Traffic: 2571 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6