Question: How to define intergenic regions from cufflinks .gtf-file (separate introns from intergenic)
jon.brate wrote:

It is easy to define intergenic regions if the gff/gtf file contains a "gene" line (see: http://davetang.org/muse/2013/01/18/defining-genomic-regions/). But I am using a gtf-file generated by cufflinks, and it only uses exon lines. How can I separate the intronic from the intergenic regions based on such a file?

intergenic cufflinks gtf • 2.4k views
ADD COMMENTlink modified 5.1 years ago by Devon Ryan97k • written 5.1 years ago by jon.brate250
Devon Ryan wrote:

The general steps are as follows:

  1. Load GTF file into R (see the rtracklayer and GenomicRanges packages). This should result in a GRanges object.
  2. split() the result of step 1 by gene_id. You now have a GRangesList.
  3. lapply() a function to return a data.frame containing the following: chromosome, min(start), max(end).
  4. Convert the result of 3 to a GRanges object.
  5. reduce() the result of step 4.
  6. Run gaps() on the result of step 5. Congrats, you're done!

I have a script somewhere that does all of that, but I've sketched out enough to get you started.
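
The steps above can be sketched roughly as follows. This is not the script mentioned in the answer, only one possible reading of the six steps. The toy exon table and all object names (exons, by_gene, spans, genes, merged, intergenic) are made up for illustration; with a real Cufflinks file, step 1 would be something like rtracklayer::import("transcripts.gtf"), where the file name is an assumption.

```r
library(GenomicRanges)

# Step 1 (stand-in): toy exon records with a gene_id column, as a GTF import
# would provide. All coordinates here are hypothetical.
exons <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr1"),
  ranges   = IRanges(start = c(100, 300, 250, 900),
                     end   = c(200, 400, 350, 1000)),
  gene_id  = c("G1", "G1", "G2", "G3")
)

# Step 2: one GRanges per gene (a GRangesList)
by_gene <- split(exons, exons$gene_id)

# Step 3: chromosome, min(start), max(end) for each gene
spans <- do.call(rbind, lapply(by_gene, function(g) {
  data.frame(chr   = as.character(seqnames(g))[1],
             start = min(start(g)),
             end   = max(end(g)))
}))

# Step 4: back to a GRanges object (strand is dropped, so it becomes "*")
genes <- GRanges(seqnames = spans$chr,
                 ranges   = IRanges(spans$start, spans$end))

# Step 5: merge overlapping gene spans
merged <- reduce(genes)

# Step 6: everything between the merged spans
intergenic <- gaps(merged)

# If seqlengths are set, gaps() also returns whole-chromosome ranges for the
# "+" and "-" strands, so keep only the unstranded gaps.
intergenic <- intergenic[strand(intergenic) == "*"]
```

Note that without seqlengths on the object, gaps() cannot know where a chromosome ends, so the region after the last gene on each chromosome is not reported.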

ADD COMMENTlink written 5.1 years ago by Devon Ryan97k

Thanks for the advice!

ADD REPLYlink modified 5.1 years ago • written 5.1 years ago by jon.brate250

I can make lists of the start and end positions of each gene by e.g.: gene.start = lapply(object, start)

But I don't quite understand how to get the chromosome names. I tried lapply(object, seqnames) and seqnames(object), but how do I combine the result with the start and end coordinates?
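
One way to put the chromosome names next to the coordinates is to build a single data.frame with one row per gene. The sketch below assumes that object is a GRangesList named by gene and that each gene sits on a single chromosome; the toy object and the name gene.table are made up for illustration.

```r
library(GenomicRanges)

# Toy GRangesList standing in for `object` (hypothetical coordinates)
object <- GRangesList(
  G1 = GRanges("chr1", IRanges(start = c(100, 300), end = c(200, 400))),
  G2 = GRanges("chr2", IRanges(start = 50, end = 80))
)

# start(object) and end(object) return one integer vector per gene, so
# min() and max() give one value per gene. seqnames(object) returns one
# Rle per gene; the first value is taken as the gene's chromosome.
gene.table <- data.frame(
  gene_id = names(object),
  chr     = sapply(seqnames(object), function(x) as.character(x)[1]),
  start   = min(start(object)),
  end     = max(end(object))
)
```

A table like this can then be turned back into a GRanges object with GRanges(gene.table$chr, IRanges(gene.table$start, gene.table$end)).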

Edit: I found a slightly different solution here: https://support.bioconductor.org/p/66003/

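library(GenomicFeatures)  # exonsBy(); makeTxDbFromGFF() (moved to txdbmaker in newer Bioconductor)
library(rtracklayer)      # export()

# Build a TxDb from the Cufflinks GTF and group the exons by gene
gtf = makeTxDbFromGFF("mygtf.gtf", format = "gtf")
gene = exonsBy(gtf, "gene")

# One range per gene; the gaps between those ranges are the intergenic regions
intergenic = gaps(unlist(range(gene)))
export(intergenic, "intergenic.gff", format = "gff")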
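
Since the original question was also about separating introns from intergenic regions, here is a rough sketch of one way to get both. It is not taken from the linked thread. The toy exons and the names spans, merged, introns and intergenic are made up for illustration, and strand is ignored on purpose, so a position counts as intronic only if it lies inside some gene span and is not covered by any exon of any gene.

```r
library(GenomicRanges)

# Toy exons for two genes on opposite strands (hypothetical coordinates)
exons <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 300, 600, 700),
                     end   = c(200, 400, 650, 800)),
  strand   = c("+", "+", "-", "-"),
  gene_id  = c("G1", "G1", "G2", "G2")
)

# One span per gene, then merge overlapping spans regardless of strand
spans  <- unlist(range(split(exons, exons$gene_id)))
merged <- reduce(spans, ignore.strand = TRUE)

# Intronic: inside a gene span but not covered by any exon
introns <- setdiff(merged, exons, ignore.strand = TRUE)

# Intergenic: outside every gene span; keep only the unstranded gaps
intergenic <- gaps(merged)
intergenic <- intergenic[strand(intergenic) == "*"]
```

If strand-specific regions are wanted instead, the ignore.strand arguments can be dropped, but then gaps() reports each strand separately and the result needs more careful filtering.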
ADD REPLYlink modified 5.1 years ago • written 5.1 years ago by jon.brate250