ROI file for MuSiC
8.3 years ago


I have a question about the ROI (region of interest) file for MuSiC. When I compare the ROI file provided by the MuSiC developers (Washington University) with one I built from the Ensembl GFF, the start position in the MuSiC ROI is always 1 base before the Ensembl GFF start, while the end position is 2 bases after the Ensembl GFF end.

Why do the start and end positions differ between the MuSiC ROI and the Ensembl GFF? Do I need to change the start and end positions in my Ensembl-based ROI file (subtract 1 from the start and add 2 to the end)?

MuSiC (Washington University) ROI file

1       11867   12229   DDX11L1
1       12611   12723   DDX11L1
1       13219   14411   DDX11L1
1       29552   30041   MIR1302-11
1       30265   30669   MIR1302-11
1       30364   30505   MIR1302-11

Ensembl GFF based ROI file

1       11868   12227   DDX11L1
1       12612   12721   DDX11L1
1       13220   14409   DDX11L1
1       12009   12057   DDX11L1
1       12178   12227   DDX11L1
1       12612   12697   DDX11L1
Music next-gen • 1.7k views
8.3 years ago

An ROI (regions of interest) file for MuSiC (described here) must use 1-based start and stop loci. If your ROIs are spliced exons, it is also recommended to add flanks of at least 2 bp on either side of each exon, to account for splice acceptor and donor sites. The WashU ROI file you provided is consistent with both.

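But your Ensembl GFF based ROI file looks like a UCSC BED file of exon loci without the 2 bp splice-site flanks: the start loci are 0-based, while the stop loci are 1-based. A GFF must have 1-based start and stop loci, so something is wrong there. It would also explain the offsets you see: converting a 0-based start to 1-based adds 1, and the 2 bp flank then subtracts 2, for a net shift of -1 at the start, while the stop only gets the +2 flank.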
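If that diagnosis is right, the conversion can be sketched in a few lines of Python. This is only an illustration, not part of MuSiC: the function name `exon_to_roi` and the `flank` parameter are made up here, and it assumes your rows really are BED-like (0-based start, 1-based stop). It does not clamp the stop to the chromosome length.

```python
def exon_to_roi(chrom, start0, end1, gene, flank=2):
    """Turn a BED-like exon (0-based start, 1-based stop) into a
    1-based ROI with `flank` bp added on each side.

    Hypothetical helper for illustration; assumes BED-like input.
    """
    start1 = start0 + 1                 # 0-based start -> 1-based start
    roi_start = max(1, start1 - flank)  # add upstream flank, never below 1
    roi_end = end1 + flank              # stop is already 1-based; add flank
    return chrom, roi_start, roi_end, gene


# First three rows of the Ensembl-based file from the question
rows = [
    ("1", 11868, 12227, "DDX11L1"),
    ("1", 12612, 12721, "DDX11L1"),
    ("1", 13220, 14409, "DDX11L1"),
]

for row in rows:
    print("\t".join(str(field) for field in exon_to_roi(*row)))
```

For these three rows the result is 11867-12229, 12611-12723 and 13219-14411, which matches the first three lines of the WashU ROI file. If your source turns out to be a true GFF (already 1-based), skip the `+ 1` step and only add the flanks.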

