I originally asked this question in a private message but should have posted it here; Cyriac's answer follows below.
Thanks for your post about pathway file generation. We are having great luck with our data now. We have been using "ensembl_67_cds_ncrna_and_splice_sites_hg19" as our region of interest (ROI) for most of the MuSiC suite. We sequenced our tumors whole-genome at 60x coverage, and we have been trying to make an 'intergenic' ROI file in order to look at the rest of our data. We had two thoughts: one was to make an ROI that goes from base 1 to n of each chromosome; the other was to make a file that spans the regions between the exons of the file listed above. Is this a reasonable investigation, or is MuSiC designed more to look at exome data?
Thank you, David
PM from Cyriac Kandoth:
Hi David. You can post your question to Biostar. I'm sure the answer will be of interest to other users as well. In short: yes, it's a worthwhile investigation. We have previously used MuSiC's SMG test to find significantly altered non-coding regions... with mixed results, but interesting anyhow.
You can download intron loci in GTF format from Ensembl. Here is their latest from release 72. Look for the Human GTF at this link: http://useast.ensembl.org/info/data/ftp/index.html
It will need a bit of scripting to convert the GTF into a format that MuSiC likes.
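For anyone wondering what that scripting might look like, here is a minimal sketch in Python. It rests on several assumptions that are not stated in the answer above: that the ROI file MuSiC expects is tab-delimited with columns chromosome, start, stop, gene name (1-based coordinates); that the GTF carries `exon` records with a `gene_name` or `gene_id` attribute; and that intron-like regions can be taken as the gaps between a gene's merged exons. The function names (`read_exons`, `merge_intervals`, `intron_rois`, `write_rois`) are made up for illustration and are not part of MuSiC or Ensembl.

```python
import re
from collections import defaultdict

GENE_NAME_RE = re.compile(r'gene_name "([^"]+)"')
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def read_exons(gtf_lines):
    """Collect exon intervals per (chromosome, gene) from GTF lines."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records are used here
        chrom, start, end, attrs = fields[0], int(fields[3]), int(fields[4]), fields[8]
        # Prefer the gene symbol; fall back to the gene ID if no symbol is present.
        match = GENE_NAME_RE.search(attrs) or GENE_ID_RE.search(attrs)
        if match is None:
            continue
        exons[(chrom, match.group(1))].append((start, end))
    return exons


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def intron_rois(exons):
    """Return (chrom, start, stop, gene) rows for gaps between merged exons."""
    rows = []
    for (chrom, gene), intervals in exons.items():
        merged = merge_intervals(intervals)
        for (_, left_end), (right_start, _) in zip(merged, merged[1:]):
            rows.append((chrom, left_end + 1, right_start - 1, gene))
    rows.sort()  # sorting keeps rows from the same chromosome next to each other
    return rows


def write_rois(gtf_path, out_path):
    """Read a GTF file and write the assumed 4-column tab-delimited ROI file."""
    with open(gtf_path) as gtf, open(out_path, "w") as out:
        for row in intron_rois(read_exons(gtf)):
            out.write("\t".join(map(str, row)) + "\n")
```

Calling `write_rois("input.gtf", "intron_rois.tsv")` (both file names are placeholders) would write one row per gap. Because exons are merged across all transcripts of a gene, each row is a stretch that is not exonic in any annotated transcript of that gene. The sketch does not trim splice-site flanks or remove overlaps with exons of other genes, so check the output against your existing ROI file before running MuSiC on it.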