MuSiC ROI File
11.2 years ago
NickA5133 ▴ 30

Hey,

I am an undergraduate student working in a lab at Penn State, and I am having some difficulty with the Genome MuSiC tool. I was able to run the mutation-relation function successfully, but I am having trouble with the bmr calc-covg function. I am downloading mutation data from the TCGA database. Do I need to build the ROI file myself from data obtained from TCGA, or can I access that information directly?

Thank you

music genome • 4.1k views
11.1 years ago

The ROI file used by bmr calc-covg and bmr calc-bmr describes your regions of interest. These are typically all protein-coding regions, splice sites, and non-coding RNA genes, but you can get more creative with your definition of a "gene" to test for significantly altered protein domains, pathways, or gene sets. This tutorial - Installation of the MuSiC suite on unsupported linux distributions - lists some sample ROI files that are based on Ensembl 67 (Gencode 12). From the two options, choose the one that matches the reference build used by your downloaded TCGA MAF file (see column 4, per the MAF format).
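To see which build your MAF file reports, you can tally the values in column 4. Here is a minimal Python sketch, assuming a tab-delimited MAF with optional `#` comment lines followed by one header row. The function name `maf_builds` and the example rows are made up for illustration.

```python
from collections import Counter

def maf_builds(lines):
    """Count the values found in column 4 (the reference build) of a MAF file."""
    builds = Counter()
    header_seen = False
    for line in lines:
        # Skip blank lines and '#' comment lines (assumed MAF convention).
        if not line.strip() or line.startswith("#"):
            continue
        # The first remaining line is assumed to be the column header.
        if not header_seen:
            header_seen = True
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 3:
            builds[fields[3]] += 1
    return builds

# Hypothetical MAF excerpt, truncated to the first four columns.
example = [
    "#version 2.4\n",
    "Hugo_Symbol\tEntrez_Gene_Id\tCenter\tNCBI_Build\n",
    "TP53\t7157\texample.org\t37\n",
    "KRAS\t3845\texample.org\t37\n",
]
print(maf_builds(example))
```

For a real file, pass an open file handle instead of the list, e.g. `maf_builds(open("your_file.maf"))`, where the filename is a placeholder. If more than one build value shows up, the file mixes references and should be sorted out before picking an ROI file.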

One caveat is that bmr calc-bmr expects the gene names in the ROI file to match those used in column 1 of the MAF file. TCGA MAF files don't always use a consistent annotation database, so their gene names and loci will likely not all correspond to Ensembl 67. You can overlook this issue for well-studied genes, whose names and loci haven't changed in a while.
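A quick way to gauge how big the naming mismatch is: list the MAF gene names that never appear in the ROI file. This is a rough sketch, not part of MuSiC. It assumes the ROI file is tab-delimited with the gene name in the fourth column (chromosome, start, stop, gene), so check that against your own ROI file. The function names, gene `MADE_UP_GENE`, and coordinates are placeholders.

```python
def maf_genes(maf_lines):
    """Collect gene names from column 1 of a MAF file."""
    genes = set()
    header_seen = False
    for line in maf_lines:
        # Skip blank lines and '#' comment lines (assumed MAF convention).
        if not line.strip() or line.startswith("#"):
            continue
        # The first remaining line is assumed to be the column header.
        if not header_seen:
            header_seen = True
            continue
        genes.add(line.split("\t")[0].strip())
    return genes

def roi_genes(roi_lines, gene_col=3):
    """Collect gene names from an ROI file (gene column index is an assumption)."""
    genes = set()
    for line in roi_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > gene_col:
            genes.add(fields[gene_col])
    return genes

def genes_missing_from_roi(maf_lines, roi_lines):
    """Return MAF gene names that have no matching name in the ROI file."""
    return sorted(maf_genes(maf_lines) - roi_genes(roi_lines))

# Hypothetical inputs; coordinates are placeholders, not real loci.
maf_example = [
    "Hugo_Symbol\tEntrez_Gene_Id\tCenter\tNCBI_Build\n",
    "TP53\t7157\texample.org\t37\n",
    "MADE_UP_GENE\t0\texample.org\t37\n",
]
roi_example = [
    "17\t100\t200\tTP53\n",
    "12\t300\t400\tKRAS\n",
]
print(genes_missing_from_roi(maf_example, roi_example))
```

Names in the returned list are the ones to look up by hand, since a name missing from the ROI file may just be an older or alternate symbol for a gene that is present.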


Thank you for the quick response! That information helps me out a lot!

What purpose does the reference-sequence parameter serve, and what sequence(s) should be used for it? Is this the sequence of the reference build (e.g. hg19)?

Thanks for all the help!


Yes. The reference sequence should match whatever you're using in all the other files. Column 4 of a TCGA MAF file tells you whether it is in hg18 (build36) or hg19 (build37). Your BAM or WIG files, MAF files, and ROI files should all be based on the same reference sequence. MuSiC uses the reference sequence to determine the sequence context of variants in the MAF file, for example whether they lie at CpG sites or in AT or CG regions.
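To illustrate what "sequence context" means here, this toy Python function labels a position in a reference string as CpG, CG, or AT. It is only a sketch of the idea under my own simplified rules, not MuSiC's actual implementation. The function name and the short sequence are made up.

```python
def sequence_context(ref, pos):
    """Label the reference base at 0-based index pos as 'CpG', 'CG', 'AT' or 'other'."""
    base = ref[pos].upper()
    prev_base = ref[pos - 1].upper() if pos > 0 else ""
    next_base = ref[pos + 1].upper() if pos + 1 < len(ref) else ""
    # A C followed by G, or a G preceded by C, sits in a CpG dinucleotide.
    if (base == "C" and next_base == "G") or (base == "G" and prev_base == "C"):
        return "CpG"
    if base in {"C", "G"}:
        return "CG"
    if base in {"A", "T"}:
        return "AT"
    return "other"  # e.g. N or any other ambiguity code

# Hypothetical 8-base reference snippet.
ref = "ACGTTGCA"
for pos in range(len(ref)):
    print(pos, ref[pos], sequence_context(ref, pos))
```

This is why the reference has to match the build of the MAF: the same coordinate in hg18 and hg19 can point at different bases, which would give the wrong context for the variant.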
