How to run MISO with ENSEMBL GTF if reads were aligned against UCSC genome
9.6 years ago

Hi,

I aligned reads against the UCSC genome using STAR, with the UCSC GTF as the reference annotation. However, I want to quantify splicing events as annotated in the ENSEMBL GTF file using the MISO tool. The MISO homepage suggests re-aligning the reads against a genome build prepared with ENSEMBL in order to do so.

My question: if only the chromosome naming convention differs between UCSC and ENSEMBL, then changing the chromosome names in the ENSEMBL GTF file to match the UCSC convention should be sufficient. Why would I need to re-align all the reads?

I would highly appreciate it if someone could help me out.

RNA-Seq • 3.9k views

I am facing the same issue using UCSC hg19, ENSEMBL genome release 37, and STAR alignment. The chromosomes in the BAM files are named in the "chr1" style. I tried both hg19_ensGene.gff3 and Homo_sapiens.GRCh37.65.gff and got the same error above when using the pe-utils function. Does removing chrM from the BAM files help?

9.6 years ago
spiderdijon ▴ 20

The MISO documentation mentions that you can use any annotation, as long as you convert it to GFF3 format (http://miso.readthedocs.org/en/fastmiso/#answer10). Make sure that the chromosome naming convention in the annotation matches the one used in your BAM file. There should be no need to realign your sequence data.
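As a rough illustration of the renaming step, here is a minimal Python sketch that rewrites the first column of a GTF/GFF file from Ensembl-style names ("1", "X", "MT") to UCSC-style names ("chr1", "chrX", "chrM"). The function names are my own and are not part of MISO. It assumes a human annotation and only touches the primary chromosomes and the mitochondrion; scaffold and patch names are left unchanged because their UCSC equivalents do not follow a simple prefix rule.

```python
# Sketch only: function names are hypothetical, not part of MISO or any library.
# Assumes a human annotation (chromosomes 1-22, X, Y, MT).

PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def ensembl_to_ucsc(seqid):
    """Map an Ensembl-style sequence name to its UCSC-style equivalent."""
    if seqid in PRIMARY:
        return "chr" + seqid
    if seqid == "MT":
        return "chrM"
    # Scaffolds, patches, and already-converted names are returned unchanged.
    return seqid


def rename_gff_lines(lines):
    """Yield GTF/GFF lines with the sequence name (column 1) converted."""
    for line in lines:
        # Pass comments, directives, and blank lines through untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t")
        fields[0] = ensembl_to_ucsc(fields[0])
        yield "\t".join(fields)
```

To use it, open the Ensembl annotation for reading and a new file for writing, then pass the input file object to `rename_gff_lines` and write out each line it yields. Features on sequences that are not renamed will simply not match anything in a UCSC-aligned BAM file.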


Thanks. I have now changed the chromosome names in the ENSEMBL file to follow the UCSC convention. However, it is the MISO homepage itself that suggests re-aligning (http://miso.readthedocs.org/en/fastmiso/#alternative-event-annotations). I don't understand why.

9.6 years ago
bt27uk • 0

You didn't mention which version of the genome you are using. If it is hg19, then the difference in genome content you need to be aware of is the mitochondrial sequence (chrM in UCSC parlance). The Ensembl and GenBank GRCh37 release includes the mitochondrial reference NC_012920. The version included in UCSC hg19 is older, NC_001807. These are different sequences: for example, the former is 16569 bp and the latter is 16571 bp.

If you are not working with the mitochondrial sequence or it can be ignored, and you are happy with your mapping, then I can't see a problem proceeding.
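If you want to confirm which mitochondrial sequence your alignments used, or simply exclude mitochondrial features from the annotation, something like the following Python sketch could help. It assumes the SAM header has been saved as text (for example from `samtools view -H`), and the function names are made up for illustration. A chrM length of 16571 would point to the older UCSC hg19 sequence, while 16569 would point to NC_012920.

```python
# Sketch only: function names are hypothetical, not from MISO or samtools.

def parse_sq_lengths(header_text):
    """Return {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def drop_features(lines, excluded=("chrM", "MT")):
    """Yield GTF/GFF lines, skipping features on the excluded sequences."""
    for line in lines:
        # Comment lines are always kept; features are filtered on column 1.
        if line.startswith("#") or line.split("\t", 1)[0] not in excluded:
            yield line
```

Dropping the mitochondrial features avoids quantifying events whose annotation coordinates may not correspond to the sequence the reads were aligned against.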

Warning: I am only commenting on the genome contents here. I have no knowledge of the MISO tool.


Thanks for the explanation. I am using UCSC hg19 and ENSEMBL genome release 37.
