Question: Building a splice junction library for aligning mRNA against the rat genome: chrX_random.fa problem
8.7 years ago by
Jimbo120 wrote:


I am mapping RNAseq reads to the rat genome.

I am building

For each chromosome there is a chrX_random.fa fasta file with it.

Should I ignore these when building the splice junction libraries? It seems no genes map to the chrX_random.fa files anyway, according to the Ensembl annotation I got from UCSC (though I might be wrong about this).

I realise I should still keep them for aligning reads against.

I am following these instructions:

Many thanks.

Also: I am not using TopHat, for example, because my reads are 34 bp and the TopHat manual states explicitly: "The software is optimized for reads 75bp or longer."

I am also not sure I like the idea of TopHat realigning the originally unmapped reads in a "second round". Surely this is problematic with shorter reads, since they are more likely to map ambiguously.

Wouldn't it be better to align against the genome plus junctions in the same round, so that junctions get an "equal chance" of being mapped to as genomic regions? This seems especially relevant for short reads, which could map erroneously to pseudogenes more easily than longer or paired-end reads would.

genome rna • 2.9k views
8.4 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith18k wrote:

Yes, for such reads it may be a good idea to use a junction database. You can download a pre-computed, exon-exon junction database here: ALEXA-seq
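
For anyone who would rather build a small junction library by hand, here is a rough sketch of the general idea. It is not how ALEXA-seq does it. The function name, the 0-based half-open exon coordinates, and the 4-base minimum overhang are all assumptions made for illustration. Each junction sequence takes up to `read_len - min_overhang` bases from either side of the splice site, so that any read aligning fully within it must cover at least `min_overhang` bases of each exon.

```python
def junction_sequences(chrom_seq, exons, read_len=34, min_overhang=4):
    """Yield (label, sequence) for each pair of consecutive exons.

    chrom_seq    : chromosome sequence as a string (forward strand)
    exons        : sorted list of (start, end), 0-based half-open (assumed convention)
    read_len     : read length; 34 matches the reads in the question
    min_overhang : hypothetical minimum number of bases a read must place
                   on each side of the junction
    """
    flank = read_len - min_overhang
    for (s1, e1), (s2, e2) in zip(exons, exons[1:]):
        # Last `flank` bases of the upstream exon (fewer if the exon is short).
        left = chrom_seq[max(s1, e1 - flank):e1]
        # First `flank` bases of the downstream exon (fewer if the exon is short).
        right = chrom_seq[s2:min(e2, s2 + flank)]
        yield f"{e1}-{s2}", left + right


# Toy example with a made-up sequence and made-up exon coordinates.
toy_chrom = "AAAACCCCGGGGTTTTACGT"
toy_exons = [(0, 4), (8, 12), (16, 20)]
junctions = list(junction_sequences(toy_chrom, toy_exons, read_len=6, min_overhang=2))
for label, seq in junctions:
    print(label, seq)
```

Note that this sketch simply truncates the flank when an exon is shorter than `flank`; a fuller version would extend into the neighbouring exon, and would also handle minus-strand transcripts and non-adjacent exon pairs.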


I want to make an exon junction BED file for the CAST mouse strain. I can't get ALEXA-seq to work because of an unknown host error when it connects to Ensembl via CVS, I imagine because it's quite old. Do you happen to have an updated version of the alternativeExpressionDatabase/ program? Also, Perl is legit :)

written 16 months ago by QVINTVS_FABIVS_MAXIMVS
8.3 years ago by
Dm Church30
United States
Dm Church30 wrote:

In many cases, the chr*_random sequences do contain annotation. This is true even in human and mouse (two well-curated, high-quality assemblies). These are simply sequences that can't be ordered and oriented on the assembly, but they may still contain features.
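
One quick way to check this for a particular annotation is to count the records whose sequence name ends in `_random`. Below is a minimal sketch, assuming a tab-delimited file (GTF or BED) with the sequence name in the first column. The function name and the tiny example records are made up for illustration and are not real rat annotation.

```python
from collections import Counter
import io


def count_features_by_seqname(handle, suffix="_random"):
    """Count annotation records per sequence name, keeping names that end in `suffix`."""
    counts = Counter()
    for line in handle:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        seqname = line.split("\t", 1)[0]
        if seqname.endswith(suffix):
            counts[seqname] += 1
    return counts


# Made-up GTF-style records, for illustration only.
example = io.StringIO(
    'chr1\tensGene\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n'
    'chr1_random\tensGene\texon\t50\t90\t.\t-\t.\tgene_id "g2";\n'
    'chr1_random\tensGene\texon\t150\t190\t.\t-\t.\tgene_id "g2";\n'
)
random_counts = count_features_by_seqname(example)
print(dict(random_counts))
```

With a real file, you would pass an open file handle instead of the `io.StringIO` object. An empty result would suggest that this particular annotation has nothing on the random sequences.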
