Although I'm very inexperienced with bioinformatics, what I'm trying to do is very straightforward. I want to align my MiSeq mRNA-seq reads to the mouse transcriptome.
Thus far, I've downloaded the Ensembl GRCm38dna.fa genome file and indexed it with bowtie2-build.
I've also downloaded the Ensembl GRCm38.85.GTF file for transcriptome annotation.
To run tophat, I'm using the following command (default parameters):
tophat2 -G MusGRCm3885.gtf MusGRCm38dna 560RF.fastq
However, the run fails with the error shown at the end of the log below.
I'm not quite sure what's going on. The computer I'm using has ~4 GB of RAM. Should I change the min length to <50, considering my mRNA snippets are ~30 bases?
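Before changing any length setting, it may help to confirm what read lengths the FASTQ file actually contains, since the TopHat log reports min. length=50 and max. length=50 rather than ~30. The snippet below is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record; the function name read_length_histogram is made up for illustration.

```shell
# Tabulate read lengths in an uncompressed, 4-line-per-record FASTQ file.
# Prints one "<length> <count>" pair per line, sorted by length.
read_length_histogram() {
    # Line 2 of every 4-line record is the sequence itself.
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}

# Run on the reads from the post only if the file is present.
if [ -f 560RF.fastq ]; then
    read_length_histogram 560RF.fastq
fi
```

If every read turns out to be 50 bases long, the ~30-base inserts are probably followed by adapter sequence, which is a separate question from the length threshold.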
[2016-09-16 13:05:51] Checking for Bowtie
    Bowtie version: 184.108.40.206
[2016-09-16 13:05:52] Checking for Bowtie index files (genome)..
[2016-09-16 13:05:52] Checking for reference FASTA file
[2016-09-16 13:05:52] Generating SAM header for MusGRCm38dna
[2016-09-16 13:07:06] Reading known junctions from GTF file
[2016-09-16 13:07:33] Preparing reads
    left reads: min. length=50, max. length=50, 20782969 kept reads (99 discarded)
[2016-09-16 13:12:10] Building transcriptome data files ./tophat_out/tmp/MusGRCm3885
[2016-09-16 13:13:41] Building Bowtie index from MusGRCm3885.fa
[2016-09-16 13:30:36] Mapping left_kept_reads to transcriptome MusGRCm3885 with Bowtie2
[2016-09-16 13:47:27] Resuming TopHat pipeline with unmapped reads
[2016-09-16 13:47:27] Mapping left_kept_reads.m2g_um to genome MusGRCm38dna with Bowtie2
[2016-09-16 14:33:35] Mapping left_kept_reads.m2g_um_seg1 to genome MusGRCm38dna with Bowtie2 (1/2)
[2016-09-16 15:30:31] Mapping left_kept_reads.m2g_um_seg2 to genome MusGRCm38dna with Bowtie2 (2/2)
[2016-09-16 15:55:52] Searching for junctions via segment mapping
    Coverage-search algorithm is turned on, making this step very slow
    Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.
    [FAILED]
Error: segment-based junction search failed with err =-9
    found 0 potential small insertions
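The log itself suggests rerunning with --no-coverage-search. A status of -9 often means the process was killed by signal 9 (SIGKILL), which on Linux commonly happens when a machine runs out of memory. That would fit a ~4 GB machine, but it is an assumption here and not something the log confirms. The sketch below rebuilds the command from the post with that one option added. The wrapper function tophat_rerun, the RUN switch, and the output directory name tophat_out_nocov are made up for illustration; -o only keeps the new run separate from ./tophat_out.

```shell
# Same tophat2 call as in the post, plus --no-coverage-search as the log suggests.
# By default the command is only printed; set RUN=1 to actually execute it.
tophat_rerun() {
    # Load the full command line into the function's positional parameters.
    set -- tophat2 --no-coverage-search -o tophat_out_nocov \
        -G MusGRCm3885.gtf MusGRCm38dna 560RF.fastq
    if [ "${RUN:-0}" = "1" ]; then
        "$@"          # execute the command
    else
        echo "$*"     # dry run: print the command only
    fi
}

tophat_rerun
```

Printing the command first makes it easy to check the file names before committing to another multi-hour run.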