I've been trying to map my RNA-seq reads onto a single gene (e.g. GFP) rather than an entire genome, and I've run into a problem at the junction/splice indexing step. Here is the TopHat log:
[2015-03-24 19:56:39] Beginning TopHat run (v2.0.10)
-----------------------------------------------
[2015-03-24 19:56:39] Checking for Bowtie
Bowtie version: 2.1.0.0
[2015-03-24 19:56:39] Checking for Samtools
Samtools version: 0.1.19.0
[2015-03-24 19:56:39] Checking for Bowtie index files (genome)..
[2015-03-24 19:56:39] Checking for reference FASTA file
Warning: Could not find FASTA file ./fly_genome/GFP.fa
[2015-03-24 19:56:39] Reconstituting reference FASTA file from Bowtie index
Executing: /Users/michael_song/Bioinformatics//bowtie2-2.1.0/bowtie2-inspect ./fly_genome/GFP > mapping_260_GFP/tmp/GFP.fa
[2015-03-24 19:56:39] Generating SAM header for ./fly_genome/GFP
[2015-03-24 19:56:39] Preparing reads
left reads: min. length=100, max. length=100, 44560810 kept reads (170651 discarded)
right reads: min. length=100, max. length=100, 44724513 kept reads (6948 discarded)
[2015-03-24 20:46:52] Mapping left_kept_reads to genome GFP with Bowtie2
[2015-03-24 21:31:56] Mapping left_kept_reads_seg1 to genome GFP with Bowtie2 (1/4)
[2015-03-24 21:34:54] Mapping left_kept_reads_seg2 to genome GFP with Bowtie2 (2/4)
[2015-03-24 21:38:16] Mapping left_kept_reads_seg3 to genome GFP with Bowtie2 (3/4)
[2015-03-24 21:41:30] Mapping left_kept_reads_seg4 to genome GFP with Bowtie2 (4/4)
[2015-03-24 21:44:48] Mapping right_kept_reads to genome GFP with Bowtie2
[2015-03-24 22:24:16] Mapping right_kept_reads_seg1 to genome GFP with Bowtie2 (1/4)
[2015-03-24 22:26:52] Mapping right_kept_reads_seg2 to genome GFP with Bowtie2 (2/4)
[2015-03-24 22:29:59] Mapping right_kept_reads_seg3 to genome GFP with Bowtie2 (3/4)
[2015-03-24 22:33:05] Mapping right_kept_reads_seg4 to genome GFP with Bowtie2 (4/4)
[2015-03-24 22:36:18] Searching for junctions via segment mapping
**Warning: junction database is empty!**
[2015-03-24 22:48:56] Retrieving sequences for splices
[2015-03-24 22:48:56] Indexing splices
**[FAILED]
Error: Splice sequence indexing failed with err =1**
The weird thing is that I only get this error with one of four similar libraries... I've already mapped all four libraries to a complete genome (D. melanogaster), so I'm puzzled why I'm getting this error with only one library.
I would be sincerely grateful if anyone in the community could share their thoughts about this!
Hi Devon, thanks for your response! I've tried running the analysis with the latest version of tophat2 (2.0.14), and I still get the same result. Any ideas on what else I could troubleshoot?
You might try a different aligner (STAR, HISAT, etc.). Presumably there simply are no splice junctions in this dataset, and that's what is killing tophat2. I would strongly recommend looking at the results in IGV or a similar browser, since it's quite likely that the sample that's causing this should be excluded.
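If you'd like a quick number alongside the browser view, one rough way to test the "no splice junctions" idea is to count how many alignments in the libraries that did finish carry an `N` operation in their CIGAR string (in SAM, `N` marks a skipped region, i.e. a spliced alignment). This is only a sketch: the function name `count_spliced` is made up for illustration, and it assumes you feed it SAM text, for example the output of `samtools view` on a finished run's `accepted_hits.bam`.

```python
import re

# CIGAR operations as defined in the SAM format; "N" is a skipped region,
# which for RNA-seq alignments means the read spans a splice junction.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_spliced(sam_lines):
    """Return (mapped, spliced) counts for an iterable of SAM text lines.

    Hypothetical helper for illustration; not part of TopHat or samtools.
    """
    mapped = 0
    spliced = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # not a complete alignment record
            continue
        flag = int(fields[1])
        cigar = fields[5]
        if flag & 0x4 or cigar == "*":  # read is unmapped
            continue
        mapped += 1
        if any(op == "N" for _, op in CIGAR_OP.findall(cigar)):
            spliced += 1
    return mapped, spliced
```

To use it, pipe `samtools view accepted_hits.bam` into a small script that calls `count_spliced(sys.stdin)`. If the libraries that worked show a handful of spliced reads against GFP and the failing one would have none, that fits the empty junction database warning in the log.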