MapSplice2 Error: Bowtie Index not consistent with Reference Sequence
5.0 years ago
mh2354 • 0

I am attempting to use MapSplice2 for circRNA detection in mice. I downloaded the DNA sequences for each chromosome from Ensembl and built my Bowtie index from the Ensembl primary assembly file. Running MapSplice then gives this error:

Error: Reference name in Bowtie Index contains space: '1 dna:chromosome chromosome:GRCm38:1:1:195471971:1 REF' contains space

So I used sed to replace every space character " " with an underscore "_" in the headers of the chromosome FASTA files, and did the same for the primary assembly FASTA. I then rebuilt my Bowtie indices from the space-free FASTA files and received these errors:

Error: Bowtie Index not consistent with Reference Sequence '1_dna:chromosome_chromosome:GRCm38:1:1:195471971:1_REF' does not exist in Reference Sequence
Error: Bowtie Index not consistent with Reference Sequence '2_dna:chromosome_chromosome:GRCm38:2:1:182113224:1_REF' does not exist in Reference Sequence
Error: Bowtie Index not consistent with Reference Sequence '3_dna:chromosome_chromosome:GRCm38:3:1:160039680:1_REF' does not exist in Reference Sequence
Error: Bowtie Index not consistent with Reference Sequence '4_dna:chromosome_chromosome:GRCm38:4:1:156508116:1_REF' does not exist in Reference Sequence
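For reference, the header edit described above can be sketched in Python. This is only an illustration of the space-to-underscore step, not the exact sed command that was run; the function name `underscore_headers` is hypothetical.

```python
def underscore_headers(lines):
    """Yield FASTA lines with spaces in header lines replaced by underscores.

    Only lines starting with '>' are changed; sequence lines pass through as-is.
    """
    for line in lines:
        if line.startswith(">"):
            yield line.replace(" ", "_")
        else:
            yield line


if __name__ == "__main__":
    example = [
        ">1 dna:chromosome chromosome:GRCm38:1:1:195471971:1 REF\n",
        "NNNNNNNNNN\n",
    ]
    for out in underscore_headers(example):
        print(out, end="")
```

The same generator can wrap an open file handle, so a whole chromosome file can be rewritten line by line without loading it into memory.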

So I went to my reference sequences to look at the headers. Here they are:

==> 1.fasta <==

1_dna:chromosome_chromosome:GRCm38:1:1:195471971:1_REF

==> 2.fasta <==

2_dna:chromosome_chromosome:GRCm38:2:1:182113224:1_REF

==> 3.fasta <==

3_dna:chromosome_chromosome:GRCm38:3:1:160039680:1_REF

==> 4.fasta <==

4_dna:chromosome_chromosome:GRCm38:4:1:156508116:1_REF
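The listing above pairs file names such as 1.fasta with much longer header names. A check like the following reports every file whose first header differs from its base name. Whether MapSplice needs the two to match is an assumption here, since the detailed manual could not be consulted; the helper names and the accepted extensions are hypothetical.

```python
from pathlib import Path


def first_header(fasta_path):
    """Return the first FASTA header in a file without the leading '>'."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                return line[1:].rstrip("\n")
    return None  # no header line found


def mismatched_headers(directory, suffixes=(".fa", ".fasta")):
    """Map file name -> first header for files whose header != base name."""
    mismatches = {}
    for path in sorted(Path(directory).iterdir()):
        if path.suffix not in suffixes:
            continue
        header = first_header(path)
        # Assumption: the header is expected to equal the file's base name.
        if header != path.stem:
            mismatches[path.name] = header
    return mismatches
```

Calling `mismatched_headers` on the directory passed to `-c` would list each file name next to the header it actually contains.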

To summarize: my Bowtie indices were built from the Ensembl primary assembly file, and the chromosomal reference sequences were also downloaded from Ensembl. Running MapSplice on these gave the first error above. I then edited all of the headers in the chromosomal sequences and in the primary assembly to replace spaces with underscores, and rebuilt the Bowtie indices from the edited primary assembly. That led to the second error.

The detailed MapSplice2 manual has been down for days. Is there a problem in the MapSplice code where it checks that the Bowtie index matches the reference sequences? In both cases I am using Bowtie indices created from the reference that I pass to my MapSplice command.

Is there an example of what they mean by "Bowtie Index not consistent with Reference Sequence"? My indices were built directly with the bowtie-build command on the primary assembly, which is a single concatenated file containing all chromosomes and scaffolds. Do I need to run bowtie-build on each chromosome's sequence independently?
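One way to see what "not consistent" could mean in practice is to diff the sequence names stored in the index against the names in the reference files. The sketch below only compares two lists of names; it assumes the index names were obtained separately, for example from the output of `bowtie-inspect -n`, which prints one reference name per line. The function names are hypothetical, and this is not MapSplice's own check.

```python
def parse_names(text):
    """Split command output into a list of non-empty, stripped names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare_names(index_names, reference_names):
    """Report names present on one side only, sorted for stable output."""
    index_set = set(index_names)
    reference_set = set(reference_names)
    return {
        "only_in_index": sorted(index_set - reference_set),
        "only_in_reference": sorted(reference_set - index_set),
    }
```

If both lists come back empty, the names agree exactly; anything listed under `only_in_index` is a name the index knows but the reference files do not provide under that spelling.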

Here is my command line:

Mapsplice="/l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/MapSplice-v2.2.1"
data="/l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/data"
treatment="tetG2"
index="ACAGTG"
output="/l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/00_Alignment/MapSplice"

# Run MapSplice-v2.2.1
python $Mapsplice/mapsplice.py -1 $data/$treatment/all_$index.fq \
    -c /l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/mm10/chromosomes \
    -x /l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/00_Alignment/Bowtie_1/bt_mm10 \
    -p 8 \
    -o $output/$index/alignment \
    --min-fusion-distance 200 \
    --gene-gtf /l/Yu/YuLab/Bioinformatics/projects/mhills_circRNA-ADAR/mm10/mm10.ensembl.gtf \
    --fusion > $output/$index/MapSplice.out 2> $output/$index/MapSplice.err

RNA-Seq alignment • 1.5k views