I am trying to map RNA-Seq data to an assembled genome using TopHat (tophat-2.0.10.Linux_x86_64) and Bowtie2 (bowtie2-2.1.0). When I give TopHat2 the index of the repeat-masked genome (masking was done with RepeatModeler; genome size 78 Mb; repeats: 42%), it gives me the following error message:
bowtie2-inspect SCa_gtr_500_discarded_90_percent_Ns_ID_renamed.fasta.masked.index
assert_eq: expected (1816, 0x718) got (1536, 0x600)
bowtie2-inspect: bt2_inspect.cpp:218: void print_ref_sequences(std::ostream&, bool, const EList<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, 128>&, const uint32_t*, const std::string&): Assertion `0' failed.
However, when I do the same thing with the unmasked genome, it runs fine. My questions are:
a) Is this error common with repeat-masked genomes?
b) I need to predict genes using a reference-based RNA-Seq assembly, so should I really map the reads to the repeat-masked genome?
c) I am interested in finding genes on the repeat-masked genome. How can I fix this problem?
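In case it helps with diagnosing this, here is a minimal Python sketch for checking how a masked FASTA file was masked: hard-masked with N versus soft-masked with lowercase bases. The function name `masking_summary` and the toy sequences are made up for illustration, and it is only my assumption that the masking style is relevant to the error above.

```python
import io
from collections import Counter


def masking_summary(handle):
    """Count hard-masked (N/n), soft-masked (lowercase) and unmasked bases
    in a FASTA stream. Hypothetical helper, not part of TopHat or Bowtie2."""
    counts = Counter()
    for line in handle:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        for base in line:
            if base in "Nn":
                counts["hard_masked"] += 1
            elif base.islower():
                counts["soft_masked"] += 1
            else:
                counts["unmasked"] += 1
    return counts


if __name__ == "__main__":
    # Toy input standing in for the real masked genome file.
    demo = io.StringIO(">scaffold_1\nACGTNNNNacgt\n>scaffold_2\nNNNNACGT\n")
    summary = masking_summary(demo)
    total = sum(summary.values())
    for kind in ("unmasked", "soft_masked", "hard_masked"):
        print(kind, summary[kind], f"{summary[kind] / total:.1%}")
```

For the real genome, the `io.StringIO` demo would be replaced by an open file handle on the masked FASTA file.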
I would really appreciate your comments on this!
Best regards, Rahul