3.8 years ago by
Re mapping genomic DNA to a transcriptome: I would try LAST (http://last.cbrc.jp/). For any read spanning an intron-exon border, I believe more mainstream mappers will reject the mapping because of the mismatches (intronic sequence in the read vs. the next exon in your transcript). LAST, given a reasonably long exon-to-exon match, should accept the alignment and truncate your read. I have not done this myself in this exact scenario, but I have mapped RNA-Seq reads carrying a trans-splicing leader to a genome with LAST. Close enough, I hope.
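To illustrate why truncation matters at the border, here is a toy sketch in Python. It is not how LAST works internally. The sequences, the `matching_prefix_length` helper and the start position are all made up for illustration. A genomic read that runs from an exon into the intron matches the transcript only up to the exon end; the rest would have to be clipped.

```python
def matching_prefix_length(read, reference, start):
    """Count read bases that match `reference` from `start` until the first mismatch."""
    n = 0
    while (n < len(read)
           and start + n < len(reference)
           and read[n] == reference[start + n]):
        n += 1
    return n


# Hypothetical toy sequences, not real data.
exon1 = "ATGGCGTACGTTAGC"
intron = "GTAAGTCCTTAACAG"
exon2 = "CCTGATCGATTGACA"

transcript = exon1 + exon2           # spliced: the intron is absent
read = exon1[5:] + intron[:10]       # genomic read crossing the exon1/intron border

matched = matching_prefix_length(read, transcript, start=5)
clipped = len(read) - matched        # intronic tail a local aligner would truncate

print(f"read length {len(read)}, exonic match {matched}, clipped {clipped}")
```

A mapper that insists on aligning the whole read end to end would have to pay for the intronic tail as mismatches; a local aligner can report just the exonic part.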
Re mapping to a closely related genome: in a typical scenario the choice of mapper is crucial. You need something able to accept and report mappings with higher mismatch rates, but without going overboard and placing almost every read anywhere. Again, check out LAST, and also GEM (http://algorithms.cnag.cat/wiki/The_GEM_library).
Also, because taxonomies are still not based on sequence similarity, I would get all the available genomes (just 5) from the same PACMAD clade:
I think only the maize genome is of comparable size to Arundo. Pick the one FASTQ file with the best quality values from your data set and map it to all 5 genomes with at least the 2 mappers listed above. Then, assuming you can get soft-masked sequences for these 5 genomes, repeat the mapping against those. Hopefully you will map more than 2% of your reads, but obviously I cannot guarantee it.
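One rough way to pick the FASTQ with the best quality values is to compare mean Phred scores per file. The sketch below is only an illustration: it assumes uncompressed, 4-line-per-record FASTQ with Phred+33 encoding, and `mean_phred` and `best_fastq` are hypothetical helper names, not part of any tool mentioned above.

```python
import io


def mean_phred(handle, offset=33):
    """Mean per-base Phred score of a 4-line-per-record FASTQ stream (Phred+33 assumed)."""
    total = 0
    count = 0
    for i, line in enumerate(handle):
        if i % 4 == 3:                       # every 4th line holds the quality string
            quals = line.rstrip("\r\n")
            total += sum(ord(c) - offset for c in quals)
            count += len(quals)
    return total / count if count else 0.0


def best_fastq(paths):
    """Return (path with the highest mean quality, dict of all scores)."""
    scores = {}
    for path in paths:
        with open(path) as fh:
            scores[path] = mean_phred(fh)
    return max(scores, key=scores.get), scores


# Tiny in-memory example: 'I' encodes Q40 and '!' encodes Q0 in Phred+33.
good = mean_phred(io.StringIO("@r1\nACGT\n+\nIIII\n"))
mixed = mean_phred(io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n"))
print(good, mixed)
```

With real data you would call something like `best_fastq(["sample1.fastq", "sample2.fastq"])` (file names made up here). A dedicated QC tool gives a fuller picture, but a single mean is enough for picking one file to test with.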
Very long shot (a very drafty genome assembly): if you have 5x coverage for each of your individual mutants, and mutations are rare, you could pool all these data together, preferably after getting some $$$ for PacBio sequencing of the unmutated strain, and see what comes out. Even if you get just shattered mitochondrial and plastid sequences plus a big swarm of pathetically small contigs, you can map your individual samples back to this assembly and maybe get some idea about differences in coverage. Then cluster your mutants based on this (say, RPKMs for 0.5M contigs per sample) and check whether there are any patterns (assuming deletions).
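The per-contig RPKM comparison could be sketched roughly as below. This is a minimal illustration with invented contig names, lengths and read counts. RPKM is taken as reads x 10^9 / (contig length x total mapped reads), and Pearson correlation stands in for whatever similarity measure you end up clustering on.

```python
import math


def rpkm(counts, lengths):
    """Reads per kilobase of contig per million mapped reads, per contig."""
    total = sum(counts.values())
    return {c: counts[c] * 1e9 / (lengths[c] * total) for c in counts}


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


# Hypothetical contig lengths (bp) and mapped read counts per mutant.
lengths = {"c1": 1000, "c2": 2000, "c3": 500}
counts = {
    "mutA": {"c1": 100, "c2": 200, "c3": 0},    # no reads on c3: candidate deletion
    "mutB": {"c1": 90, "c2": 210, "c3": 0},     # same pattern as mutA
    "mutC": {"c1": 100, "c2": 150, "c3": 50},   # c3 covered
}

contigs = sorted(lengths)
profiles = {}
for sample, sample_counts in counts.items():
    values = rpkm(sample_counts, lengths)
    profiles[sample] = [values[c] for c in contigs]

r_ab = pearson(profiles["mutA"], profiles["mutB"])
r_ac = pearson(profiles["mutA"], profiles["mutC"])
print(f"mutA vs mutB: {r_ab:.3f}, mutA vs mutC: {r_ac:.3f}")
```

Mutants sharing a deletion should have more similar coverage profiles than unrelated ones. With 0.5M contigs and only 5x coverage the per-contig values will be noisy, so treat any clusters as hints to check, not as calls.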
Hope it helps.