I have paired-end (PE) whole-genome sequencing reads from an Arabidopsis mutant. What I know about this mutant is that: a) it has a T-DNA insertion somewhere, containing a GUS transgene; b) it has at least one other mutation, which causes a phenotype I am interested in.
What would be a possible strategy to identify where the GUS transgene is inserted, and also to find other possible insertions or deletions in the genome?
What I tried: I mapped the reads to the reference genome and assembled contigs from the unmapped reads. This way I could assemble a contig containing part of the T-DNA insertion, but it does not include the flanking genomic sequence, so I still don't know where the insertion is. I could find this contig simply by searching for the GUS sequence. However, I would also like to detect other possible insertions for which I have no known sequence to search with, and I don't know how to do that.
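For reference, the step of pulling unmapped reads out of the alignment for assembly can be sketched roughly as below. This is only a minimal illustration, assuming the alignments are available as plain SAM text; the function name `unmapped_reads_to_fastq` is hypothetical and not from any tool I used. It relies only on the standard SAM flag bits (0x4 = read unmapped, 0x10 = sequence stored reverse-complemented, 0x40 = first in pair).

```python
UNMAPPED = 0x4        # SAM flag bit: this read is unmapped
REVERSE = 0x10        # SAM flag bit: SEQ is stored reverse-complemented
FIRST_IN_PAIR = 0x40  # SAM flag bit: this is read 1 of the pair

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def unmapped_reads_to_fastq(sam_lines):
    """Yield one FASTQ record (as a string) per unmapped read in SAM text.

    Hypothetical helper for illustration; header lines and truncated
    records are skipped.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if not flag & UNMAPPED:
            continue  # keep only reads that did not map
        if flag & REVERSE:
            # Restore the orientation the read had coming off the sequencer.
            seq = seq.translate(_COMPLEMENT)[::-1]
            qual = qual[::-1]
        suffix = "/1" if flag & FIRST_IN_PAIR else "/2"
        yield f"@{name}{suffix}\n{seq}\n+\n{qual}\n"
```

Note that this keeps every unmapped read, whether or not its mate mapped, so both fully unmapped pairs and "orphan" reads end up in the assembly input.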
I also tried looking at all the reads that mapped but whose mate did not. I thought this would let me detect the insertion, but I end up with many such reads scattered all over the genome (I don't know whether it is normal to have so many reads with unmapped mates, or whether I used the wrong mapping parameters).
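To make the second attempt concrete, here is a rough sketch of the kind of filter I mean, extended with a simple windowed count so that local pile-ups of "anchor" reads (mapped read, unmapped mate) might stand out from the scattered background. It assumes plain SAM text as input; the function names and the `window`, `min_mapq` and `min_reads` values are arbitrary placeholders of mine, not recommended settings.

```python
from collections import Counter

PAIRED = 0x1          # read is part of a pair
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # the mate is unmapped
NOT_PRIMARY = 0x100 | 0x800  # secondary or supplementary alignment


def anchor_windows(sam_lines, window=500, min_mapq=20):
    """Count mapped reads with an unmapped mate per fixed-size window.

    Returns a Counter keyed by (chromosome, window_start), where
    window_start is a 0-based multiple of `window`.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        f = line.rstrip("\n").split("\t")
        if len(f) < 11:
            continue  # not a complete alignment record
        flag, chrom, pos, mapq = int(f[1]), f[2], int(f[3]), int(f[4])
        is_anchor = bool(
            flag & PAIRED
            and not flag & UNMAPPED
            and flag & MATE_UNMAPPED
            and not flag & NOT_PRIMARY
        )
        # Low-MAPQ anchors are dropped because repeats attract many of them.
        if is_anchor and mapq >= min_mapq:
            counts[(chrom, (pos - 1) // window * window)] += 1
    return counts


def candidate_sites(counts, min_reads=5):
    """Return (chrom, window_start, n) for windows with at least min_reads anchors."""
    hits = [(c, s, n) for (c, s), n in counts.items() if n >= min_reads]
    return sorted(hits, key=lambda hit: -hit[2])
```

With something like this, isolated anchors would fall below `min_reads` and only windows with several anchors would be reported, but I am not sure whether this kind of thresholding is a sound way to separate a real insertion site from noise.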