Hello
I have two sets of reads that were obtained from RNA bulks of resistant and susceptible wheat recombinant inbred lines. I am looking for polymorphisms between the bulks. Based on previous results, I suspect that the polymorphism associated with the resistance is a large insertion of 1-5 kb (up to a whole gene). I also suspect that this insertion is not present in the reference genome. I am looking for software that can detect large structural variations using de novo assembly.
Can you please recommend which software to use?
Thank you
I don't think RNA-seq is the most appropriate technology for this, but now that the data has already been generated, we had better make the best of it.
I'll assume you don't have a reference genome since you want to go for de novo assembly?
So you expect an insertion of an entire new gene? That gene should be expressed before you will see anything in the transcriptome, right?
Thank you for the reply. I expect an insertion of an entire new gene; I know that resistance genes in many cases show presence/absence variation. I expect the gene to be expressed because the resistance was active at the time the samples were taken. However, the expression levels may be low. I do have a reference genome, and I have conducted SNP and short indel discovery with this data. This got us closer to the target, but looking at the reference genome we found no candidate genes.
An entire new gene wouldn't map to the reference genome, so I would try a de novo assembly of the unmapped reads.
Note that I haven't done anything like this before - it's my first intuition ;)
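To make the "unmapped reads" step concrete, here is a minimal sketch in plain Python, assuming the alignments are available as uncompressed SAM text. It keeps every record whose FLAG has the 0x4 ("read unmapped") bit set and writes it out as FASTQ. The function name and the toy records are made up for illustration; in practice a dedicated tool (for example `samtools fastq -f 4`) would do this on BAM files, and aligner-specific options for writing unaligned reads may be more convenient.

```python
import io

UNMAPPED = 0x4         # SAM FLAG bit: the read itself is unmapped
FIRST_IN_PAIR = 0x40   # SAM FLAG bit: first read of a pair
SECOND_IN_PAIR = 0x80  # SAM FLAG bit: second read of a pair


def unmapped_reads_to_fastq(sam_handle, fastq_handle):
    """Write reads with the 0x4 FLAG bit set as FASTQ; return how many were written."""
    written = 0
    for line in sam_handle:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if not flag & UNMAPPED:
            continue
        # Mark mates so that pairs can be separated again later.
        if flag & FIRST_IN_PAIR:
            name += "/1"
        elif flag & SECOND_IN_PAIR:
            name += "/2"
        fastq_handle.write(f"@{name}\n{seq}\n+\n{qual}\n")
        written += 1
    return written


# Toy example (hypothetical records): r1 is mapped, r2 and r3 are unmapped.
toy_sam = (
    "@HD\tVN:1.6\n"
    "r1\t0\tchr1A\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tFFFF\n"
    "r3\t77\t*\t0\t0\t*\t*\t0\t0\tTTAA\tIIII\n"
)
out = io.StringIO()
n_unmapped = unmapped_reads_to_fastq(io.StringIO(toy_sam), out)
print(n_unmapped)
print(out.getvalue(), end="")
```

The resulting FASTQ from each bulk could then be handed to a de novo assembler. Note that this sketch keeps a read even if its mate mapped, which may or may not be what you want.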
Yes, this is what I am looking for: de novo assembly and comparison of long sequences.
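One rough way to do the comparison step, once each bulk has its own assembly, is to look for resistant-bulk contigs that share almost no k-mers with the susceptible-bulk contigs. The sketch below is only an illustration under that assumption: the function names, the k-mer size and the `max_shared` threshold are arbitrary choices, not an established method, and the toy sequences are invented. A contig can also look "bulk-specific" simply because of low expression or a fragmented assembly, so any hit would need to be checked by aligning it back to the reads and the reference.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bulk_specific_contigs(resistant, susceptible, k=25, max_shared=0.1):
    """Return (name, shared_fraction) for resistant contigs that share at most
    max_shared of their k-mers with any susceptible contig (either strand)."""
    susceptible_kmers = set()
    for seq in susceptible.values():
        susceptible_kmers |= kmers(seq, k)
        susceptible_kmers |= kmers(revcomp(seq), k)
    hits = []
    for name, seq in resistant.items():
        contig_kmers = kmers(seq, k)
        if not contig_kmers:              # contig shorter than k
            continue
        shared = len(contig_kmers & susceptible_kmers) / len(contig_kmers)
        if shared <= max_shared:
            hits.append((name, shared))
    return hits


# Toy example with invented sequences and a tiny k so it fits on screen.
resistant_contigs = {"shared": "ACGTACGT", "novel": "GGGCCCGGGA"}
susceptible_contigs = {"s1": "ACGTACGTTT"}
candidates = bulk_specific_contigs(resistant_contigs, susceptible_contigs, k=4)
print(candidates)
```

For real transcript assemblies an alignment-based comparison would be more robust; the k-mer version is just the cheapest way to get a first shortlist.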
For RNA-seq, the most commonly used de novo assembler is Trinity.
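As a starting point, a paired-end Trinity run could be assembled like this. It is a sketch only: the FASTQ file names and the output directory are hypothetical, and the CPU and memory values are placeholders to adapt to your machine. The helper just builds the command line so you can inspect it before running; check `Trinity --help` for your installed version before relying on any option.

```python
import shlex


def build_trinity_command(left_fq, right_fq, out_dir, cpus=8, max_memory="50G"):
    """Build a basic paired-end Trinity command line as a list of arguments."""
    return [
        "Trinity",
        "--seqType", "fq",            # input reads are FASTQ
        "--left", left_fq,            # first mates
        "--right", right_fq,          # second mates
        "--CPU", str(cpus),
        "--max_memory", max_memory,
        "--output", out_dir,          # Trinity expects "trinity" in this name
    ]


# Hypothetical file names for the resistant bulk.
cmd = build_trinity_command(
    "resistant_unmapped_1.fq",
    "resistant_unmapped_2.fq",
    "trinity_resistant",
)
print(shlex.join(cmd))
# To actually run it (requires Trinity to be installed and on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Running it once per bulk gives two assemblies that can then be compared with each other and with the reference.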
What do you mean by "looking at the reference genome we found no candidate genes"? Did you see a large insertion or not? And why do you hope to get more relevant information from the transcriptome data?