Using the biomaRt R package, I was able to obtain the transcriptome of an organism, as well as all of its exon and UTR sequences. Using the Biostrings package, I processed these objects so that I can write them out as fasta files. However, I also need the intronic sequences, and I can't find a way to get them directly from biomaRt.
Assuming that whatever is present in the transcriptome fasta file but not in the exon and UTR fasta files is an intron, is there a way to extract it from the transcriptome fasta file using the Biostrings library? I can work with either the fasta files I wrote or the objects as returned by getBM(), whichever is easier.
If there is no way I can do that... what do you guys think I should do?
* Should I learn and use bedtools instead? Can I use the ranges obtained from biomaRt to extract the introns that way?
* Should I learn and use TxDb instead? I wonder if I can use the data as I have it right now.
* Should I be using base R instead? Something like gsub to mask or delete the exons and UTRs?
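To make the ranges idea concrete, here is a minimal base R sketch of what I have in mind. It is not tested on real data. It assumes I have the unspliced sequence of a gene plus exon coordinates relative to that sequence; the toy sequence, the coordinates, and the names `gene_seq`, `exons`, and `introns` are all made up for illustration. It also assumes the exons of a single transcript do not overlap.

```r
# Toy unspliced gene sequence and exon coordinates (1-based, inclusive).
# All values here are invented for illustration.
gene_seq <- "AAAACCCCGGGGTTTTAAAA"
exons <- data.frame(start = c(1, 9, 17), end = c(4, 12, 20))

# Introns are the gaps between consecutive exons, so sort by start first.
exons <- exons[order(exons$start), ]
n <- nrow(exons)

# Each intron starts one base after an exon ends and
# stops one base before the next exon starts.
introns <- data.frame(
  start = exons$end[-n] + 1,
  end   = exons$start[-1] - 1
)

# Drop empty gaps (directly adjacent exons).
introns <- introns[introns$start <= introns$end, ]

# substring() is vectorised over start and end positions.
intron_seqs <- substring(gene_seq, introns$start, introns$end)
print(intron_seqs)
```

If this is the right direction, I assume the same gap logic could be done with proper range objects instead of plain data frames, but I don't know whether that is the recommended way.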
Thanks in advance! Any input is totally appreciated.