I am looking into using Salmon, but I need to quantify immature transcripts. The Salmon examples I have seen all use a transcripts.fa file made of mature transcripts. I was thinking of making a custom transcripts.fa file that also includes unspliced transcripts. Is there any reason why this would be a bad idea? Is it likely to be accurate for very long but relatively low-abundance molecules, such as unspliced or partially spliced transcripts?
Sounds like an OK plan to me and I don't see any immediate issues with it.
The only caveat is that by including the introns you also add some less-unique sequence (e.g. repeats/TEs in introns) to the transcripts, so more reads may map ambiguously, but that should not be too much of an issue.
Tip: simply run bedtools getfasta on the genome with a BED file of your genic regions (add -s so that minus-strand genes are reverse-complemented) and you will get the desired unspliced transcript sequences.
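If you want to see what that extraction step amounts to, here is a minimal pure-Python sketch of it. It is not bedtools itself, just an illustration under some assumptions: the BED file has six tab-separated columns (chrom, start, end, name, score, strand) with 0-based half-open coordinates, the genome fits in memory, and the function names (read_fasta, unspliced_transcripts) are made up for this example.

```python
# Translation table used to reverse-complement minus-strand genes.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}.

    Only the first word of each header is kept as the name.
    """
    genome, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def unspliced_transcripts(genome, bed_lines):
    """Return {name: sequence} for each BED6 interval.

    Assumes 0-based, half-open coordinates and a strand column;
    minus-strand intervals are reverse-complemented.
    """
    out = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name, _score, strand = line.rstrip("\n").split("\t")[:6]
        seq = genome[chrom][int(start):int(end)]
        out[name] = reverse_complement(seq) if strand == "-" else seq
    return out


if __name__ == "__main__":
    # Toy genome and gene intervals, for illustration only.
    genome = read_fasta([">chr1", "AACCGGTTAC", "GT"])
    bed = ["chr1\t2\t8\tgeneA_unspliced\t0\t+", "chr1\t2\t8\tgeneB_unspliced\t0\t-"]
    for name, seq in unspliced_transcripts(genome, bed).items():
        print(f">{name}\n{seq}")
```

The resulting records can be appended to the mature transcripts.fa before building the Salmon index. Giving the unspliced entries distinct names (the "_unspliced" suffix here is just a convention for this example) keeps them separate from the spliced isoforms in the quantification output.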