I know there are a few posts here that raise specific questions about RNA-seq library prep protocols, but I was curious whether there is a comprehensive catalog of exactly which protocols exist and exactly what type of data each one produces (i.e. to what do the reads in the final FASTA/FASTQ files correspond). I'm somewhat confused by the different aspects of a protocol and how they compose: for example, whether a protocol is stranded or not, the relative orientation of the reads, and which strand each read (i.e. /1 and /2) comes from. Are the reads that end up in a FASTA/Q file always reported 5' to 3' with respect to the strand from which they are derived? Are mates pointing toward each other reported with respect to the same strand, opposite strands, or either depending on the protocol? What do people mean when they say reads are 'reversed': does this mean reverse complemented with respect to the mate, or that the read is actually reported in the 3' to 5' direction (i.e. reversed but not complemented)?
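To make the terminology in that last question concrete, here is a small Python sketch of the three operations I'm trying to tell apart. The sequence is made up and the function names are my own; it only illustrates the vocabulary and says nothing about what any particular protocol actually does.

```python
# Translation table mapping each DNA base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse(seq: str) -> str:
    """Same strand, read in the 3'->5' direction (reversed, not complemented)."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Opposite strand, base by base, without changing the order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Opposite strand, written 5'->3' (what a mate on the other strand would look like)."""
    return complement(seq)[::-1]


# Hypothetical fragment, written 5'->3' on its own strand.
fragment = "ATGGCCTTAA"

print("original          :", fragment)
print("reversed          :", reverse(fragment))
print("complemented      :", complement(fragment))
print("reverse complement:", reverse_complement(fragment))
```

So when someone says a read is 'reversed', I can't tell whether they mean the second or the fourth line of that output.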
Basically, I'm curious how all of these different variables interact to produce the read sequences that will be used for downstream analysis. If I want to communicate (to another person or, perhaps as importantly, to a piece of software) all of the details and constraints about which reads should map to which strands in which orientations, what is the most parsimonious way to do so? What is the minimum amount of information I need to convey? Is there a standard language or specification for representing this information? I'm sorry to ask such a broad question, but I'm a bit overwhelmed and trying to gain a comprehensive understanding of what, exactly, the reads in a file represent in light of the protocol under which they were prepared.
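To show the kind of description I have in mind, here is a rough Python sketch. Everything in it is hypothetical: the class and field names (`LibraryLayout`, `orientation`, `read1_strand`) are my own invention and not an existing standard, and the rule that inward- or outward-facing mates lie on opposite strands while "matching" mates lie on the same strand is my assumption. Whether these are even the right variables is part of what I'm asking.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set, Tuple


class Orientation(Enum):
    INWARD = "inward"      # mates point toward each other
    OUTWARD = "outward"    # mates point away from each other
    MATCHING = "matching"  # mates point in the same direction


class Strand(Enum):
    FORWARD = "+"
    REVERSE = "-"

    def flip(self) -> "Strand":
        return Strand.REVERSE if self is Strand.FORWARD else Strand.FORWARD


@dataclass(frozen=True)
class LibraryLayout:
    """Hypothetical minimal description of a library; not a real specification."""
    paired: bool
    stranded: bool
    orientation: Optional[Orientation] = None  # only meaningful if paired
    read1_strand: Optional[Strand] = None      # only meaningful if stranded

    def __post_init__(self) -> None:
        if self.paired and self.orientation is None:
            raise ValueError("paired libraries need an orientation")
        if self.stranded and self.read1_strand is None:
            raise ValueError("stranded libraries need the strand of read 1")

    def allowed_strand_pairs(self) -> Set[Tuple[Strand, Optional[Strand]]]:
        """(read 1 strand, read 2 strand) pairs relative to the transcript."""
        # Unstranded: read 1 may come from either strand.
        firsts = [self.read1_strand] if self.stranded else list(Strand)
        pairs = set()
        for s1 in firsts:
            if not self.paired:
                pairs.add((s1, None))
            elif self.orientation is Orientation.MATCHING:
                pairs.add((s1, s1))         # assumption: same strand
            else:
                pairs.add((s1, s1.flip()))  # assumption: opposite strands
        return pairs


# Example: paired, inward-facing, stranded, read 1 from the reverse strand.
layout = LibraryLayout(paired=True, stranded=True,
                       orientation=Orientation.INWARD,
                       read1_strand=Strand.REVERSE)
print(layout.allowed_strand_pairs())
```

If three or four fields like these really are sufficient, that would already answer most of my question; if not, I'd like to know what is missing.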