Hi,
My aim was to use Term-Seq to map RNA 3' ends across the genome, and then to use standard RNASeq to measure the drop in depth across these 3' sites as a measure of transcription read-through at terminators.
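To make the read-through measure concrete, here is a minimal sketch of what I mean by "drop in depth". It assumes a per-base, strand-specific RNASeq depth array is already available; the function name, the 0-based indexing and the 50 bp window are illustrative assumptions, not part of any published method.

```python
from statistics import mean


def readthrough_ratio(depth, site, strand, window=50):
    """Ratio of mean depth downstream to mean depth upstream of a 3' site.

    depth  : per-base RNASeq depth for the transcript's strand (0-based list)
    site   : 0-based index of the last transcribed base (the mapped 3' end)
    strand : '+' or '-' (strand of the transcript)
    window : number of bases averaged on each side (assumed value)

    A ratio near 0 means little read-through; near 1 means no drop in depth.
    """
    if strand == "+":
        # Upstream is to the left and includes the 3' end itself
        upstream = depth[max(0, site - window + 1): site + 1]
        downstream = depth[site + 1: site + 1 + window]
    elif strand == "-":
        # On the minus strand the transcript runs right to left
        upstream = depth[site: site + window]
        downstream = depth[max(0, site - window): site]
    else:
        raise ValueError("strand must be '+' or '-'")

    if not upstream or not downstream:
        raise ValueError("window falls outside the depth array")

    up = mean(upstream)
    if up == 0:
        return float("nan")  # no upstream signal, ratio undefined
    return mean(downstream) / up
```

The ratio is only meaningful if the 3' site and the RNASeq coverage agree on where the transcript ends, which is why the offset described below matters.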
For the Term-Seq library prep:
- RNA adapter is ligated to RNA 3' ends
- Reverse transcription
- Sequencing (read passes through RNA 3' end)
As far as I know, the RNASeq was done with conventional Illumina paired-end, stranded methods.
I am finding that the 3' locations identified by Term-Seq are consistently ~38 bp downstream of the apparent RNASeq 3' locations. This is approximately half the Term-Seq read length (75 bp), though I'm not sure whether that's a coincidence. The RNASeq is PE150. I can't think of any reason for this offset. I think I've ruled out anything relating to the computational workflow, as I checked that the sequences of the original unprocessed reads match the predicted 3' locations for both Term-Seq and RNASeq. I can only presume it is something technical in the library prep, although unfortunately, as this was done by a company, methodological detail is a bit lacking.
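For clarity, this is roughly how the offset can be expressed so that "downstream" means the same thing on both strands. It is a sketch only: the function name is hypothetical and the coordinates in the usage lines are made up for illustration, not taken from my data.

```python
from statistics import median


def signed_offset(termseq_end, rnaseq_end, strand):
    """Distance from the RNASeq 3' end to the Term-Seq 3' end, in bp.

    Positive values mean the Term-Seq end lies downstream of the RNASeq
    end relative to the direction of transcription.
    """
    if strand == "+":
        return termseq_end - rnaseq_end
    if strand == "-":
        # Downstream on the minus strand means a lower genomic coordinate
        return rnaseq_end - termseq_end
    raise ValueError("strand must be '+' or '-'")


# Illustrative (made-up) sites: (Term-Seq end, RNASeq end, transcript strand)
example_sites = [(1038, 1000, "+"), (4962, 5000, "-"), (9040, 9003, "+")]
offsets = [signed_offset(t, r, s) for t, r, s in example_sites]
typical_offset = median(offsets)
```

If the sign were handled inconsistently between strands, the offsets for plus and minus strand transcripts would have opposite signs rather than agreeing.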
Does anyone know of anything that might cause this?
Any help much appreciated
Additional edit: I have experimentally verified a couple of the Term-Seq 3' ends by in vitro transcription, so I would guess it's the Term-Seq ends that are correct
Without knowing the details of the Term-seq method, could the 38 nt shift be accounted for by the difference between the position of the ribosome and the length of the terminator loop? Essentially, what's the stem-loop length? Also, it's curious that this paper lists the median distance between the closest upstream CDS and the intrinsic terminators as 38 nt for certain organisms. Which bacterium are you working with? And are you sure you're differentiating 3' and 5' ends correctly from your alignments?
Hi thanks for your answer
Interesting idea about ribosome length, although I can't think why this should affect the apparent 3' position by sequencing. Are you thinking the ribosome itself or the terminator loop is blocking RNASeq coverage reaching the end of the transcript? Or affecting exonuclease activity at the 3' end? Why should this affect 3' ends determined by one method but not the other?
Although the Term-Seq and RNASeq were not performed on the same biological samples, the extraction method was the same, including phenol, which I guess should disrupt any ribosomes.
Unfortunately my bacterial species wasn't covered by that paper
RE 5' vs 3' ends - I think I'm getting it right. According to the notes from the company that did the sequencing, the Term-Seq read should map to the opposite strand from the transcript, and the 5' end of the read corresponds to the transcript 3' end. I confirmed for some highly expressed genes that Term-Seq coverage is found on the correct (opposite) strand from the template. I then assumed that for plus strand transcripts (Rev strand Term-Seq reads), the transcript 3' end is at the downstream boundary of the Term-Seq peak coverage, i.e. its highest genomic coordinate. Conversely, for minus strand transcripts (Fwd strand Term-Seq reads), the transcript 3' end is again at the downstream boundary of the Term-Seq peak coverage, where downstream is relative to the transcript, i.e. the lowest genomic coordinate. Hope that makes sense... see below
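The strand logic described above can be written out as a small sketch. It assumes what the company's notes state (the read maps to the strand opposite the transcript, and the read's 5' end marks the transcript 3' end); the function name and the 1-based inclusive coordinates are my own illustrative choices.

```python
def transcript_3prime_end(read_start, read_end, read_strand):
    """Infer the transcript strand and 3' end from one Term-Seq alignment.

    read_start, read_end : leftmost and rightmost aligned genomic
                           coordinates (1-based, inclusive; assumed)
    read_strand          : '+' (Fwd) or '-' (Rev), the strand the read maps to

    Returns (transcript_strand, three_prime_position).
    """
    if read_strand == "-":
        # Rev strand read: its 5' end is the rightmost coordinate,
        # and the transcript is on the plus strand
        return "+", read_end
    if read_strand == "+":
        # Fwd strand read: its 5' end is the leftmost coordinate,
        # and the transcript is on the minus strand
        return "-", read_start
    raise ValueError("read_strand must be '+' or '-'")
```

For example, under these assumptions a 75 bp read aligned to the Rev strand at 1000-1074 would give a plus strand transcript ending at 1074, while the same interval on the Fwd strand would give a minus strand transcript ending at 1000.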