Finding the read ratio of transcription termination data using R
5 weeks ago
margo ▴ 20

How would I write code that finds sites where the read ratio of the -1 site (the predicted transcription termination site) to the +1 site (1 nt downstream of the transcription termination site) is greater than 1.1? I have an annotation in an R data frame with the columns ‘seqnames’, ‘start’, ‘end’, ‘strand’ and ‘predicted_term’. I also have count files in the form of bedgraph files for the positive and negative strands, which look like this:

I am trying to write code that scans through my bedgraph files and counts reads at the transcription termination site of each gene. I have written the following code so far:

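count_TTS <- function(df) {
  # One entry per gene: count rows between the gene end and the predicted terminator
  out <- vector("list", nrow(df))
  for (i in seq_len(nrow(df))) {
    if (df[i, "strand"] == "-") {
      # Minus strand: the transcript ends at the lowest coordinate
      TES_start <- df[i, "start"]
      tmp <- n_bed
    } else {
      TES_start <- df[i, "end"]
      tmp <- p_bed
    }
    TESrange <- sort(TES_start:df[i, "predicted_term"])
    # Note: this selects rows by row number, so it assumes row k holds position k
    tmp <- tmp[TESrange, ]
    tmp <- tmp[!is.na(tmp$V2), ]
    out[[i]] <- tmp
  }
  out
}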


I now want to extend this code to find sites where the read ratio of the -1 site (the predicted transcription termination site) to the +1 site (1 nt downstream of the transcription termination site) is greater than 1.1. Can anyone help?
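To make the requested quantity concrete, here is a minimal sketch of the -1/+1 ratio for a single site. It is not taken from the post: the example table is invented, the helper names `coverage_at` and `term_ratio` are hypothetical, and it assumes the bedgraph was read with read.table() so that V1 to V4 are chromosome, 0-based start, exclusive end and count, while ‘predicted_term’ is a 1-based coordinate.

```r
# Hypothetical bedgraph table: V1 = chromosome, V2 = start (0-based),
# V3 = end (exclusive), V4 = read count. The values are made up.
p_bed <- data.frame(
  V1 = "chr1",
  V2 = c(99, 100, 101),
  V3 = c(100, 101, 105),
  V4 = c(30, 22, 10)
)

# Count covering a 1-based position, or 0 if no interval covers it.
coverage_at <- function(bed, chrom, pos) {
  hit <- bed$V1 == chrom & bed$V2 < pos & bed$V3 >= pos
  if (any(hit)) bed$V4[which(hit)[1]] else 0
}

# Ratio of reads at the predicted terminator (-1 site) to the next
# downstream nucleotide (+1 site). On the "-" strand, downstream means
# the lower coordinate. Returns NA when the +1 site has no reads.
term_ratio <- function(bed, chrom, term, strand) {
  step <- if (strand == "-") -1 else 1
  up <- coverage_at(bed, chrom, term)
  down <- coverage_at(bed, chrom, term + step)
  if (down == 0) NA_real_ else up / down
}

r <- term_ratio(p_bed, "chr1", 101, "+")
passes <- !is.na(r) && r > 1.1
```

Returning NA for a zero denominator is one possible choice; adding a pseudocount to both sites would be another.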

ratio gtf igv R • 162 views
5 weeks ago
Trivas ▴ 440

Please do not use bedgraph files for this type of analysis. I've used BED files and bedtools coverage to calculate the coverage within specific regions, converted those numbers to RPK (reads per kilobase), and then calculated the ratio. Furthermore, if you're looking at read-through transcription/termination defects, you are better off looking at different windows (see Fig 3 here https://journals.asm.org/doi/full/10.1128/MCB.00181-18).
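As a rough illustration of the RPK step and the ratio, the sketch below starts from per-window read counts such as a coverage tool would report. The data frame, its column names, the window lengths and the counts are all invented for the example, and the upstream/downstream direction of the ratio follows the question rather than anything stated in this answer.

```r
# Hypothetical per-gene read counts in a window upstream and a window
# downstream of the predicted terminator, with window lengths in bp.
windows <- data.frame(
  gene = c("geneA", "geneB"),
  up_reads = c(500, 120),
  up_len = c(200, 200),
  down_reads = c(50, 110),
  down_len = c(100, 100)
)

# Reads per kilobase: counts divided by window length in kb.
rpk <- function(reads, len_bp) reads / (len_bp / 1000)

windows$up_rpk <- rpk(windows$up_reads, windows$up_len)
windows$down_rpk <- rpk(windows$down_reads, windows$down_len)

# Upstream-to-downstream ratio, then keep genes above the 1.1 cutoff.
windows$ratio <- windows$up_rpk / windows$down_rpk
above_cutoff <- windows[windows$ratio > 1.1, c("gene", "ratio")]
```

Normalising by length matters here because the two windows need not be the same size.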

Finally, even if I'm totally misinterpreting your question, we cannot troubleshoot your code very easily. Please post text versions of the output of head() so we can see what all of your tables look like, including column names.


Thank you for getting back to me. I have converted my bedgraph file to a data frame. I want to write code that applies the function to each row. Is there a way to add code to the function I have written above to calculate the ratio? p_bed and n_bed are the names of my positive-strand and negative-strand count data frames.
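One way to apply a per-site calculation to every annotation row is mapply() over the relevant columns, as in the sketch below. The tables are invented, `cov_at` is a hypothetical helper, and the column layout of p_bed and n_bed (V1 to V4 as chromosome, 0-based start, exclusive end, count) is an assumption, since the real tables were not posted.

```r
# Hypothetical annotation and per-strand count tables.
df <- data.frame(
  seqnames = "chr1", start = c(10, 40), end = c(30, 60),
  strand = c("+", "-"), predicted_term = c(35, 38)
)
p_bed <- data.frame(V1 = "chr1", V2 = c(34, 35), V3 = c(35, 36), V4 = c(12, 4))
n_bed <- data.frame(V1 = "chr1", V2 = c(36, 37), V3 = c(37, 38), V4 = c(10, 9))

# Count at a 1-based position; 0 when no bedgraph interval covers it.
cov_at <- function(bed, chrom, pos) {
  hit <- bed$V1 == chrom & bed$V2 < pos & bed$V3 >= pos
  if (any(hit)) bed$V4[which(hit)[1]] else 0
}

# One ratio per annotation row, choosing the count table by strand.
# Downstream is term + 1 on "+" and term - 1 on "-".
df$ratio <- mapply(function(chrom, term, strand) {
  bed <- if (strand == "-") n_bed else p_bed
  step <- if (strand == "-") -1 else 1
  down <- cov_at(bed, chrom, term + step)
  if (down == 0) NA_real_ else cov_at(bed, chrom, term) / down
}, df$seqnames, df$predicted_term, df$strand, USE.NAMES = FALSE)

# Rows whose -1/+1 ratio exceeds 1.1.
hits <- df[!is.na(df$ratio) & df$ratio > 1.1, ]
```

mapply() avoids the explicit loop and returns one value per row, which can be stored directly as a new column.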