How do I identify and differentiate between unidirectional and bidirectional promoters
7.1 years ago
cbio ▴ 450

I have a set of genes that contain a protein of interest at the TSS. I would like to be able to separate these genes into two classes: genes with a unidirectional promoter, and genes with a bidirectional promoter.

I have access to paired-end GRO-seq data, but no RNA-seq data. Is there a way to do this?

ChIP-Seq GRO-seq next-gen bidirectional promoters • 2.2k views

Technically, do you want to get the 5' reads that run in opposite directions but overlap each other (or lie within a certain distance, let's say 400 bp)? Like enhancer RNAs, which are transcribed bidirectionally?

Yes, this is what I'd like to do. I had previously thought I could simply use bedtools to look for overlapping regions of the GRO-seq negative- and positive-strand coverage bedgraphs within 1 kb of annotated TSSs, but this did not work.

Do you have a separate file for the 5' reads? When you say paired-end data, do you know which reads originate from the 5' end of a transcript?

I do not have a separate file for these. What I currently have is a genomeCoverageBed bedgraph for each strand that contains the entire coverage (it is not restricted to 5' ends with the -5 option), which I generated using:

genomeCoverageBed -bg -strand + -ibam $infile -g $genome > $outdir/genomecoveragebed/$outfile3
genomeCoverageBed -bg -strand - -ibam $infile -g $genome | awk -F '\t' -v OFS='\t' '{ $4 = -$4; print $0 }' > $outdir/genomecoveragebed/$outfile4


I'm very new to GRO-seq, and the data wasn't generated by my lab, so getting information about its generation has been difficult at best.

If you have paired-end data, you somehow need to separate out the reads that originate from the 5' end. Otherwise you will not be able to identify bidirectional transcripts exactly. Anyway, if you would like to check which regions on the forward strand are close to regions on the reverse strand, you could use closestBed:

closestBed -a Fw_strand.bed -b Rv_strand.bed -d | awk -v OFS="\t" '{ if ($NF >= 0 && $NF <= 400) print $1, $2, $3 }' | sort -k1,1 -k2,2n | uniq | wc -l


But this won't be exclusive to bidirectional transcripts. In fact, it is not very meaningful at all, because paired-end reads generally map in fr or rf orientation, so you will definitely end up with many regions that are close to each other on the forward and reverse strands.
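
One way around the mate-orientation problem is to keep only one mate of each pair and reduce it to its 5' position. The following is a minimal sketch, not a tested pipeline: it assumes the alignments have been converted to BED6 with `bedtools bamtobed`, which appends /1 or /2 to paired-end read names. The function name `five_prime_ends` is made up for illustration, and which mate is sense to the nascent RNA is an assumption that depends on the library protocol.

```shell
# five_prime_ends [MATE]
# Reads BED6 on stdin and prints the 1-bp 5' end of every read whose name
# ends in /MATE (default: 1). The strand column is passed through unchanged,
# so MATE should be the mate that is sense to the nascent RNA
# (an assumption to verify for your library).
five_prime_ends() {
    awk -F '\t' -v OFS='\t' -v mate="${1:-1}" '
        $4 ~ ("/" mate "$") {
            if ($6 == "+") { s = $2;     e = $2 + 1 }   # plus strand: leftmost base
            else           { s = $3 - 1; e = $3 }       # minus strand: rightmost base
            print $1, s, e, $4, $5, $6
        }'
}
```

It could then be used along the lines of `bedtools bamtobed -i $infile | five_prime_ends 1 > ends.bed`, where `ends.bed` is a placeholder file name.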

Ask the people who generated the data whether they can tell you how to separate the reads that originate from 5' ends. Then I can tell you how to get bidirectional transcripts.
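
Once 5' ends are available as a BED6 file whose strand column reflects the strand of the nascent RNA, one possible way to split the genes into the two classes is to count sense 5' ends just downstream of each TSS and antisense 5' ends just upstream of it. This is only a rough sketch under those assumptions: the function name `classify_promoters`, the 400 bp window and the 5-read threshold are arbitrary choices for illustration, and the nested loop is only practical for a modest gene list.

```shell
# classify_promoters TSS.bed ENDS.bed [WINDOW] [MIN_READS]
# TSS.bed : BED6, one line per gene, with the TSS as a 1-bp interval
# ENDS.bed: BED6 of 1-bp read 5' ends, strand = strand of the nascent RNA
# Output  : gene name, sense count, antisense count, class
classify_promoters() {
    awk -F '\t' -v OFS='\t' -v w="${3:-400}" -v min="${4:-5}" '
        NR == FNR {                       # first file: remember each TSS
            n++
            chrom[n] = $1; pos[n] = $2; name[n] = $4; strand[n] = $6
            sense[n] = 0; anti[n] = 0
            next
        }
        {                                 # second file: tally read ends per TSS
            for (i = 1; i <= n; i++) {
                if (chrom[i] != $1) continue
                # d >= 0 means downstream of the TSS relative to the gene strand
                d = (strand[i] == "+") ? $2 - pos[i] : pos[i] - $2
                if ($6 == strand[i] && d >= 0 && d < w) sense[i]++
                else if ($6 != strand[i] && d < 0 && d >= -w) anti[i]++
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                if (sense[i] >= min && anti[i] >= min) cls = "bidirectional"
                else if (sense[i] >= min)              cls = "unidirectional"
                else                                   cls = "no_sense_signal"
                print name[i], sense[i], anti[i], cls
            }
        }' "$1" "$2"
}
```

A call such as `classify_promoters tss.bed ends.bed 400 5` (placeholder file names) prints one line per gene. The thresholds would need tuning against the sequencing depth of the actual data.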

7.1 years ago
ivivek_ngs ★ 5.2k

I believe that when you extract the list of genes from your data you have the strand information, right? Then you can tell which genes lie on which strand, + or -, which gives you a strand-specific feature, and you can grep your output by strand.

This will give you two lists of promoters, one on the + strand and one on the - strand. Once you have them, you can overlap the two lists to find bidirectional genes, since promoters that overlap at the level of RefSeq IDs or gene symbols should be shared between both strands. I believe this will help.
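
As a rough illustration of this annotation-based idea, the sketch below pairs each + strand TSS with any - strand TSS that lies at or upstream of it within a maximum distance, i.e. a head-to-head (divergent) arrangement. The function name `divergent_pairs`, the BED6 input with 1-bp TSS intervals, and the 1 kb default cutoff are assumptions for illustration, not something taken from the thread.

```shell
# divergent_pairs TSS.bed [MAX_DIST]
# TSS.bed: BED6, one line per gene, with the TSS as a 1-bp interval
# Output : chrom, minus-strand gene, plus-strand gene, TSS-to-TSS distance
divergent_pairs() {
    # Sort by chromosome and position; at equal positions put "-" before "+"
    LC_ALL=C sort -k1,1 -k2,2n -k6,6r "$1" |
    awk -F '\t' -v OFS='\t' -v maxd="${2:-1000}" '
        $1 != chrom { chrom = $1; m = 0 }    # new chromosome: forget stored minus-strand TSSs
        $6 == "-" { m++; mpos[m] = $2; mname[m] = $4; next }
        $6 == "+" {
            # walk back through minus-strand TSSs until they are too far away
            for (i = m; i >= 1 && $2 - mpos[i] <= maxd; i--)
                print $1, mname[i], $4, $2 - mpos[i]
        }'
}
```

Genes from the list of interest that appear in the output could be treated as bidirectional by annotation, and the remainder as unidirectional. This only reflects annotated genes and does not use the GRO-seq signal at all.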