Question: Need to systematically identify genes within some distance of a response element in the genome
8 days ago, rependo wrote:

I'm trying to find a program or script that can be used to systematically return a list of genes in the genome that fall within some specified distance of a response element.

I have two files to work with:

1.) A list of predicted response elements in the genome, identified using PoSSuM-search, that includes each element's genomic coordinates
2.) The annotation file from that same genome, which includes genomic coordinates for gene features

Ideally, I want to use the genomic coordinates for response elements in file (1) to pull out any gene in the annotation file that falls within a pre-specified distance of a response element (e.g., 100 kb).

Thank you!

To moderators: this same question was cross-posted on ResearchGate.

Tags: rna-seq, gene, genome

Have a look at bedtools slop to define windows around a set of coordinates (here, the response elements) and bedtools intersect to intersect those windows with the genes. Can you give an example of what the output should look like?

written 8 days ago by ATpoint
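
A minimal sketch of the bedtools slop / intersect route mentioned above, assuming BED-format inputs. The function name and file names are placeholders, and genome.sizes stands for the two-column chromosome-size file (name, length) that slop needs for its -g option.

```shell
# Hypothetical wrapper (name and arguments are illustrative only).
#   $1 = response elements (BED)   $2 = genes (BED)
#   $3 = chromosome sizes file     $4 = distance in bp (default 100000)
genes_near_elements() {
  elements=$1; genes=$2; sizes=$3; dist=${4:-100000}
  # Pad each element by $dist on both sides, then report every
  # padded element (-wa) together with each gene it overlaps (-wb).
  bedtools slop -i "$elements" -g "$sizes" -b "$dist" \
    | bedtools intersect -wa -wb -a - -b "$genes"
}

# Usage: genes_near_elements response-elements.bed genes.bed genome.sizes 100000
```

Because slop rewrites the interval, the element coordinates in the output are the padded ones. bedtools window with -w 100000 is an alternative that searches the same distance while reporting the original element coordinates.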

I'm figuring it out as I go, but ideally the output would be a .txt file, with columns defined as:

1.) Gene (one per line): all genes located within 100 kb of a response element identified in the input
2.) Strand
3.) Gene start position
4.) Gene strand
5.) Response element start position
6.) Response element strand

written 7 days ago by rependo
8 days ago, Alex Reynolds (Seattle, WA, USA) wrote:

Via BEDOPS bedmap:

$ bedmap --range 100000 --echo --echo-map response-elements.bed genes.bed > answer.bed

If you don't have genes in BED format, but in GFF format, that's easy to fix:

$ bedmap --range 100000 --echo --echo-map response-elements.bed <(gff2bed < genes.gff) > answer.bed

Or, for GTF-formatted annotations:

$ bedmap --range 100000 --echo --echo-map response-elements.bed <(gtf2bed < genes.gtf) > answer.bed
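
To get from the bedmap output to the one-gene-per-line table described in the question, the result still has to be flattened: with --echo --echo-map, each line holds one response element, a | character, and then the overlapping genes separated by ; characters. A minimal awk sketch follows, assuming both inputs are six-column BED (name in column 4, strand in column 6) and that no field contains | or ;. The function name flatten_bedmap and the output column order are illustrative choices, not part of BEDOPS. Note also that bedmap expects its inputs to be sorted with sort-bed.

```shell
# Hypothetical helper: turn bedmap "--echo --echo-map" output into one
# tab-delimited row per (gene, response element) pair.
# Assumes BED6 records: chrom, start, end, name, score, strand.
flatten_bedmap() {
  awk -F'|' 'BEGIN { OFS = "\t" }
  $2 != "" {                       # skip elements with no nearby gene
    split($1, re, "\t")            # response element fields
    n = split($2, genes, ";")      # one entry per mapped gene
    for (i = 1; i <= n; i++) {
      split(genes[i], g, "\t")     # gene fields
      # gene name, gene strand, gene start, element start, element strand
      print g[4], g[6], g[2], re[2], re[6]
    }
  }' "$@"
}

# Usage: flatten_bedmap answer.bed > genes-near-elements.txt
```

GFF- or GTF-derived records usually carry ; inside the attributes column, so this split would misfire on them; bedmap's --multidelim option can switch the gene separator to another character in that case.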

Awesome -- thank you, Alex.

written 7 days ago by rependo