Need to systematically identify genes within some distance of a response element in the genome
5.3 years ago
rependo ▴ 40

I'm trying to find a program or script that can be used to systematically return a list of genes in the genome that fall within some specified distance of a response element.

I have two files to work with:

1.) A list of predicted response elements in the genome, identified using PoSSuM-search, that includes each element's genomic coordinates
2.) The annotation file from that same genome, which includes genomic coordinates for gene features

Ideally, I want to use the genomic coordinates of the response elements in file (1) to pull out any gene in the annotation file that falls within a pre-specified distance of a response element (e.g., 100 kb).

Thank you!

To moderators: this same question was cross-posted on ResearchGate: https://www.researchgate.net/post/Recommended_programs_to_systematically_identify_genes_within_some_distance_of_a_response_element_in_the_genome

RNA-Seq genome gene • 1.0k views

Have a look at bedtools slop to define windows around a set of coordinates (here, the response elements) and bedtools intersect to intersect those windows with the genes. Can you give an example of what the output should look like?
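
The windowing idea can be tried out without installing anything. The following is a minimal sketch in awk, assuming the response elements are in a tab-delimited BED file; the file names and coordinates are made up for illustration. It pads each interval by 100 kb on both sides and clamps the start at 0. Unlike bedtools slop, it does not clamp the end at the chromosome length, because that needs a chromosome sizes file.

```shell
# Hypothetical response elements (made-up coordinates), tab-delimited BED.
printf 'chr1\t50000\t50015\tRE1\nchr1\t250000\t250015\tRE2\n' > response-elements.bed

pad=100000   # window half-width in bases

# Widen every interval by $pad on each side; never let the start go below 0.
awk -v pad="$pad" 'BEGIN { FS = OFS = "\t" }
{
  start = $2 - pad
  if (start < 0) start = 0
  $2 = start
  $3 = $3 + pad
  print
}' response-elements.bed > windows.bed
```

With bedtools installed, the padded windows could then be intersected with the genes, for example with `bedtools intersect -a genes.bed -b windows.bed -wa -wb`, where genes.bed is assumed to hold the gene coordinates.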


I'm figuring it out as I go, but ideally the output would be a .txt, with columns defined as:

1.) Gene (one per line): all genes located within 100 kb of a response element identified in the input
2.) Strand
3.) Gene start position
4.) Gene strand
5.) Response element start position
6.) Response element strand

5.3 years ago

Via BEDOPS bedmap:

$ bedmap --range 100000 --echo --echo-map response-elements.bed genes.bed > answer.bed
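
Note that bedmap expects both inputs to be sorted. BEDOPS ships sort-bed for this; the sketch below uses plain sort as a stand-in, assuming a tab-delimited BED file. The unsorted file name and the coordinates are made up for illustration.

```shell
# Hypothetical unsorted input with made-up coordinates.
printf 'chr2\t500\t515\tRE2\nchr1\t9000\t9015\tRE3\nchr1\t1000\t1015\tRE1\n' \
  > response-elements.unsorted.bed

# Byte-wise order on the chromosome name, then numeric start and end.
LC_ALL=C sort -k1,1 -k2,2n -k3,3n response-elements.unsorted.bed \
  > response-elements.bed
```

The same step would apply to genes.bed if it was not produced by a tool that already sorts its output.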

If you don't have genes in BED format, but in GFF format, that's easy to fix:

$ bedmap --range 100000 --echo --echo-map response-elements.bed <(gff2bed < genes.gff) > answer.bed

Or, for GTF-formatted annotations:

$ bedmap --range 100000 --echo --echo-map response-elements.bed <(gtf2bed < genes.gtf) > answer.bed
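
With --echo and --echo-map, each output line holds one response element, then a "|", then the genes within range separated by ";". To get one gene per line, as asked for above, the result can be reshaped. The following is a minimal sketch, assuming both inputs are six-column BED (name in column 4, strand in column 6). The function name flatten_bedmap, the output file name, and the sample coordinates are hypothetical. The second output column is the chromosome, which is an assumption about what was wanted there.

```shell
# Hypothetical bedmap output (made-up coordinates): element|gene;gene
printf 'chr1\t1000\t1015\tRE1\t0\t+|chr1\t5000\t9000\tgeneA\t0\t-;chr1\t20000\t30000\tgeneB\t0\t+\n' > answer.bed
printf 'chr2\t500\t515\tRE2\t0\t-|\n' >> answer.bed   # element with no gene in range

flatten_bedmap() {
  awk 'BEGIN { OFS = "\t" }
  {
    n = split($0, halves, "[|]")           # element on the left, genes on the right
    if (n < 2 || halves[2] == "") next     # skip elements with no nearby gene
    split(halves[1], re, "\t")
    m = split(halves[2], genes, ";")
    for (i = 1; i <= m; i++) {
      split(genes[i], g, "\t")
      # gene name, chromosome, gene start, gene strand, element start, element strand
      print g[4], g[1], g[2], g[6], re[2], re[6]
    }
  }' "$1"
}

flatten_bedmap answer.bed > genes_near_elements.txt
```

One caveat: annotations converted from GFF or GTF carry an attributes column that usually contains ";" itself, which would confuse this split. Trimming the converted genes to their first six columns before running bedmap, or choosing a different separator with bedmap's --multidelim option, should avoid that.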

Awesome -- thank you, Alex.
