How to write code to extract, e.g., 1 kb upstream of a desired sequence to obtain promoter sequences for analysis?
6.1 years ago
Monika515 • 0

I will be working with a specific human gene (let's call it X), for which I am to predict possible transcription factors. For this I want to work with the promoter sequences rather than the gene sequences themselves. I am familiar with how to extract sequences in Biopython, but I am rather new to coding and was wondering how I could easily extract the promoters for each of the genes I obtain from, e.g., a BLAST search, so that I can later do, e.g., phylogenetic footprinting.

I am at a loss as to where to begin. I will happily take any advice and any guidance to useful resources from which I can learn. The Biopython tutorial/cookbook is rather useful but I have not been able to easily locate an answer to my question.

sequence gene • 2.3k views

This post was deleted for some reason. @Monika515 - if Nicolas Rosewick's answer is useful, please accept their answer to mark the post as solved.

See: How to: Mark an answer as an accepted answer

6.1 years ago

If you have the gene ID, you can easily extract the upstream sequence using Ensembl BioMart: https://www.ensembl.org/biomart/martview

Choose ENSEMBL GENES / Human genes (if you work in human)

Then Filters > Gene > "Input external references ID list". Enter your gene ID of interest here and choose the matching ID type (Ensembl ID, gene symbol, etc.)

Then Attributes > Sequences > Flank (Gene)

Click on upstream flank and put the number of desired upstream bases to extract (e.g. 1000 for 1kb)

Click on Results button (top left, just under ENSEMBL logo)
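Since the question asks about doing this in code, the same steps can probably be scripted against BioMart's XML query interface. Below is a minimal sketch that only builds the query and the request URL. The dataset name `hsapiens_gene_ensembl`, the filter `upstream_flank`, and the attribute `gene_flank` are my assumptions about how the web form's choices are named internally, so check them against the XML that the BioMart page shows for your own query. The function names are made up for illustration, and the gene ID is just a placeholder.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumed endpoint for BioMart's XML query service.
BIOMART_URL = "https://www.ensembl.org/biomart/martservice"


def build_upstream_query(gene_ids, upstream_bp=1000):
    """Build a BioMart XML query asking for the upstream flank of each gene.

    Dataset, filter, and attribute names are assumptions; verify them
    against the XML generated by the BioMart web interface.
    """
    ids = ",".join(gene_ids)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="FASTA" header="0" '
        'uniqueRows="1" datasetConfigVersion="0.6">'
        '<Dataset name="hsapiens_gene_ensembl" interface="default">'
        # Which genes to fetch (Ensembl gene IDs, comma-separated).
        f"<Filter name=\"ensembl_gene_id\" value={quoteattr(ids)}/>"
        # How many bases upstream of the gene to return.
        f'<Filter name="upstream_flank" value="{int(upstream_bp)}"/>'
        # The flanking sequence itself, plus the ID for the FASTA header.
        '<Attribute name="gene_flank"/>'
        '<Attribute name="ensembl_gene_id"/>'
        "</Dataset></Query>"
    )


def build_request_url(gene_ids, upstream_bp=1000):
    """Return a GET URL that submits the query to the BioMart service."""
    query = build_upstream_query(gene_ids, upstream_bp)
    return BIOMART_URL + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Placeholder gene ID; replace with your own list.
    print(build_request_url(["ENSG00000139618"], upstream_bp=1000))
```

Opening the printed URL (or fetching it with `urllib.request.urlopen`) should return FASTA text if the names above are right. The result can then be parsed with Biopython's `SeqIO.parse(handle, "fasta")`.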


How can I extract promoter regions around TSSs (−1500 to +500) using Ensembl BioMart?
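I am not sure the BioMart flank options allow an upstream and a downstream flank in the same query. One possible workaround is to export the TSS position and strand for each transcript and compute the window yourself. The sketch below assumes 1-based inclusive coordinates, Ensembl-style strand values (`1` or `-1`), and a chromosome sequence already loaded as a plain string. The function names are hypothetical. Offsets are taken relative to the TSS position itself, so −1500 to +500 gives 2001 bases.

```python
# Complement table for reverse-strand genes (N is kept as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def promoter_window(tss, strand, upstream=1500, downstream=500):
    """Return 1-based inclusive (start, end) genomic coordinates of the
    window from -upstream to +downstream around a TSS.

    On the reverse strand, "upstream" means higher genomic coordinates.
    """
    if strand == 1:
        start, end = tss - upstream, tss + downstream
    elif strand == -1:
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be 1 or -1")
    # Do not run off the start of the chromosome.
    return max(1, start), end


def extract_promoter(chrom_seq, tss, strand, upstream=1500, downstream=500):
    """Slice the window out of a chromosome string, reading 5'->3' on the
    gene's own strand (reverse-complemented for strand == -1)."""
    start, end = promoter_window(tss, strand, upstream, downstream)
    seq = chrom_seq[start - 1:end]  # convert 1-based inclusive to a slice
    if strand == -1:
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

For example, `promoter_window(10000, 1)` gives `(8500, 10500)` and `promoter_window(10000, -1)` gives `(9500, 11500)`. With Biopython, the same slice on a `Seq` object followed by `.reverse_complement()` would do the job of the translate step.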
