Transcription Start Site (Promoter Sequence) Tab-Delimited or CSV Files
10.4 years ago
akz99 • 0

Does anyone have tab-delimited or CSV files of transcription start sites, labeled by chromosome and with the gene names on each chromosome? I'm writing a program and have found promoter databases online, but a file would be much more helpful.

transcription promoter • 3.4k views
10.4 years ago

My Biostars answer to "Is There An Easy Way Of Getting Gene Symbols From Genomic Coordinates?" gives txStart and txEnd values for UCSC genes for hg18 within the specified range. You could modify this query for your build/organism of interest (e.g., hg19), your range of interest, or to look for TSSs for RefSeq or other gene tables.

Instead of running the command and looking at the results on standard output as my example showed, just redirect the output to a file:

$ mysql --user=genome ... | sort-bed - > result.bed

The file result.bed is a BED-formatted, tab-delimited text file, because I put the fields into that order in my query. The query output is not guaranteed to be sorted, so I pass it through BEDOPS sort-bed to be sure.

A BED file is a tab-delimited text file that is described in more detail on UCSC's web site.
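
Since txStart and txEnd are just the two ends of the transcript, the TSS itself depends on strand: it is txStart for genes on the + strand and txEnd for genes on the - strand. Here is a minimal Python sketch of turning a BED file into a CSV of TSSs. It assumes the file has six columns with strand in the sixth (my query may not output strand as written, so you would need to add it), and the function names and the example rows are made up for illustration:

```python
import csv
import io


def tss_rows(bed_lines):
    """Yield (chrom, tss, name, strand) tuples from BED6-style lines.

    Assumes columns: chrom, start, end, name, score, strand.
    """
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # no strand column, so the TSS cannot be determined
        chrom, start, end, name, _score, strand = fields[:6]
        if strand == "+":
            tss = int(start)      # BED starts are 0-based
        elif strand == "-":
            tss = int(end) - 1    # BED ends are exclusive
        else:
            continue              # unknown strand
        yield chrom, tss, name, strand


def write_tss_csv(bed_lines, out):
    """Write the TSS table as CSV to an open file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["chrom", "tss", "name", "strand"])
    writer.writerows(tss_rows(bed_lines))


# Hypothetical rows, not real annotations.
example = [
    "chr1\t1000\t2000\tgeneA\t0\t+\n",
    "chr1\t3000\t5000\tgeneB\t0\t-\n",
]
buf = io.StringIO()
write_tss_csv(example, buf)
print(buf.getvalue(), end="")
```

For a real file you would pass an open file handle, e.g. open("result.bed"), in place of the example list. The TSS values are reported as 0-based positions to match BED; add 1 if your program expects 1-based coordinates.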


I'm not familiar with the UCSC system. Would it be possible to post the specific table I should look in and the command for finding TSSs? Thanks very much.


It is in the answer I linked to (take a look at variables like kg.txStart etc. to see what table those refer to). Start there.

10.4 years ago

That information should be available through the UCSC Table Browser:

http://genome.ucsc.edu/cgi-bin/hgTables?command=start

For example, I think the BED files for RefSeq and similar gene tracks include coding coordinates as well as transcription start and stop sites.
