How to create a BED12 file defining UTR sequences
3.2 years ago

Hello,

I am running an experiment and need to build a BED12 file for some UTR sequences that I have. I ran BLAST on those sequences against the genome and used the hits to build a working BED6 file, like this:

19 20752377 20758767 ENSDARG00000062634_Kat2b_Tscan 0 +

15 20463447 20464774 ENSDARG00000002336_Dlc_seq01 0 -

However, the script I will use needs the information in the 11th and 12th fields of the BED12. So, I need to go from these UTR sequences to a BED12 file describing them.

My first approach was to look up these UTR coordinates (chrom, start, end) in the UCSC Table Browser and download a BED12 file for them. However, I cannot get a BED12 file that is limited to the coordinates I used as input. It either returns the information for the whole gene/isoform or, if I limit the output to the UTR, it does not include all the fields I need.

E.g.:

chr19 20724346 20755146 ENSDART00000090757.4 0 + 20724453 20752377 0 18 308,139,146,87,182,192,98,126,140,209,127,111,144,115,31,64,85,2963, 0,2755,3093,6826,7286,10788,11063,11242,14594,15533,17592,17815,19837,20058,21142,25091,25257,27837,

Does anyone have a suggestion on how to address this? I need this file to move on with my analysis, but I have been stuck here for a while now.

Thank you so much!

Gabriel

Sequences UTRs BED12 • 1.4k views

Hi Baldissera, I just changed your "Forum" post to a regular question post. The "Forum" tag should be reserved for very open questions for which there is no unique answer.


Concerning your question, why would you need UTRs formatted in BED12? BED12 is used to draw exon boundaries in genes. Usually (unless perhaps the UTR spans multiple exons), there is no need to further delimit UTRs in a genome browser. See this previous post about the use of BED12 files: The bed12 format


Hi Carlo, thank you for mentioning those posts; I had gone through them. For this experiment I looked inside the tool's code, and its functions are written to work with BED12 input: they extract information from the 11th and 12th fields of the file. I had not realized that and passed my BED6 file to the script, which fails with an error because those columns are missing. So I am now trying to make a BED12 file for these UTRs. I had considered creating the extra columns manually, but I am afraid of making a mistake.
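For the "create the columns manually" route, here is a minimal Python sketch that pads each BED6 line to BED12 by treating the interval as a single block. It rests on assumptions that are not confirmed anywhere in this thread: that each UTR is one contiguous stretch of the genome (no intron inside it), and that the downstream script accepts a one-block entry. The function name `bed6_to_bed12` is made up for illustration. Setting thickStart and thickEnd both to the start position is the usual way to say "no coding part".

```python
def bed6_to_bed12(line):
    """Pad one BED6 line to a single-block BED12 line (tab-separated)."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}: {line!r}")
    chrom, start, end, name, score, strand = fields[:6]
    size = int(end) - int(start)
    if size <= 0:
        raise ValueError(f"end must be greater than start: {line!r}")
    return "\t".join([
        chrom, start, end, name, score, strand,
        start, start,   # thickStart == thickEnd: no coding region (assumption)
        "0",            # itemRgb, unused here
        "1",            # blockCount: one contiguous block (assumption)
        f"{size},",     # field 11, blockSizes
        "0,",           # field 12, blockStarts, relative to chromStart
    ])


if __name__ == "__main__":
    bed6 = [
        "19 20752377 20758767 ENSDARG00000062634_Kat2b_Tscan 0 +",
        "15 20463447 20464774 ENSDARG00000002336_Dlc_seq01 0 -",
    ]
    for row in bed6:
        print(bed6_to_bed12(row))
```

For the two BED6 lines from the question this gives block sizes of 6390 and 1327 with a single block start of 0. If a BLAST hit actually covers a spliced UTR, a one-block entry would wrongly include the intron, so it is worth checking the hits against the annotation first.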


I see, thank you for clarifying. In fact, I don't know what a BED12 file for a UTR alone would look like, or even if it is a thing. In "normal" BED12 files, the CDS lies between thickStart and thickEnd, and the 3'UTR is the exonic region between the CDS end and the transcript end (thickEnd to chromEnd on the plus strand, chromStart to thickStart on the minus strand). So the information about the UTR coordinates is contained in the full-gene BED12 entry you got from the UCSC Table Browser. Perhaps you could try to use that in RESA and see how it goes? If the authors meant something else, then they might need to update their documentation, because it is not very clear.
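If the full-gene entry does not work as is, the 3'UTR can be cut out of it as its own BED12 record. The Python sketch below clips each exon block to the window between the CDS end and the transcript end, taking the strand into account. The function name `three_prime_utr_bed12` is hypothetical, and writing thickStart = thickEnd = start in the output is a choice made here to mark the record as non-coding; whether RESA expects exactly this layout is an assumption.

```python
def three_prime_utr_bed12(line):
    """Return the 3'UTR of a BED12 transcript as a BED12 line, or None."""
    f = line.split()
    if len(f) < 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    chrom, name, score, strand = f[0], f[3], f[4], f[5]
    tx_start, tx_end = int(f[1]), int(f[2])
    cds_start, cds_end = int(f[6]), int(f[7])
    sizes = [int(x) for x in f[10].split(",") if x]
    starts = [int(x) for x in f[11].split(",") if x]

    if cds_start == cds_end:
        return None  # non-coding transcript: no UTR to speak of

    # Genomic window that holds the 3'UTR, depending on strand
    if strand == "+":
        lo, hi = cds_end, tx_end
    else:
        lo, hi = tx_start, cds_start

    # Clip every exon block to that window and keep what is left
    pieces = []
    for rel_start, size in zip(starts, sizes):
        a = max(tx_start + rel_start, lo)
        b = min(tx_start + rel_start + size, hi)
        if a < b:
            pieces.append((a, b))
    if not pieces:
        return None

    utr_start, utr_end = pieces[0][0], pieces[-1][1]
    block_sizes = ",".join(str(b - a) for a, b in pieces) + ","
    block_starts = ",".join(str(a - utr_start) for a, _ in pieces) + ","
    return "\t".join([
        chrom, str(utr_start), str(utr_end), name, score, strand,
        str(utr_start), str(utr_start),  # no thick (coding) part in the output
        "0", str(len(pieces)), block_sizes, block_starts,
    ])
```

Applied to the ENSDART00000090757.4 entry from the question, this yields a single block from 20752377 to 20755146 (2769 bp). Note that the annotated transcript ends at 20755146, whereas the BLAST-derived interval in the BED6 file ends at 20758767, so the two will not match exactly.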
