Question: Problem Fetching Sequences From A Bed File With Galaxy
6.5 years ago by
Zurich
joshualevipayne70 wrote:

I've made a .bed file for the ChIP-exo data for transcription factor Phd1 published by Rhee & Pugh (2011) Cell.

I'm trying to do something extremely simple: fetch the corresponding sequences from the S. cerevisiae genome (June 2008) using the Galaxy server. The first 11 lines (out of 967) of the .bed file are:

chr1    8       102
chr1    20874   20968
chr1    190042  190136
chr1    190477  190571
chr1    191398  191492
chr1    230073  230167
chr2    165355  165449
chr2    165685  165779
chr2    376055  376149
chr2    376173  376267
chr2    378953  379047

When I feed this into Galaxy, it reports: 967 warnings, 1st is: Unable to fetch the sequence from '8' to '94' for chrom 'chr1'. Skipped 967 invalid lines, 1st is #1, "chr1 8 102"

I have no idea where this error is coming from. Why would it try to fetch a sequence from locations 8 to 94, when the first line specifies the locations 8 to 102? Bizarre.

Any ideas?

Thanks, Josh

bed galaxy • 2.1k views
ADD COMMENTlink modified 5.3 years ago by orangehu80 • written 6.5 years ago by joshualevipayne70
1

maybe the chromosome name is different from 'chr1' in the FASTA file?

ADD REPLYlink written 6.5 years ago by Pavel Senin1.9k
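
A quick way to test this suggestion is to compare the chromosome names in the BED file against the sequence names in the reference FASTA. The sketch below is only illustrative: the function names are made up for this example, and it assumes you have a local copy of the genome FASTA, which Galaxy itself does not require.

```python
def bed_chroms(bed_lines):
    """Collect the chromosome names (first column) from BED lines."""
    skip = ("#", "track", "browser")
    return {
        line.split()[0]
        for line in bed_lines
        if line.strip() and not line.startswith(skip)
    }


def fasta_chroms(fasta_lines):
    """Collect sequence names from FASTA header lines (text after '>')."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }


def missing_chroms(bed_lines, fasta_lines):
    """Return BED chromosome names that have no matching FASTA record."""
    return sorted(bed_chroms(bed_lines) - fasta_chroms(fasta_lines))


# Toy data: BED uses Arabic numerals, FASTA uses Roman numerals (assumed names).
bed = ["chr1\t8\t102", "chr2\t165355\t165449"]
fasta = [">chrI", "ACGT", ">chrII", "ACGT"]
print(missing_chroms(bed, fasta))
```

Any name printed here is a chromosome the tool would not be able to find in the reference.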

good catch, though I must say that error message was as misleading as it gets. The OP should make sure that the intervals that are extracted do indeed span the right range!

ADD REPLYlink written 6.5 years ago by Istvan Albert ♦♦ 81k
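
On the puzzling '94' in the message: a short check of the intervals quoted in the question shows that every one of them spans 94 bases (end minus start). One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that the tool prints the start and the length rather than the start and the end.

```python
# The intervals quoted in the question, copied as plain text.
bed_text = """
chr1 8 102
chr1 20874 20968
chr1 190042 190136
chr1 190477 190571
chr1 191398 191492
chr1 230073 230167
chr2 165355 165449
chr2 165685 165779
chr2 376055 376149
chr2 376173 376267
chr2 378953 379047
"""

# Compute end - start for every interval and collect the distinct values.
lengths = set()
for line in bed_text.strip().splitlines():
    chrom, start, end = line.split()[:3]
    lengths.add(int(end) - int(start))

print(lengths)
```

The same loop can be pointed at the extracted sequences afterwards to confirm that each one has the expected length.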

you are right on that, I just wanted to see whether the OP had checked it ;). btw, is there a wiki or faq tag here, so that some sort of FAQ can eventually be produced?

ADD REPLYlink modified 6.5 years ago • written 6.5 years ago by Pavel Senin1.9k

that is a good idea, I have been thinking about the best way to create a series of posts that should be required reading and would solve recurring problems. It could be a single thread of posts tagged as faq.

ADD REPLYlink written 6.5 years ago by Istvan Albert ♦♦ 81k

Hi joshualevipayne:

  I faced a similar problem and would like to ask you: are just the three columns you show enough to get the result? Is there no need for a transcript ID, gene ID, or anything else?

ADD REPLYlink written 5.3 years ago by orangehu80
joshualevipayne70 wrote:

Thanks seninp, that's it. The chromosomes should be specified with Roman numerals. All set now.

ADD COMMENTlink written 6.5 years ago by joshualevipayne70
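
For anyone hitting the same error, the renaming can be scripted. This is a minimal sketch, assuming the reference uses names of the form chrI to chrXVI (the thread only says "Roman numerals", so the kept "chr" prefix is an assumption); the helper names are hypothetical.

```python
# Roman numerals for the 16 nuclear chromosomes of S. cerevisiae.
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII",
         "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "XVI"]


def to_roman_chrom(name):
    """Map 'chr1'..'chr16' to 'chrI'..'chrXVI'; leave other names unchanged."""
    prefix = "chr"
    number = name[len(prefix):]
    if name.startswith(prefix) and number.isdecimal():
        n = int(number)
        if 1 <= n <= len(ROMAN):
            return prefix + ROMAN[n - 1]
    return name


def convert_bed_line(line):
    """Rewrite the chromosome column of one BED line; output is tab-separated."""
    stripped = line.rstrip("\n")
    if not stripped.strip() or stripped.startswith(("#", "track", "browser")):
        return stripped  # keep blank, comment, and header lines as they are
    fields = stripped.split()
    fields[0] = to_roman_chrom(fields[0])
    return "\t".join(fields)


print(convert_bed_line("chr1    8       102"))
print(convert_bed_line("chr2    165355  165449"))
```

Names that do not match the pattern, such as a mitochondrial chromosome, pass through untouched, so check how the reference labels those before uploading.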