Problem Fetching Sequences From A Bed File With Galaxy
11.1 years ago

I've made a .bed file for the ChIP-exo data for transcription factor Phd1 published by Rhee & Pugh (2011) Cell.

I'm trying to do something extremely simple: fetch the corresponding sequences from the S. cerevisiae genome (June 2008) using the Galaxy server. The first 11 lines (out of 967) of the .bed file are:

chr1    8       102
chr1    20874   20968
chr1    190042  190136
chr1    190477  190571
chr1    191398  191492
chr1    230073  230167
chr2    165355  165449
chr2    165685  165779
chr2    376055  376149
chr2    376173  376267
chr2    378953  379047

When I feed this into Galaxy, it tells me: "967 warnings, 1st is: Unable to fetch the sequence from '8' to '94' for chrom 'chr1'. Skipped 967 invalid lines, 1st is #1, "chr1 8 102""

I have no idea where this error is coming from. Why would it try to fetch a sequence from locations 8 to 94, when the first line specifies the locations 8 to 102? Bizarre.

Any ideas?

Thanks, Josh

bed galaxy • 3.2k views

Maybe the chromosome name in the FASTA file is something other than 'chr1'?
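One way to test this guess, if you have a local copy of the genome FASTA, is to compare the names in the first BED column against the FASTA header names. The following is only a rough sketch: the function names are made up for illustration, and it assumes the sequence name is the first whitespace-delimited token after `>`.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA headers (first token after '>')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def bed_names(lines):
    """Collect chromosome names from the first column of BED lines."""
    names = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def missing_chroms(bed_lines, fasta_lines):
    """Return BED chromosome names that do not appear in the FASTA."""
    return sorted(bed_names(bed_lines) - fasta_names(fasta_lines))


if __name__ == "__main__":
    # Toy data; with real files, pass open file handles instead of lists.
    bed = ["chr1\t8\t102", "chr2\t165355\t165449"]
    fasta = [">chrI", "ACGT", ">chrII", "ACGT"]
    print(missing_chroms(bed, fasta))  # prints ['chr1', 'chr2']
```

If every name in the BED file comes back as missing, a naming mismatch is the likely cause of the skipped lines.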


Good catch, though I must say that error message was as misleading as it gets. The OP should make sure that the extracted intervals do indeed span the right range!
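A quick sanity check along these lines: BED intervals are 0-based and half-open, so each fetched sequence should have length end - start. For the first line that is 102 - 8 = 94, which matches the second number in the message, so the message may be reporting a length rather than an end coordinate (that reading is a guess, not something confirmed from the tool). A minimal sketch, with a hypothetical `check_lengths` helper:

```python
def check_lengths(intervals, sequences):
    """Return intervals whose fetched sequence length differs from end - start.

    intervals: list of (chrom, start, end) tuples, BED-style (0-based, half-open)
    sequences: list of fetched sequences, in the same order as intervals
    """
    mismatches = []
    for (chrom, start, end), seq in zip(intervals, sequences):
        if len(seq) != end - start:
            mismatches.append((chrom, start, end, len(seq)))
    return mismatches


if __name__ == "__main__":
    intervals = [("chrI", 8, 102)]
    sequences = ["A" * 94]  # placeholder sequence of the expected length
    print(check_lengths(intervals, sequences))  # prints []
```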


You are right about that; I just wanted to see whether the OP had checked ;). By the way, is there a wiki or FAQ tag here, so that some sort of FAQ can eventually be produced?


That is a good idea. I have been thinking about the best way to create a series of posts that should be required reading and would solve recurring problems. It could be a single thread of posts tagged as faq.


Hi joshualevipayne:

I faced a similar problem. I'd like to ask: is a file with just three columns, like the one you posted, enough to get the result? Is there no need for a transcript ID, gene ID, or anything else?

11.1 years ago

Thanks seninp, that's it. The chromosomes should be specified with Roman numerals. All set now.
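For anyone hitting the same thing, the renaming can be scripted rather than done by hand. This is just a sketch under a few assumptions: the chromosome name is the first field and looks like `chr` followed by digits, the numbers are small (yeast has 16 nuclear chromosomes), and names such as `chrM` should be left alone. The helper names are made up for illustration.

```python
import re


def to_roman(n):
    """Convert a small positive integer (1-39) to a Roman numeral."""
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def _replace(match):
    n = int(match.group(1))
    # Leave the name unchanged if the number is outside the supported range.
    if n < 1 or n > 39:
        return match.group(0)
    return "chr" + to_roman(n)


def convert_bed_line(line):
    """Rewrite a leading 'chrN' as 'chr<Roman>', keeping the rest of the line."""
    return re.sub(r"^chr(\d+)(?=\s|$)", _replace, line)


if __name__ == "__main__":
    for line in ["chr1\t8\t102", "chr2\t165355\t165449", "chrM\t10\t20"]:
        print(convert_bed_line(line))
```

Only the first field is touched, so the original tab or space separators pass through unchanged.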
