Question: from txt to bed
3.2 years ago, dimitrischat wrote:

Hello again. I downloaded some files from GEO in .txt format, with regions written as chrx:11111-2222, in the mm8 genome version. I opened the txt file in Excel and copied the whole document (all the chrx entries), then pasted it into the box in UCSC liftOver and converted it from mm8 to mm9. I then get a BED file, but it is again in the format chrx:11111-2222. I know that in BED format the fields have to be separated. How do I change that new BED file into a usable one? I hope I make some sense.

chip-seq • 2.9k views

When you download the file from UCSC it should already be in tab separated BED format. Are you not able to use the file as is?

If you edited it in Excel, make sure you save it in "tab delimited text" format.

written 3.2 years ago by genomax

Quoting the question: "I downloaded from GEO".

The GEO supplementary data comes in a multitude of formats.

written 3.2 years ago by A. Domingues

3.2 years ago, A. Domingues (Dresden, Germany) wrote:

You got plenty of things mixed up:

  1. mm8 and mm9 are not formats. They are genome versions (assemblies), specifically of Mus musculus.
  2. The format chrx:11111-2222 is not BED, so you will need to convert it to chrx 11111 2222, with the three fields separated by tabs. I assume you don't know how to use the command line to do this? If you don't, use the Galaxy tool Convert delimiters to TAB.
  3. I am assuming nothing gets converted from mm8 -> mm9 because the file format is not correct, but I am not sure. In any case, convert the coordinates to BED first, and then do the mm8 -> mm9 conversion.

Edit:

Since you are learning how to use the command-line, say your file is file.txt.gz:

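## test (\t in the replacement is understood by GNU sed, not by every sed)
echo chrx:11111-2222 | sed 's/:/\t/' | sed 's/-/\t/'

# gz file
zcat file.txt.gz | sed 's/:/\t/g' | sed 's/-/\t/g' > file.bed

# uncompressed file (sed reads the file directly, no cat needed)
sed 's/:/\t/g' file.txt | sed 's/-/\t/g' > file.bed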

That should work.
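
A quick way to confirm that the conversion really produced tab-separated columns is to scan the output for lines that do not look like 3-column BED. The following is only a sketch: `check_bed` is a made-up helper name, and the rules it applies (exactly three tab-separated fields, integer start and end, start smaller than end) are a minimal assumption about what "usable" means here.

```shell
# Hypothetical helper: report lines that do not look like 3-column BED.
# Reads from the files given as arguments, or from stdin if there are none.
check_bed() {
    awk -F'\t' '
        # Flag wrong column counts, non-integer coordinates, or start >= end.
        NF != 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        # Exit status 1 if any line was flagged, 0 otherwise.
        END { exit bad + 0 }
    ' "$@"
}

# A well-formed line passes silently and the message is printed.
printf 'chr1\t4842133\t4842148\n' | check_bed && echo "looks like 3-column BED"
```

On a real file this would be run as `check_bed file.bed`. Note that the toy region chrx:11111-2222 would itself be flagged after conversion, because its start is larger than its end.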


Edit 2: apparently the sed that ships with OSX (BSD sed) does not treat \t as a tab the way GNU sed does. See the comments for solutions from StackOverflow.
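
One common workaround, sketched here rather than taken from the linked solutions, is to let the shell produce the tab character and pass it to sed literally, so that sed never has to interpret `\t` itself. The helper name `to_bed` is made up for this example.

```shell
# Store a literal tab; the escape is interpreted by printf, not by sed.
tab=$(printf '\t')

# Hypothetical helper: replace every ':' and '-' with a tab.
# Reads from the files given as arguments, or from stdin if there are none.
to_bed() {
    sed "s/[:-]/${tab}/g" "$@"
}

echo chr1:4842133-4842148 | to_bed
```

This should behave the same with GNU and BSD sed. It replaces every ':' and '-' on a line, so it is only safe when the file contains nothing but the region strings.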


Another note: please format your question; it is very hard to read and understand. If you make the job hard for the people helping, you are less likely to get an answer.


1. Yes, they are genome versions; I know, wrong usage of the word "format". Yes, I download .txt.gz files, but in UCSC liftOver you can insert chrx:1111-2222 by pasting all the regions (I think). 2. I know how to use the terminal / command line (I am just starting to learn). Is there a command for this? 3. Maybe; I am not sure about that either.

written 3.2 years ago by dimitrischat

See my edited answer. The answer assumes access to a Unix system (OSX or Linux).

written 3.2 years ago by A. Domingues

Now I get from this: chr1:4842133-4842148 -> this: chr1t4842133t4842148. The chr, start and stop should be in separate columns.

written 3.2 years ago by dimitrischat

Depending on your system, one of these solutions should work.

written 3.2 years ago by A. Domingues

Thanks a lot! Much appreciated!

written 3.2 years ago by dimitrischat

If this solution has solved your problem then go ahead and accept it (green check mark) to provide closure for this thread.

written 3.2 years ago by genomax

3.2 years ago, Alex Reynolds (Seattle, WA USA) wrote:

Using sed here is problematic because its handling of \t in the replacement isn't portable between the GNU and BSD versions. You might use awk instead:

$ awk -F"[:-]" 'BEGIN{ OFS="\t"; }{ print $1, $2, $3; }' in.txt > out.bed

For example:

$ echo chrx:11111-2222 | awk -F"[:-]" 'BEGIN{ OFS="\t"; }{ print $1, $2, $3; }'
chrx    11111   2222
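
One point the one-liners in this thread do not address: BED uses a 0-based start and an exclusive end, whereas chr:start-end position strings in the UCSC browser convention are 1-based and inclusive. Whether the GEO file follows that convention is an assumption that has to be checked against the series description. If it does, the start should be decreased by 1 during conversion. A minimal sketch, where the helper name `region_to_bed` and its optional offset argument are made up for illustration:

```shell
# Hypothetical helper: convert chr:start-end lines on stdin to 3-column BED.
# The optional first argument is the amount subtracted from the start
# (default 1, assuming 1-based inclusive input; pass 0 to keep the start as is).
region_to_bed() {
    awk -F'[:-]' -v offset="${1:-1}" '
        BEGIN { OFS = "\t" }
        # Skip blank or malformed lines with fewer than three fields.
        NF >= 3 { print $1, $2 - offset, $3 }
    '
}

echo chr1:4842133-4842148 | region_to_bed
```

Usage on a file would be `region_to_bed < in.txt > out.bed`, or `region_to_bed 0 < in.txt > out.bed` if the coordinates turn out to be 0-based already.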