Question

How can I convert this variable-step bedGraph file into a 1 bp file?

1

5.8 years ago

1106518271 ▴ 60

Can I convert the first format into the second one using just basic shell processing, awk, or sed on Linux? Here is a toy example:

This is the kind of text file I have: four columns, where columns 2 and 3 define a range that is left-closed and right-open:
chr1 0 2 0
chr1 2 6 1.5
chr2 0 3 0
chr2 3 10 2.1

I want to convert it so that each position is described on its own line:
chr1 0 0
chr1 1 0
chr1 2 1.5
chr1 3 1.5
chr1 4 1.5
chr1 5 1.5
chr2 0 0
chr2 1 0
chr2 2 0
chr2 3 2.1
...
chr2 9 2.1

Does anyone have an idea how to solve this? Thanks!

atac-seq linux • 1.9k views

ADD COMMENT • link updated 5.8 years ago by cpad0112 21k • written 5.8 years ago by 1106518271 ▴ 60

0

Hello 1106518271!

It appears that your post has been cross-posted to another site: https://stackoverflow.com/questions/51259160

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 5.8 years ago by zx8754 11k

0

Sorry, I waited quite a long time and thought I wouldn't get an answer on that site. I will keep your kind reminder in mind.

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

score 4 · Accepted Answer · 2018-07-10

4

5.8 years ago

Pierre Lindenbaum 161k

awk '{B=int($2);E=int($3);for(i=B;i<E;++i) printf("%s\t%d\t%s\n",$1,i,$4);}' in.bed
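
An expanded, commented form of the same loop may be easier to adapt. This is only a sketch: it assumes whitespace-delimited input with integer coordinates, and the function name `expand_bedgraph` is made up for illustration.

```shell
# expand_bedgraph: hypothetical helper. Reads bedGraph lines on stdin and
# prints one "chrom<TAB>pos<TAB>value" line per base in [start, stop).
expand_bedgraph() {
  awk '{
    start = int($2)   # inclusive start of the range
    stop  = int($3)   # exclusive end of the range
    for (pos = start; pos < stop; ++pos)
      printf("%s\t%d\t%s\n", $1, pos, $4)
  }'
}

# Toy input from the question (space-delimited works too, because awk
# splits on any whitespace by default).
printf 'chr1 0 2 0\nchr1 2 6 1.5\n' | expand_bedgraph
```

Because the value column is printed with `%s`, it is passed through unchanged rather than being reformatted as a number.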

ADD COMMENT • link 5.8 years ago by Pierre Lindenbaum 161k

0

It has already finished running, and the output is exactly right!

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

score 4 · Accepted Answer · 2018-07-10

4

5.8 years ago

cpad0112 21k

Try:

awk -F "\t" -v OFS="\t" '{while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt

or

awk -F "\t" -v OFS="\t" '{for ($2=$2; $2<$3;$2++) {print $1,$2,$4}}' test.txt

output:

chr1    0   0
chr1    1   0
chr1    2   1.5
chr1    3   1.5
chr1    4   1.5
chr1    5   1.5
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1

ADD COMMENT • link 5.8 years ago by cpad0112 21k

0

Thanks for your flexible ideas!
If you are interested, could I ask a follow-up question? Converting the whole file to 1 bp resolution makes the result too large, so given this format:

chr1 0 2 0 
chr1 2 6 1.5 
chr2 0 3 0 
chr2 3 10 2.1
chr2 11 13 1.5

I would like to extract only the lines that overlap chr2 1-7, which would be:

chr2 0 3 0 
chr2 3 10 2.1

Could I use something like awk '/chr2/' | awk -F':' '/20080501[2-9]/' file?

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

1

To get an ad-hoc subset, you can use set operations in a Unix pipeline:

$ echo -e 'chr2\t1\t7' | bedops -e 1 foo.bed - | bedops --chop 1 - | bedmap --faster --echo --echo-map-id --delim "\t" - foo.bed | cut -f1,2,4 > answer.txt

The --faster option can be used here, as none of your example elements are nested.
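
If BEDOPS is not available, the overlap test itself can be sketched in plain awk. This assumes the query chr2 1-7 is treated as a half-open interval, like the ranges in the file; the helper name `overlap_region` and the variables `chrom`, `qs`, and `qe` are illustrative only.

```shell
# overlap_region CHROM START END: hypothetical helper. Keeps input lines
# whose half-open range [$2, $3) overlaps the half-open query [START, END).
overlap_region() {
  awk -v chrom="$1" -v qs="$2" -v qe="$3" \
    '$1 == chrom && ($2 + 0) < (qe + 0) && ($3 + 0) > (qs + 0)'
}

# Toy input from the follow-up question; prints the two chr2 lines that
# overlap the query and drops "chr2 11 13 1.5".
printf 'chr1 0 2 0\nchr1 2 6 1.5\nchr2 0 3 0\nchr2 3 10 2.1\nchr2 11 13 1.5\n' |
  overlap_region chr2 1 7
```

Two half-open intervals overlap exactly when each one starts before the other ends, which is what the two numeric comparisons check. Matching lines are printed unchanged, so the output stays in the compact range format.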

ADD REPLY • link 5.8 years ago by Alex Reynolds 35k

0

Thanks, seems well worth learning bedops!

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

1

@OP: You can shorten code:

$ awk -F "\t" -v OFS="\t" '/chr2/' test.txt
chr2    0   3   0
chr2    3   10  2.1

However, subsetting can be done in line as below:

$ awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1

ADD REPLY • link 5.8 years ago by cpad0112 21k

1

awk -F "\t" -v OFS="\t" '{while ($2 < $3) { print $1,$2,$4 > ($1 ".txt"); ++$2 }}' test.txt

This would create one file per chromosome (chr1.txt, chr2.txt) and each file will have corresponding data.

ADD REPLY • link 5.8 years ago by cpad0112 21k

0

I tried this on my large file, which contains chr1-chr22, chrX, and chrY. These commands also output the chr22 lines, not just chr2:

$ awk -F "\t" -v OFS="\t" '/chr2/' test.txt

awk '/chr2/'

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

0

Try awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt or awk '/chr2\t/' test.txt
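
The unanchored pattern /chr2/ matches any line that contains that substring, which is why names such as chr20 or chr22 slip through. A plain string comparison on the first column avoids regular expressions altogether. The sketch below assumes the chromosome name is in column 1 and relies on awk's default whitespace splitting, so it accepts both tab- and space-delimited files; the helper name `only_chrom` is hypothetical.

```shell
# only_chrom CHROM: hypothetical helper. Keeps lines whose first column is
# exactly CHROM (string equality, so "chr2" does not match "chr20").
only_chrom() {
  awk -v chrom="$1" '$1 == chrom'
}

# Mixed tab- and space-delimited toy lines; only the two chr2 lines remain.
printf 'chr2\t0\t3\t0\nchr20\t3\t10\t2.1\nchr22 3 10 2.1\nchr2 3 10 2.1\n' |
  only_chrom chr2
```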

ADD REPLY • link 5.8 years ago by cpad0112 21k

0

When I did that, I got no output.

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

1

$ awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt
chr2    0   3   0
chr2    3   10  2.1

another method:

$ awk '/chr2\t/' test.txt
chr2    0   3   0
chr2    3   10  2.1

input:

$ cat test.txt    # added chromosomes with double-digit names
chr1    0   2   0
chr1    2   6   1.5
chr10   2   6   1.5
chr11   2   6   1.5
chr2    0   3   0
chr2    3   10  2.1
chr20   3   10  2.1
chr21   3   10  2.1

ADD REPLY • link 5.8 years ago by cpad0112 21k

0

Both of the methods you mentioned run successfully!

awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt
awk '/chr2\t/' test.txt

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

0

For awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt, if the input file test.txt is

chr1 0 2 0   
chr1 2 6 1.5   
chr2 0 3 0   
chr2 3 10 2.1  
chr2 11 13 1.5

there is still no output; it is as if I pressed Enter and nothing was shown.

ADD REPLY • link 5.8 years ago by 1106518271 ▴ 60

0

Copy/pasted from your code:

$ awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1
chr20   3   2.1
chr20   4   2.1
chr20   5   2.1
chr20   6   2.1
chr20   7   2.1
chr20   8   2.1
chr20   9   2.1
chr21   3   2.1
chr21   4   2.1
chr21   5   2.1
chr21   6   2.1
chr21   7   2.1
chr21   8   2.1
chr21   9   2.1

ADD REPLY • link 5.8 years ago by cpad0112 21k

0

For chr2 only (copy/pasted from your code, but added \t after chr2):

$ awk -F "\t" -v OFS="\t" '/chr2\t/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1
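
If the command still prints nothing, one possible cause (an assumption, since the actual file is not shown) is that the file is space-delimited rather than tab-delimited. With -F "\t", the whole line then becomes $1, $2 and $3 are empty, and the loop never runs. A sketch that sidesteps this by using awk's default whitespace splitting, with the hypothetical helper name `expand_chrom`:

```shell
# expand_chrom CHROM: hypothetical helper. Expands only lines whose first
# column equals CHROM; accepts tabs or spaces (and trailing blanks) as input
# delimiters and writes tab-delimited output.
expand_chrom() {
  awk -v chrom="$1" -v OFS='\t' '$1 == chrom {
    stop = int($3)                      # exclusive end of the range
    for (pos = int($2); pos < stop; ++pos)
      print $1, pos, $4
  }'
}

# Space-delimited toy input with trailing blanks, as pasted in the thread.
printf 'chr1 0 2 0   \nchr2 0 3 0   \nchr2 11 13 1.5\n' | expand_chrom chr2
```

Another option under the same assumption is to convert the file to tab-delimited form first, after which the original tab-based commands apply unchanged.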

ADD REPLY • link 5.8 years ago by cpad0112 21k

score 3 · Accepted Answer · 2018-07-10

3

5.8 years ago

Alex Reynolds 35k

If you want to use set operations:

$ bedops --chop 1 foo.bed | bedmap --faster --echo --echo-map-id --delim "\t" - foo.bed | cut -f1,2,4 > answer.txt

ADD COMMENT • link 5.8 years ago by Alex Reynolds 35k