How can I convert a variable-step bedGraph file into a 1 bp resolution file?
5.8 years ago
1106518271 ▴ 60

Can I convert the first format below into the second one using only basic shell tools such as awk or sed on Linux? Here is a toy example:

This is the kind of text file I have: four columns, where columns 2 and 3 define a range that is closed on the left and open on the right, and column 4 holds the value:
chr1 0 2 0
chr1 2 6 1.5
chr2 0 3 0
chr2 3 10 2.1

I want to convert it so that each position gets its own line:
chr1 0 0
chr1 1 0
chr1 2 1.5
chr1 3 1.5
chr1 4 1.5
chr1 5 1.5
chr2 0 0
chr2 1 0
chr2 2 0
chr2 3 2.1
...
chr2 9 2.1

Does anyone have an idea how to do this? Thanks!

atac-seq linux • 1.9k views

Hello 1106518271!

It appears that your post has been cross-posted to another site: https://stackoverflow.com/questions/51259160

This is typically not recommended as it runs the risk of annoying people in both communities.


Sorry, I waited quite a long time and thought I wouldn't get an answer on that site. I will keep your kind reminder in mind.

5.8 years ago
awk '{B=int($2);E=int($3);for(i=B;i<E;++i) printf("%s\t%d\t%s\n",$1,i,$4);}' in.bed
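
For readers less familiar with awk, the same loop can be spelled out with comments. This is only a sketch: it assumes whitespace-separated input (spaces or tabs) with integer coordinates, and the temporary directory and the file names in.bed and out.txt are placeholders, not anything from the original post.

```shell
# Toy input from the question, written to a scratch directory.
dir=$(mktemp -d)
printf 'chr1 0 2 0\nchr1 2 6 1.5\nchr2 0 3 0\nchr2 3 10 2.1\n' > "$dir/in.bed"

# awk's default field splitting accepts both spaces and tabs.
awk '{
    start = int($2)      # first position covered (inclusive)
    stop  = int($3)      # end of the range (exclusive)
    for (pos = start; pos < stop; ++pos)
        printf("%s\t%d\t%s\n", $1, pos, $4)   # one line per position
}' "$dir/in.bed" > "$dir/out.txt"

# Show the first few expanded lines.
head -n 3 "$dir/out.txt"
```

Because the range is half-open, an interval such as chr1 2 6 yields positions 2 through 5, which matches the expected output in the question.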

It has already finished running, and the output is exactly correct!

5.8 years ago

Try:

awk -F "\t" -v OFS="\t" '{while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt

or

awk -F "\t" -v OFS="\t" '{for ($2=$2; $2<$3;$2++) {print $1,$2,$4}}' test.txt

output:

chr1    0   0
chr1    1   0
chr1    2   1.5
chr1    3   1.5
chr1    4   1.5
chr1    5   1.5
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1

Thanks for your flexible ideas!
If you are interested, may I ask a follow-up? Converting the whole file to 1 bp resolution gives a result that is too large, so given this format:

chr1 0 2 0 
chr1 2 6 1.5 
chr2 0 3 0 
chr2 3 10 2.1
chr2 11 13 1.5

I would like to extract only the lines that overlap chr2 1-7, which would be:

chr2 0 3 0 
chr2 3 10 2.1

Can I use something like awk '/chr2/' | awk -F':' '/20080501[2-9]/' file?


To get an ad-hoc subset, you can use set operations in a Unix pipeline:

$ echo -e 'chr2\t1\t7' | bedops -e 1 foo.bed - | bedops --chop 1 - | bedmap --faster --echo --echo-map-id --delim "\t" - foo.bed | cut -f1,2,4 > answer.txt

The --faster option can be used here, as none of your example elements are nested.
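
If BEDOPS is not at hand, the overlap test on its own can be sketched in plain awk. Two half-open ranges [s1, e1) and [s2, e2) overlap when s1 < e2 and s2 < e1. The sketch assumes that the query chr2 1-7 is also half-open and that the file is whitespace-separated; the variable names chrom, qs, and qe are made up for illustration, and unlike the pipeline above it only selects intervals without chopping them to 1 bp.

```shell
# Toy input from the follow-up question, written to a scratch directory.
dir=$(mktemp -d)
printf 'chr1 0 2 0\nchr1 2 6 1.5\nchr2 0 3 0\nchr2 3 10 2.1\nchr2 11 13 1.5\n' > "$dir/foo.bed"

# Keep intervals on the query chromosome that overlap [qs, qe).
# Adding 0 forces a numeric comparison rather than a string one.
awk -v chrom=chr2 -v qs=1 -v qe=7 '
    $1 == chrom && ($2 + 0) < (qe + 0) && ($3 + 0) > (qs + 0)
' "$dir/foo.bed" > "$dir/subset.bed"

cat "$dir/subset.bed"
```

On the toy file this keeps chr2 0 3 0 and chr2 3 10 2.1 and drops chr2 11 13 1.5, which is the subset asked for above.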


Thanks, seems well worth learning bedops!


@OP: You can shorten the code:

$ awk -F "\t" -v OFS="\t" '/chr2/' test.txt
chr2    0   3   0
chr2    3   10  2.1

However, the subsetting can be done inline, in the same awk call:

$ awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1
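To write one file per chromosome instead:

awk -F "\t" -v OFS="\t" '{out = $1 ".txt"; while ($2 < $3) { print $1,$2,$4 > out; ++$2 }}' test.txt

This creates one file per chromosome (chr1.txt, chr2.txt, ...), each holding the data for that chromosome. The output file name is stored in a variable because an unparenthesized concatenation after > is ambiguous in some awk implementations.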


I tried this on my large file, which contains chr1-chr22, chrX, and chrY. These commands also output the chr22 lines, not just chr2:

$ awk -F "\t" -v OFS="\t" '/chr2/' test.txt

awk '/chr2/'

Try awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt (an anchored match on the first column) or awk '/chr2\t/' test.txt (chr2 followed by a tab).


When I do that, I get no output at all.

$ awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt
chr2    0   3   0
chr2    3   10  2.1

another method:

$ awk '/chr2\t/' test.txt
chr2    0   3   0
chr2    3   10  2.1

Input, with double-digit chromosomes added:

$ cat test.txt
chr1    0   2   0
chr1    2   6   1.5
chr10   2   6   1.5
chr11   2   6   1.5
chr2    0   3   0
chr2    3   10  2.1
chr20   3   10  2.1
chr21   3   10  2.1
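
An exact string comparison on the first field is another way to keep chr20 and chr21 out of the result, since == compares the whole field rather than looking for a substring. A minimal sketch, assuming whitespace-separated input; the temporary file stands in for test.txt and the output file name is a placeholder.

```shell
# A shortened copy of the test input, written to a scratch directory.
dir=$(mktemp -d)
printf 'chr1\t0\t2\t0\nchr2\t0\t3\t0\nchr2\t3\t10\t2.1\nchr20\t3\t10\t2.1\nchr21\t3\t10\t2.1\n' > "$dir/test.txt"

# /chr2/ is a substring match, so it also keeps chr20 and chr21.
awk '/chr2/' "$dir/test.txt" | wc -l

# $1 == "chr2" compares the whole first field, so only chr2 is kept.
awk '$1 == "chr2"' "$dir/test.txt" > "$dir/chr2_only.txt"
cat "$dir/chr2_only.txt"
```

The equality test does not depend on which whitespace character separates the columns, whereas '/chr2\t/' only works on tab-delimited files.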

Both of the methods you mentioned run successfully!

awk -F "\t" -v OFS="\t" '$1 ~ /^chr2$/' test.txt
awk '/chr2\t/' test.txt

For awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt, if the input file test.txt is

chr1 0 2 0   
chr1 2 6 1.5   
chr2 0 3 0   
chr2 3 10 2.1  
chr2 11 13 1.5

there is still no output: I press Enter and nothing is shown.


Copy/pasted from your code, run on my test.txt:

$ awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1
chr20   3   2.1
chr20   4   2.1
chr20   5   2.1
chr20   6   2.1
chr20   7   2.1
chr20   8   2.1
chr20   9   2.1
chr21   3   2.1
chr21   4   2.1
chr21   5   2.1
chr21   6   2.1
chr21   7   2.1
chr21   8   2.1
chr21   9   2.1

For chr2 only (copy/pasted from your code, but added \t after chr2):

$ awk -F "\t" -v OFS="\t" '/chr2\t/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' test.txt
chr2    0   0
chr2    1   0
chr2    2   0
chr2    3   2.1
chr2    4   2.1
chr2    5   2.1
chr2    6   2.1
chr2    7   2.1
chr2    8   2.1
chr2    9   2.1
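
One possible cause of the "no output" reports earlier in this sub-thread is the field separator. This is an assumption, based only on the sample file being shown with single spaces: with -F "\t", a line that contains no tab is treated as a single field, so $2 and $3 are empty and the while loop never runs. The sketch below reproduces that with a space-separated toy file and shows that awk's default whitespace splitting handles it; all file names are placeholders.

```shell
# Space-separated toy input, written to a scratch directory.
dir=$(mktemp -d)
printf 'chr2 0 3 0\nchr2 3 10 2.1\nchr20 3 10 2.1\n' > "$dir/spaces.txt"

# Tab as separator: the lines have no tabs, so $2 and $3 are empty
# and the loop condition is false from the start.
awk -F "\t" -v OFS="\t" '/chr2/ {while ($2 < $3) { print $1,$2,$4; ++$2 }}' \
    "$dir/spaces.txt" > "$dir/with_tab_fs.txt"

# Default separator: any run of spaces or tabs splits the fields.
awk -v OFS="\t" '$1 == "chr2" {for (i = int($2); i < int($3); ++i) print $1, i, $4}' \
    "$dir/spaces.txt" > "$dir/default_fs.txt"

# Compare how many lines each variant produced.
wc -l < "$dir/with_tab_fs.txt"
wc -l < "$dir/default_fs.txt"
```

If the real file is tab-delimited, the tab-separated commands above work as posted; checking with cat -A or od -c shows which separator a file actually uses.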
5.8 years ago

If you want to use set operations:

$ bedops --chop 1 foo.bed | bedmap --faster --echo --echo-map-id --delim "\t" - foo.bed | cut -f1,2,4 > answer.txt