Question: How to keep column 6 (normalized tag count) of a HOMER peaks.txt file after converting it with pos2bed.pl?
lumingflank1 (Australia) wrote, 4 weeks ago:

I use HOMER to call peaks, which gives me a peaks.txt file. I then use pos2bed.pl to convert peaks.txt to peaks.bed. However, column 6, which holds the normalized tag count (equivalent to RPKM, reflecting peak density), is lost in the conversion.

prakash (India) wrote, 4 weeks ago:

A simple combination of grep, cut, and awk can do the job.

grep -v "#" peak.txt |cut -f 1,2,3,4,6 | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak.bed


Thank you. I used the following code, and column 6 is now kept after bedtools intersect:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed
pos2bed.pl peaks.txt > peak2.bed
awk 'NR==FNR {h[$4] = $5; next} {print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"h[$4]}' peak1.bed peak2.bed >peaks.bed
chr11   117467921   117468098   chr11-2 1   +   86.8
chr17   39636555    39636732    chr17-2 1   +   85.6
chr2    231281278   231281455   chr2-2  1   +   83.3

One remaining question: does "#" stand for any pattern I can supply? I did not use it.

modified 4 weeks ago • written 4 weeks ago by lumingflank1
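
The two-file awk lookup in the reply above relies on the `NR==FNR` idiom: while the first file is being read, the tag count is stored by peak ID; for the second file, the stored value is appended to each row. A minimal sketch on toy data follows. The file contents are assumptions that reuse two rows from the sample output (the real peak1.bed and peak2.bed will have more rows), and the temporary directory is only there to keep the example self-contained.

```shell
# Toy stand-ins: peak1.bed has the peak ID in column 4 and the normalized
# tag count in column 5; peak2.bed mimics six-column pos2bed.pl output.
# The rows are illustrative, not real HOMER output.
tmpdir=$(mktemp -d)
printf 'chr11\t117467921\t117468098\tchr11-2\t86.8\n' >  "$tmpdir/peak1.bed"
printf 'chr17\t39636555\t39636732\tchr17-2\t85.6\n'   >> "$tmpdir/peak1.bed"
printf 'chr17\t39636555\t39636732\tchr17-2\t1\t+\n'   >  "$tmpdir/peak2.bed"
printf 'chr11\t117467921\t117468098\tchr11-2\t1\t+\n' >> "$tmpdir/peak2.bed"

# NR==FNR is true only while the first file is read: store count by peak ID.
# For the second file, print its six columns plus the stored count.
joined=$(awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { h[$4] = $5; next }
     { print $1, $2, $3, $4, $5, $6, h[$4] }' \
     "$tmpdir/peak1.bed" "$tmpdir/peak2.bed")

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Because the lookup is keyed on the peak ID, the two files do not need to be in the same order.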

One remaining question: does "#" stand for any pattern I can supply? I did not use it.

Yes, you can put any pattern inside the double quotes. In this case, the comment lines in the peak file (the ones containing "#") are not needed, so grep -v "#" is used to filter them out.

written 29 days ago by prakash
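
To make the pattern behaviour concrete, here is a small sketch of what `grep -v` does with two different patterns. The sample lines are hypothetical (in particular, the `peak#3` ID is invented to show the difference); the point is only that `"#"` matches the character anywhere in a line, while `"^#"` matches it only at the start.

```shell
# Hypothetical miniature peak file: one comment line, one ordinary peak row,
# and one row whose (invented) ID happens to contain "#".
sample=$(printf '#PeakID\tchr\tstart\tend\nchr11-2\tchr11\t117467921\t117468098\npeak#3\tchr2\t10\t20\n')

# -v inverts the match: keep only lines that do NOT contain "#" anywhere.
any=$(printf '%s\n' "$sample" | grep -v "#")

# Anchored variant: drop only lines that START with "#".
anchored=$(printf '%s\n' "$sample" | grep -v "^#")

printf 'unanchored:\n%s\nanchored:\n%s\n' "$any" "$anchored"
```

With the unanchored pattern the `peak#3` row is dropped along with the comment line; with the anchored pattern it is kept.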

Why do we have to remove the lines with #? They did not affect the intersect operation or its result. Even with HOMER's pos2bed.pl (peaks.txt > peaks.bed), the new .bed file keeps the lines with #.

written 20 days ago by lumingflank1

you can further shorten the code:

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

cut accepts a range of fields, and awk can apply a single output delimiter (OFS) to all columns. IMO, that much code is not necessary. Please try the following:

OP:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed

New code if you have lines with #:

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

New code if you do not have lines with #:

cut -f 2-4,1,6 peak.txt > peak1.bed
modified 29 days ago • written 29 days ago by cpad0112

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

Actually, with this code the column order will not change, because cut always writes fields in their original order no matter how they are listed after -f. So yes, the shorter code you mentioned, shown below, will serve the purpose.

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"
modified 29 days ago • written 29 days ago by prakash
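
The point about cut not reordering columns is easy to check on a toy row. This is just an illustrative sketch; the field names c1 to c6 are placeholders, not HOMER columns.

```shell
# One tab-separated row whose fields are named after their positions.
row=$(printf 'c1\tc2\tc3\tc4\tc5\tc6')

# cut ignores the order given to -f: selected fields come out in input order.
cut_out=$(printf '%s\n' "$row" | cut -f 2-4,1,6)

# awk prints fields in exactly the order they are listed.
awk_out=$(printf '%s\n' "$row" | awk 'BEGIN { FS = OFS = "\t" } { print $2, $3, $4, $1, $6 }')

printf 'cut: %s\nawk: %s\n' "$cut_out" "$awk_out"
```

So cut is fine for selecting columns, but the reordering into BED layout has to be done by awk (or a similar tool).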

Oops... I didn't notice that the 5th column is skipped.

$ cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

should be

$ cut -f 1-4,6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"
modified 28 days ago • written 29 days ago by cpad0112
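
For what it's worth, the filtering and the column reordering discussed in this thread can also be done in a single awk call, without grep or cut. The sketch below assumes the layout used above (ID, chromosome, start, end, strand, normalized tag count in columns 1 to 6); the header text and the single data row are a hypothetical miniature of peaks.txt, and coordinates are copied unchanged, as in the commands above.

```shell
# Hypothetical miniature peaks.txt: one "#" header line and one peak row.
peaks=$(printf '#PeakID\tchr\tstart\tend\tstrand\tNormalized Tag Count\nchr11-2\tchr11\t117467921\t117468098\t+\t86.8\n')

# FS is set to tab so that fields containing spaces are not split;
# !/^#/ skips comment lines; $6 is the normalized tag count.
bed=$(printf '%s\n' "$peaks" | awk 'BEGIN { FS = OFS = "\t" } !/^#/ { print $2, $3, $4, $1, $6 }')

printf '%s\n' "$bed"
```

On a real file the same awk program would be run as `awk '...' peaks.txt > peak1.bed`.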