Question: extracting strand information from BAM to BED
11 days ago, amitpande74 wrote:

Hi everyone,

I have used bedtools bamtobed to extract information. My file looks like this:

chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  + 
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  +

I lose the strand information when I run bedtools merge to remove the repeated coordinates. The output looks like this:

chr10 1195932 1195979

What I want is output like this:

chr10 1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41 +

for all the duplicated regions in one chromosome.

Can someone help?

Regards.


I can't guarantee that this will work in all cases, but maybe try bedtools merge -i test.bed -c 4,5,6 -o distinct | perl -pe 's/,\S+//g'. This assumes the file in your example is called test.bed.

written 11 days ago by jean.elbers
11 days ago, ATpoint (Germany) wrote:

A workaround could be to use the stranded option (-s) together with -o and -c:

cat test.bed 
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  +
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  +
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  -
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  -

bedtools merge -s -i test.bed -c 6 -o first
chr10   1195932 1195977 +
chr10   1195932 1195977 -

Essentially, it merges strand-specifically and then prints the first value (-o first) of the 6th column (-c 6). It could be customized to also include the first element of $4 and $5 (for example with -c 4,5,6 -o first, since a single operation is applied to every listed column), pretty much as jean.elbers suggests above, but without summoning the Perl devil :-D
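If the duplicates always share exactly the same start and end, as in the question, a plain awk filter is one possible alternative that keeps the full first record per position and strand. This is only a sketch and it is not equivalent to bedtools merge: it removes exact duplicates and does not combine overlapping or adjacent intervals. The variable name dedup is just for illustration.

```shell
# Recreate the four-line, tab-separated test.bed used in this answer.
printf 'chr10\t1195932\t1195977\t%s\t41\t%s\n' \
  'NB551726:5:H2HT5BGXC:1:11101:11237:5492' '+' \
  'NB551726:5:H2HT5BGXC:1:11101:22230:17587' '+' \
  'NB551726:5:H2HT5BGXC:1:11101:11237:5492' '-' \
  'NB551726:5:H2HT5BGXC:1:11101:22230:17587' '-' > test.bed

# Print a line only the first time its (chrom, start, end, strand) key is seen.
dedup=$(awk -F'\t' '!seen[$1, $2, $3, $6]++' test.bed)
printf '%s\n' "$dedup"
```

On this input it keeps one + record and one - record, each with its read name and score intact.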

It is also possible to merge without regard to strand and simply append all the values from $4, $5, and $6:

bedtools merge -i test.bed -c 4,5,6 -o collapse -delim "|"
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587|NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587   41|41|41|41 +|+|-|-
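The collapsed columns can be post-processed afterwards. As a rough sketch, the awk snippet below takes the collapsed line shown above and counts how many reads went into the merged interval and how many came from each strand. It assumes "|" was used as the delimiter and that the columns are tab-separated; the variable name summary is just for illustration.

```shell
# The collapsed line from the example above, rebuilt with tab separators.
collapsed=$(printf 'chr10\t1195932\t1195977\t%s\t%s\t%s\n' \
  'NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587|NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587' \
  '41|41|41|41' \
  '+|+|-|-')

# Split the strand column on "|" and count reads per strand.
# Output columns: chrom, start, end, total reads, plus-strand reads, minus-strand reads.
summary=$(printf '%s\n' "$collapsed" | awk -F'\t' -v OFS='\t' '{
  n = split($6, strands, "|")
  plus = 0; minus = 0
  for (i = 1; i <= n; i++) {
    if (strands[i] == "+") plus++
    else if (strands[i] == "-") minus++
  }
  print $1, $2, $3, n, plus, minus
}')
printf '%s\n' "$summary"
```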