Question: extracting strand information from BAM to BED
amitpande74 wrote (6 months ago):

Hi everyone,

I have used bedtools bamtobed to extract information. My file looks like this:

chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  + 
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  +

I am losing the strand information when I use the bedtools merge command to remove the repeated coordinates; I only get: chr10 1195932 1195979

What I would like is output like this:

chr10 1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41 +

for all the duplicated regions in one chromosome.

Can someone help?

Regards.

Tags: bam files, bedtools

I can't guarantee that this will work in all cases, but maybe try bedtools merge -i test.bed -c 4,5,6 -o distinct|perl -pe 's/,\S+//g'. It assumes the file in your example is called test.bed.

Comment written 6 months ago by jean.elbers
ATpoint (Germany) wrote (6 months ago):

A workaround could be to use the stranded option (-s) together with -o and -c:

cat test.bed 
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  +
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  +
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492 41  -
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:22230:17587    41  -

bedtools merge -s -i test.bed -c 6 -o first
chr10   1195932 1195977 +
chr10   1195932 1195977 -

Essentially, it merges strand-specifically and then prints the first value (-o first) of the 6th column (-c 6). This could be customized to also include the first element of $4 and $5 alongside $6 (for example with -c 4,5,6 -o first), pretty much as jean.elbers suggests above, but without summoning the Perl devil :-D
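If the duplicated records always share exactly the same start and end, as in the example, a rough alternative that avoids merging altogether is to keep the first record seen for each chromosome/start/end/strand combination. The sketch below uses plain awk rather than bedtools; the function name dedup_first and the read names read_a and read_b are made up for illustration. Unlike bedtools merge, it does not combine intervals that merely overlap.

```shell
# Keep only the first record for each (chrom, start, end, strand) combination.
# Assumes whitespace-separated 6-column BED on stdin, as written by bedtools bamtobed.
# Note: this only collapses identical intervals; overlapping ones are left alone.
dedup_first() {
    awk '!seen[$1, $2, $3, $6]++'
}

# Small demonstration with hypothetical read names
printf 'chr10\t1195932\t1195977\tread_a\t41\t+\nchr10\t1195932\t1195977\tread_b\t41\t+\nchr10\t1195932\t1195977\tread_a\t41\t-\n' | dedup_first
```

On a real file this would be called as dedup_first < test.bed, and the name and score columns of the first record are kept unchanged.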

It is also possible to merge without regard to strand and simply append all the information in $4, $5, and $6:

bedtools merge -i test.bed -c 4,5,6 -o collapse -delim "|"
chr10   1195932 1195977 NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587|NB551726:5:H2HT5BGXC:1:11101:11237:5492|NB551726:5:H2HT5BGXC:1:11101:22230:17587   41|41|41|41 +|+|-|-
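
If only one representative per merged region is wanted from the collapsed output, the "|"-delimited columns can be trimmed to their first entry afterwards. This is a minimal sketch in awk, assuming tab-separated output and the "|" delimiter used above; the function name first_of_collapsed is hypothetical. Keep in mind that after a merge without -s, the first strand is just the strand of the first record, not a property of the whole region.

```shell
# Reduce every collapsed column (4 onwards) to its first "|"-delimited entry.
# Assumes tab-separated input on stdin, e.g. the output of
# bedtools merge ... -o collapse -delim "|"
first_of_collapsed() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        for (i = 4; i <= NF; i++)
            sub(/[|].*/, "", $i)   # drop everything from the first "|" on
        print
    }'
}

# Small demonstration with hypothetical read names
printf 'chr10\t1195932\t1195977\tread_a|read_b\t41|41\t+|-\n' | first_of_collapsed
```

In practice the bedtools command above would be piped straight into first_of_collapsed.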