Question: Bedtools with name and full header
bharata1803 (Japan) wrote, 3.2 years ago:

Hello,

I have a BED file and a FASTA reference file, and I want to generate a FASTA file containing the sequences for the intervals in the BED file. This works, but I have one small problem with the headers. First, I use the command:

bedtools getfasta -fi ref.fa -bed in.bed -fo out.fa -name -s -fullHeader

And I get a result like this:

>GENENAME
CTGATGATAGATAG

Second, I use the command:

bedtools getfasta -fi ref.fa -bed in.bed -fo out.fa -s -fullHeader

I get a result like this:

>1:13343-22332(+)
CTGATGATAGATAG

What I want is the complete header, like below:

>GENENAME range=1:13343-22332 5'pad=0 3'pad=0 strand=+ repeatMasking=none
CTGATGATAGATAG

I know I can write my own code to adjust the result, but perhaps someone here knows how to get bedtools to produce this header directly. The BED file is a tab-separated file with the following columns:

Chromosome     Start    End   GeneName   Length   Strand

Thank you for any suggestions.

venu (Germany) wrote, 7 months ago:

For future reference,

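Modify your BED file as follows and then use your first bedtools command. Columns 5 and 6 are kept after the name so that -s can still read the strand from column 6, and the pad values are written as literal 0 because no padding is applied (\047 is awk's octal escape for a single quote):

awk -F'\t' -v OFS='\t' '{print $1, $2, $3, $4 " range=" $1 ":" $2 "-" $3 " 5\047pad=0 3\047pad=0 strand=" $6 " repeatMasking=none", $5, $6}' foo.bed > new_foo.bed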

As you can see, bedtools uses the 4th column as the FASTA header, so if you put the required information into the 4th column, separated by spaces, you get the header you need.
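
If awk quoting gets awkward, the same column-4 rewrite can be sketched in Python. This is only a minimal sketch under a few assumptions: the input has the six tab-separated columns listed in the question, a column-title row (if present) is recognised by its non-numeric Start field and skipped, and the pad values are hard-coded to 0 because no padding is applied. The function names `build_name` and `rewrite_bed` and the script name `add_header.py` are made up for illustration.

```python
import sys


def build_name(chrom, start, end, name, strand):
    """Return the descriptive text to store in BED column 4."""
    # Assumption: no padding is applied, so both pad values are literal 0.
    return (
        f"{name} range={chrom}:{start}-{end} "
        f"5'pad=0 3'pad=0 strand={strand} repeatMasking=none"
    )


def rewrite_bed(lines):
    """Yield BED lines whose 4th column carries the full header text.

    Assumes columns: chrom, start, end, name, length, strand.
    Columns 5 and 6 are kept so that -s can still read the strand.
    """
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 6 or not fields[1].isdigit():
            continue  # skip blank lines and a column-title row
        chrom, start, end, name, length, strand = fields[:6]
        new_name = build_name(chrom, start, end, name, strand)
        yield "\t".join([chrom, start, end, new_name, length, strand])


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python add_header.py in.bed > new.bed
    with open(sys.argv[1]) as handle:
        for out_line in rewrite_bed(handle):
            print(out_line)
```

The resulting file would then be passed to the first command (the one with -name). Depending on the bedtools version, -name may also append the coordinates to the header, so it is worth checking the first few lines of out.fa before relying on the format.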
