Bedtools with name and full header
5.1 years ago
bharata1803 ▴ 520

Hello,

I have a BED file and a FASTA reference file, and I want to generate a FASTA file with the sequences for the intervals in the BED file. I have managed to do this, but I have one small problem with the headers. First, I use the command:

bedtools getfasta -fi ref.fa -bed in.bed -fo out.fa -name -s -fullHeader

and I get a result like this:

>GENENAME
CTGATGATAGATAG

Second, I use the command:

bedtools getfasta -fi ref.fa -bed in.bed -fo out.fa -s -fullHeader

and I get a result like this:

>1:13343-22332(+)
CTGATGATAGATAG

What I want is the complete header, like this:

>GENENAME range=1:13343-22332 5'pad=0 3'pad=0 strand=+ repeatMasking=none
CTGATGATAGATAG

I know I can write my own code to adjust the result, but perhaps someone here knows how to get bedtools to produce this header directly. The BED file is tab-separated, with columns as below:

Chromosome     Start    End   GeneName   Length   Strand

Thank you for any suggestions.

2.5 years ago
venu 6.8k

For future reference: modify your BED file as follows, then run your first bedtools command on the new file.

awk -F'\t' '{print $1 "\t" $2 "\t" $3 "\t" $4 " range=" $1 ":" $2 "-" $3 " " $5 " strand=" $6 " repeatMasking=none"}' foo.bed > new_foo.bed

As you can see, bedtools uses the 4th column as the FASTA header, so if you put the required information into the 4th column, separated by spaces, you get the header you need.
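
One caveat with the one-liner above: it writes only four columns, so the strand in column 6 is no longer in the file for -s to read. A minimal sketch of a variant that rewrites column 4 in place and keeps columns 5 and 6 is below. The function name bed_with_full_name is made up for illustration. The pad values are hard-coded to 0 on the assumption that no padding is added, and a column-title line is assumed to be any line whose start field is not an integer. The range uses the BED coordinates unchanged, matching the header in the question.

```shell
# Hypothetical helper: put the full header text into column 4 of a
# tab-separated BED file, keeping the remaining columns (including the
# strand in column 6) so that bedtools getfasta -s can still use them.
bed_with_full_name() {
    awk -F'\t' -v OFS='\t' '
        # Skip a column-title line such as "Chromosome  Start  End ..."
        $2 !~ /^[0-9]+$/ { next }
        {
            # \047 is an apostrophe; pad values of 0 are an assumption.
            $4 = $4 " range=" $1 ":" $2 "-" $3 \
                    " 5\047pad=0 3\047pad=0 strand=" $6 " repeatMasking=none"
            print
        }
    ' "$1"
}

# Intended use (not run here):
#   bed_with_full_name in.bed > named.bed
#   bedtools getfasta -fi ref.fa -bed named.bed -fo out.fa -name -s
```

Depending on the bedtools version, -name may add extra text (such as coordinates or the strand) to the header, so it is worth checking the first few lines of out.fa before relying on the format.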
