Question

How to write a proper bed file to extract sequence?

0

6.8 years ago

saadleeshehreen ▴ 140

Hi, I manually created a BED file to extract sequences from a FASTA file, but bedtools getfasta fails with the error message below. How can I solve it?

-bash-4.2$ cat pAcr_extract.bed
PSE305_1 20001 20479
PSE305_1 20306 20479
PSE305_1 20001 20303
AZPAE14907_contig_18_1 20001 20479
AZPAE14907_contig_18_1 20001 20303
WH-SGI-V-07178_contig3_1 20001 20303
WH-SGI-V-07178_contig3_1 20306 20479
bash-4.2$ bedtools getfasta -fi pAcr_extract.fasta -bed pAcr_extract.bed  -fo pAcr_extract.fasta.out
 It looks as though you have less than 3 columns at line: 1.  Are you sure your files are tab-delimited?

bed file covertor extract_seq • 1.8k views

score 1 · Answer 1 · 2018-10-15

1

6.8 years ago

n,n ▴ 390

bedtools is complaining that your file is not tab-delimited. If you have awk in your terminal, try the following on your file so you don't have to recreate it manually with tabs, since I'm assuming it's a big file:

awk 'BEGIN{OFS="\t"} {print $1,$2,$3}' pAcr_extract.bed > pAcr_extract_tab.bed

Now try running bedtools again, passing the newly created file to the -bed option.
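
If you want to confirm the conversion worked before rerunning bedtools, something along these lines should do. It is only a sketch: the file names demo_spaces.bed and demo_tab.bed are made up for illustration, and the two sample rows are copied from the question.

```shell
# Hypothetical demo file with space-separated columns (same layout as in the question)
printf 'PSE305_1 20001 20479\nPSE305_1 20306 20479\n' > demo_spaces.bed

# Rewrite the first three columns using a tab as the output field separator
awk 'BEGIN{OFS="\t"} {print $1,$2,$3}' demo_spaces.bed > demo_tab.bed

# Count lines that have fewer than 3 tab-separated fields; 0 means the file looks fine
bad_lines=$(awk -F'\t' 'NF < 3' demo_tab.bed | wc -l | tr -d ' ')
echo "lines with fewer than 3 tab-separated columns: $bad_lines"
```

Running the same count on the original space-separated file should flag every line, which is what the bedtools error is pointing at.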

0

Here is a more general way to convert one or more spaces into a tab using sed.

$ sed 's/ \+/\t/g' input > output
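
To see what a conversion like this actually produces, a small check such as the one below may help. It is a sketch that assumes GNU sed, because \+ in the pattern and \t in the replacement are GNU extensions; the file names demo_input.bed and demo_output.bed are hypothetical.

```shell
# Hypothetical demo input: columns separated by one or more spaces
printf 'PSE305_1 20001   20479\n' > demo_input.bed

# Collapse each run of spaces into a single tab (GNU sed syntax)
sed 's/ \+/\t/g' demo_input.bed > demo_output.bed

# Print the line in unambiguous form: tabs show up as \t and the line end as $
sed -n 'l' demo_output.bed
```

If the separators still show up as plain spaces in that last output, the file is not tab-delimited and bedtools will keep rejecting it.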

fin swimmer

6.8 years ago by finswimmer 16k