Blast to bed for subjects
4 months ago
QLFblaireau ▴ 30

Hello,

I would like to convert BLAST tabular output to BED format. I am only interested in the subject ID, start and end, i.e. columns 2, 9 and 10, so it seemed to me that this simple awk would do:

awk '{print($2"\t"$9-1"\t"$10)}' file.blastn > file.bed

However, when attempting a bedtools intersect, I get an error:

bedtools intersect -a file.annotation.gff -b file.bed
Error: unable to open file or unable to determine types for file file.bed
- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the expected columns (e.g., cols 2 and 3 for BED).

I did check that the file is indeed TAB delimited. However, I realised that the difference between the end and start coordinates (BLAST columns 10 and 9) was sometimes negative. Example:

awk '{print $3 - $2}' AZF180.bed | head
-200
-214
-253


It's not clear to me how I should deal with that. Should I simply swap the columns so that rows with a negative difference become positive?

Thanks for the insight

conversion bed bedtools blast • 307 views

Hmmmmmmmmm!

Maybe you can try to include strand information.

4 months ago

I'll assume you have tabular BLAST output (if not, you should re-run BLAST and ask for tabular output).

In the tabular output, if the alignment is forward against reverse, the coordinates will indeed run from large to small (instead of small to large). To get valid BED format you will indeed need to swap those coordinates.

This can be achieved in the same awk command line you already used, by expanding it a bit (tip: use an if-condition).
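
For what it's worth, here is a minimal sketch of that if-condition. It assumes the default 12-column tab-separated BLAST output (-outfmt 6), where column 2 is the subject ID and columns 9 and 10 are the subject start and end. The function name blast_to_bed is just a hypothetical wrapper, and the extra name, score and strand columns are my own optional additions (following the strand suggestion above), not something bedtools needs for a plain intersect.

```shell
# Sketch only: blast_to_bed is a hypothetical helper name.
# Assumes 12-column tab-separated BLAST output (-outfmt 6),
# where $2 = subject ID, $9 = subject start, $10 = subject end.
blast_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        # Reverse-strand hits have start > end, so swap them
        if ($9 + 0 <= $10 + 0) { start = $9;  end = $10; strand = "+" }
        else                   { start = $10; end = $9;  strand = "-" }
        # BED is 0-based, half-open: subtract 1 from the start only.
        # Columns 4-6 (query ID, placeholder score 0, strand) are optional.
        print $2, start - 1, end, $1, 0, strand
    }' "$1"
}

# Usage: blast_to_bed file.blastn > file.bed
```

Keeping the strand in column 6 means the orientation of the hit is not lost when the coordinates are swapped. If you only want the three columns from the question, drop the last three fields from the print statement.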


Yes indeed, I swapped the columns when the difference was negative and bedtools stopped complaining. It kind of makes sense, but naively I thought bedtools was able to read from left to right and from right to left.