Hi,
Does anyone have a script that converts rpsblast output from XML format to CSV format? This is a fragment of my XML result:
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|CDD|289286</Hit_id>
<Hit_def>pfam12505, DUF3712, Protein of unknown function (DUF3712). This domain family is found in eukaryotes, and is approximately 130 amino acids in length.</Hit_def>
<Hit_accession>289286</Hit_accession>
<Hit_len>124</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>93.4135</Hsp_bit-score>
<Hsp_score>233</Hsp_score>
<Hsp_evalue>9.11946e-22</Hsp_evalue>
<Hsp_query-from>846</Hsp_query-from>
<Hsp_query-to>970</Hsp_query-to>
<Hsp_hit-from>2</Hsp_hit-from>
<Hsp_hit-to>122</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>39</Hsp_identity>
<Hsp_positive>60</Hsp_positive>
<Hsp_gaps>4</Hsp_gaps>
<Hsp_align-len>125</Hsp_align-len>
<Hsp_qseq>PLGQIAMPNVSLAGDVGADLNIDAAFAVADVGHLTDFTTYLLTQPSFTWQIYGQNLAVSALGITVPGISILKNVVLDGMDGFKGLVKIESFDLPANDPAGGITLTLATSLTNPSSVGVALSQIGF</Hsp_qseq>
<Hsp_hseq>PFATVPLPGIKAAGN-GTTLVVDQTLDITDVDAFTDFAKALVFSESFTLSVKGKT-DLKLGGLPFSGVTLDKTVTLKGLNNLKG-FSITDFDLP-LPPADGINLVATATIPNPSVLTIELGNVTL</Hsp_hseq>
<Hsp_midline>P + +P + AG+ G L +D + DV TDF L+ SFT + G+ + G+ G+++ K V L G++ KG I FDLP PA GI L ++ NPS + + L + </Hsp_midline>
I would like a table in CSV in this form:
query id,subject id,% identity,alignment length,mismatches,gap opens,q. start,q. end,s. start,s. end,evalue,bit score,subject description
S89_g3,gnl|CDD|109488,43.59,39,22,0,247,285,6,44,3.98E-05,548,457,pfam00432: Prenyltrans: Prenyltransferase and squalene oxidase repeat.
With a table like that I can work in Excel.
Have you tried any of the regular BLAST XML to tabular conversion scripts? For example, https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/blastxml_to_tabular.py
Yes, but I need CSV or XLS format. With Pierre's script I'm almost succeeding.
Well, if you have a TSV, you can open it directly in Excel. Also, converting from TSV to CSV is relatively simple with a sed command:
sed 's/\t/,/g' file.tsv > file.csv
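One caveat with a plain tab-to-comma substitution: CDD descriptions such as the Hit_def in the fragment above ("pfam12505, DUF3712, Protein of unknown function ...") contain commas themselves, so the resulting rows would end up with a varying number of fields. A minimal sketch that avoids this with Python's standard csv module, which quotes any field containing a comma. The function name tsv_to_csv and the file names are only illustrative.

```python
import csv


def tsv_to_csv(tsv_path, csv_path):
    """Rewrite a tab-separated file as CSV, quoting fields that contain commas."""
    with open(tsv_path, newline="") as src, open(csv_path, "w", newline="") as dst:
        # QUOTE_NONE: treat quote characters in the TSV as ordinary text.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        # The default writer quotes only the fields that need it.
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(row)
```

Calling tsv_to_csv("file.tsv", "file.csv") should give a file that Excel splits into the same columns as the original TSV, with the description kept in a single cell.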
Actually, I have an rpsblast result in XML and I want it listed cleanly as a table in XLS, just as in the example in my first question. So I would like to convert the XML output to either TSV or CSV, which would make it easy to use in Excel. I have a script that works great on blastp output (using BLAST+), but it does not work on rpsblast output. I've tried to fix it, without success.
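I can't tell from the thread why the blastp script fails on the rpsblast file, so here is only a rough sketch of a direct XML to CSV conversion with Python's standard library, not a tested replacement for it. It relies on the Hit and Hsp tags shown in the fragment, and it assumes the file wraps them in the usual Iteration elements of BLAST+ XML output (Iteration_query-def, Iteration_query-ID). Three things are my own assumptions rather than something the XML states: the query id is taken as the first word of the query definition line, mismatches are computed as alignment length minus identities minus gaps, and gap opens are counted as runs of "-" in the two aligned sequences. The names hsp_rows and xml_to_csv are made up for this example.

```python
import csv
import re
import xml.etree.ElementTree as ET

HEADER = ["query id", "subject id", "% identity", "alignment length",
          "mismatches", "gap opens", "q. start", "q. end",
          "s. start", "s. end", "evalue", "bit score", "subject description"]


def hsp_rows(xml_source):
    """Yield one table row per HSP from a BLAST XML path or binary file object."""
    for _, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        # Assumption: query id = first word of the query definition line.
        query_def = elem.findtext("Iteration_query-def", default="").strip()
        query_id = (query_def.split()[0] if query_def
                    else elem.findtext("Iteration_query-ID", default=""))
        for hit in elem.iter("Hit"):
            for hsp in hit.iter("Hsp"):
                identity = int(hsp.findtext("Hsp_identity", default="0"))
                gaps = int(hsp.findtext("Hsp_gaps", default="0"))
                length = int(hsp.findtext("Hsp_align-len"))
                qseq = hsp.findtext("Hsp_qseq", default="")
                hseq = hsp.findtext("Hsp_hseq", default="")
                # Assumption: one gap open per run of '-' in either sequence.
                gap_opens = (len(re.findall(r"-+", qseq))
                             + len(re.findall(r"-+", hseq)))
                yield [query_id, hit.findtext("Hit_id"),
                       f"{100.0 * identity / length:.2f}", length,
                       length - identity - gaps, gap_opens,
                       hsp.findtext("Hsp_query-from"), hsp.findtext("Hsp_query-to"),
                       hsp.findtext("Hsp_hit-from"), hsp.findtext("Hsp_hit-to"),
                       hsp.findtext("Hsp_evalue"), hsp.findtext("Hsp_bit-score"),
                       hit.findtext("Hit_def")]
        elem.clear()  # free memory once a query's hits have been emitted


def xml_to_csv(xml_source, csv_path):
    """Write the header plus one CSV row per HSP."""
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)  # quotes descriptions that contain commas
        writer.writerow(HEADER)
        writer.writerows(hsp_rows(xml_source))
```

With hypothetical file names, xml_to_csv("rpsblast_result.xml", "rpsblast_result.csv") would write the table. Because the csv writer quotes the Hit_def text, the commas inside the CDD descriptions should not shift columns when the file is opened in Excel.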