4.7 years ago
max_19
Hi all!
I'd like to convert Prodigal's standard protein translation output file into a tab-delimited file containing just a few key fields from each header line, as shown below:
Prodigal output:
>1234_1 # 3 # 506 # 1 # ID=4_1;partial=11;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.554
HVRPRKPRLLPHHRLDRLSFARNPLPTPEDTWQDVVFTDESKFNLFGSDGPKTVW
REPGPPTQDYHIIETVKYGGGSVMAWGAITSRGVGALVFIETTMDAKVFVEVLESGLNET
LEKKHLKVKDVILQQDNDPK
>5678_1 # 3 # 470 # 1 # ID=6_1;partial=10;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.472
SWNDRLEQATKAVNMSFHRGLRTSPYIFKHGYLPDLKCDAKHGKVRMSRDRLQ
AKHIRDRNYDYYTEKSIVKGKREITEEFPIGTPVAIFKRQ
Preferred output:
ID1 CDS_start CDS_end Strand ID2
1234_1 3 506 1 4_1
5678_1 3 470 1 6_1
Does anyone know a quick and easy way to accomplish this? I have tried a few things using "awk" but was not successful.
Thank you for any input.
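For illustration, here is a minimal Python sketch of the conversion described above. It assumes every header line starts with ">" and has its fields separated by " # ", exactly as in the two example records, and that the last field contains an "ID=" entry. The function names parse_header and prodigal_to_tsv are made up for this example and are not part of Prodigal.

```python
import re
import sys

# Column names taken from the preferred output shown in the post.
COLUMNS = ["ID1", "CDS_start", "CDS_end", "Strand", "ID2"]


def parse_header(line):
    """Split one Prodigal header into (ID1, start, end, strand, ID2).

    Assumes the '>name # start # end # strand # key=value;...' layout
    seen in the example records. Returns None if the line does not fit.
    """
    fields = line[1:].strip().split(" # ", 4)
    if len(fields) < 5:
        return None
    id1, start, end, strand, attrs = fields
    # Pull the value of the ID= entry out of the semicolon-separated attributes.
    match = re.search(r"(?:^|;)ID=([^;]+)", attrs)
    id2 = match.group(1) if match else ""
    return id1, start, end, strand, id2


def prodigal_to_tsv(lines):
    """Return tab-delimited rows (header row first) for all '>' lines."""
    rows = ["\t".join(COLUMNS)]
    for line in lines:
        if line.startswith(">"):
            parsed = parse_header(line)
            if parsed is not None:
                rows.append("\t".join(parsed))
    return rows


# Only runs when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        print("\n".join(prodigal_to_tsv(handle)))
```

Saved under a hypothetical name such as prodigal_to_tsv.py, it could be run as `python prodigal_to_tsv.py proteins.faa > genes.tsv`, where both file names are placeholders. Sequence lines are skipped, so wrapped protein sequences do not affect the result.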