Hi,
I would like to convert operon names to gene names (and the reverse). I think this should be possible with a regex, but I'm not fluent enough with regexes to work it out.
Conventionally, operons are named like this:
genes            operon_name   strand
oneA,oneB,oneC   oneABC        +
oneA,oneB,oneC   oneCBA        -
oneA,oneB,twoD   oneAB-twoD    +
Occasionally operons are also written as "oneA-oneB-oneC" or "someID-someotherID".
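To make the convention concrete, here is a minimal sketch of both conversions. It is written in Python purely for illustration; the regular expression itself should carry over to R. The function names and the PART pattern are hypothetical, and the sketch rests on two assumptions that are not part of the convention above: every gene name is a lowercase prefix followed by a single uppercase letter, and on the "-" strand the whole gene order is reversed.

```python
import re

# Assumption: a name part is a lowercase prefix followed by one or more
# uppercase letters, e.g. "oneA" (a gene) or "oneABC" (a run of genes).
PART = re.compile(r"^([a-z]+)([A-Z]+)$")


def operon_to_genes(operon, strand="+"):
    """Expand an operon name such as 'oneAB-twoD' into a list of gene names."""
    genes = []
    for part in operon.split("-"):
        m = PART.match(part)
        if m:
            prefix, letters = m.groups()
            # 'oneABC' -> oneA, oneB, oneC
            genes.extend(prefix + letter for letter in letters)
        else:
            # Anything that does not fit the pattern is kept unchanged.
            genes.append(part)
    # Assumption: names on the '-' strand list the genes in reverse order.
    return genes[::-1] if strand == "-" else genes


def genes_to_operon(genes, strand="+"):
    """Collapse gene names (a list or a comma-separated string) into an operon name."""
    if isinstance(genes, str):
        genes = genes.split(",")
    ordered = genes[::-1] if strand == "-" else list(genes)
    groups = []  # each entry is [prefix or None, letters or the whole name]
    for gene in ordered:
        m = PART.match(gene)
        if m and groups and groups[-1][0] == m.group(1):
            # Same prefix as the previous gene: append the letter(s).
            groups[-1][1] += m.group(2)
        elif m:
            groups.append([m.group(1), m.group(2)])
        else:
            # Non-matching names form their own group and are never merged.
            groups.append([None, gene])
    return "-".join((prefix or "") + rest for prefix, rest in groups)
```

With these assumptions, operon_to_genes("oneCBA", "-") gives ['oneA', 'oneB', 'oneC'] and genes_to_operon("oneA,oneB,twoD") gives 'oneAB-twoD'. Identifiers that end in several capitals, such as "someID", would be wrongly expanded to "someI" and "someD", so those cases would still need manual handling.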
Any tips on how to get this to work, preferably in R? It doesn't have to handle every case, but it would help a lot if it reduced the amount of manual intervention.
Thanks a lot.