Reformat my txt file
18 months ago
tianshenbio ▴ 120

I have a txt file that looks like this:

A a,b,q
B d
D f,m


How can I convert it to this:

A a
A b
A q
B d
D f
D m

RNA-Seq gene ontology • 422 views
0
Entering edit mode

How is this related to bioinformatics?

3
Entering edit mode
18 months ago
bruce.moran ▴ 880

There are lots of ways to do this. Pick a language you would like to learn and see whether you can use it here.

I picked Perl because the Python course was full.

This is a command-line Perl one-liner, which is really useful for this kind of simple parsing:

perl -ane '@s=split(/\,/, $F[1]); foreach $k (@s){print "$F[0] $k\n";}' txt.txt
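For anyone who would rather do this in Python, here is a minimal sketch of the same idea. The function name `expand_pairs` is made up for illustration, and it assumes each line is a key, whitespace, then a comma-separated list with no spaces inside it.

```python
def expand_pairs(lines):
    """Turn 'KEY v1,v2' lines into one (key, value) pair per value."""
    pairs = []
    for line in lines:
        fields = line.split()  # split on whitespace, like perl -a
        if len(fields) < 2:
            continue  # skip blank or key-only lines
        key, values = fields[0], fields[1]
        for value in values.split(","):
            pairs.append((key, value))
    return pairs


if __name__ == "__main__":
    sample = ["A a,b,q", "B d", "D f,m"]
    for key, value in expand_pairs(sample):
        print(key, value)
```

To run it on a real file, pass the open file handle instead of `sample`, since iterating over a file yields its lines.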

0
Entering edit mode

Hi,

Thank you for your reply. What if there is a space after each comma?

0
Entering edit mode

Please let us know how this question is related to bioinformatics or the post will be closed.

0
Entering edit mode

I think it's a toy example, and the A, B etc. are placeholders. The user has some genuine bioinformatics questions.

Fair enough if you want to close it.

0
Entering edit mode

I'm sure that is the case, but OP needs to add details on how this is related to bioinformatics for one simple reason - this might be a known or intermediate file format that others might encounter, and they may have the same question. Without context, there is no way they can locate this post and use your answer. In essence, OP's post as it is right now only has value to them and not to the community at large, and we do not encourage such posts.

0
Entering edit mode

If you have this

A b, c, d, e
B f, g, h,


Then

awk -F',| ' '{for (i=2;i<NF+1;i++){if (length($i) > 0){print $1,$i}}}' txt.txt

It even solves this:

A b,c, d,e,f, g,h, i, o
B h,d, y,u, i, o,
C h f d,d g k l

Even if you have multiple spaces or commas between two letters, it will produce the same output.

0
Entering edit mode

Put a space after the comma in split:

@s=split(/\,/,$F[1])


to

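@s=split(/\, /, $F[1])

Note that -a already splits each line on whitespace, so with a space after each comma only the first value ends up in $F[1]. Joining the remaining fields before splitting handles that:

@s=split(/,\s*/, join(" ", @F[1..$#F]))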
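If the input may have spaces after the commas, trailing commas, or repeated separators, a regex split over the whole line is one way to cope. This is only a sketch in Python: `expand_loose` is a hypothetical name, and it assumes the values themselves never contain commas or whitespace.

```python
import re


def expand_loose(lines):
    """Split each line on any run of commas/whitespace; first token is the key."""
    pairs = []
    for line in lines:
        # Drop empty tokens left by leading or trailing separators.
        tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
        if len(tokens) < 2:
            continue  # nothing to pair with the key
        key = tokens[0]
        pairs.extend((key, value) for value in tokens[1:])
    return pairs


if __name__ == "__main__":
    for key, value in expand_loose(["A b, c, d, e", "B f, g, h,"]):
        print(key, value)
```

Like the awk command, this treats both commas and spaces as separators, so a line such as `C h f d,d g k l` is also expanded into one pair per letter.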

0
Entering edit mode

Please do not answer questions that are unrelated to bioinformatics. We are not StackOverflow. If OP is not willing to show us how their question is related to bioinformatics, do not encourage them.