Reformat my txt file
4.0 years ago
tianshenbio ▴ 170

I have a txt file that looks like this:

A a,b,q
B d
D f,m

How can I convert it to:

A a
A b
A q
B d
D f
D m
RNA-Seq gene ontology • 1.1k views

How is this related to bioinformatics?

4.0 years ago
bruce.moran ▴ 960

There are lots of ways to do this; pick a language you would like to learn and then see if you can use it here.

I picked Perl because the Python course was full.

This is 'command line' Perl which is really useful for this kind of simple parsing:

perl -ane '@s=split(/\,/, $F[1]); foreach $k (@s){print "$F[0] $k\n";}' txt.txt

Hi,

Thank you for your reply. What if there is a space after each comma?

Please let us know how this question is related to bioinformatics or the post will be closed.

I think it's a toy example, and the A, B, etc. are only indicative. The user has some genuine bioinformatics questions.

Fair enough if you want to close it.

I'm sure that is the case, but OP needs to add details on how this is related to bioinformatics for one simple reason - this might be a known or intermediate file format that others might encounter, and they may have the same question. Without context, there is no way they can locate this post and use your answer. In essence, OP's post as it is right now only has value to them and not to the community at large, and we do not encourage such posts.

If you have this:

A b, c, d, e
B f, g, h,

then this works:

awk -F',| ' '{for (i=2;i<NF+1;i++){if (length($i) > 0){print $1,$i}}}' txt.txt

It even solves this:

A b,c, d,e,f, g,h, i, o
B h,d, y,u, i, o,
C h  f d,d g      k    l

Even if there are multiple spaces or commas between two letters, the output is the same.
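For anyone who wants to try the idea without creating txt.txt first, here is a minimal sketch of the same approach wrapped in a shell function. The name split_pairs is made up for illustration, and it assumes every line starts with the key (no leading whitespace) and that commas, spaces, and tabs are the only separators. It uses a bracket-expression field separator so that runs of separators collapse into one, which is a small variation on the command above rather than the same command.

```shell
#!/bin/sh
# split_pairs (hypothetical helper name): reads lines such as "KEY v1, v2,v3"
# on stdin and prints one "KEY value" pair per line.
# Assumption: the key is the first field and the line has no leading whitespace.
split_pairs() {
    awk -F'[, \t]+' '{
        # Fields 2..NF are the values; a trailing comma leaves an empty
        # last field, so skip anything empty.
        for (i = 2; i <= NF; i++)
            if ($i != "") print $1, $i
    }'
}

# Demo on the example from the question, plus a space after a comma
# and a trailing comma.
printf 'A a,b,q\nB d\nD f, m,\n' | split_pairs
```

To run it on a real file, pipe the file in instead: split_pairs < txt.txt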

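Changing the split pattern alone is not enough: -a splits each line on whitespace, so with a space after each comma $F[1] holds only the first item. Rejoin the remaining fields and allow optional whitespace after the comma. Change

@s=split(/\,/, $F[1])

to

@s=split(/,\s*/, join(" ", @F[1..$#F]))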

Please do not answer questions that are unrelated to bioinformatics. We are not StackOverflow. If OP is not willing to show us how their question is related to bioinformatics, do not encourage them.
