Hello everyone,
I have a tab-delimited file, file1, in the following format:
g1  pfam    PF12    ABC transporter
g2  pfam    PF13    transcription factor
g3  pfam    PF14    glycosyl hydrolase
pfam    PF15    FAD binding domain  
g4  pfam    PF16    RTA1 like protein
pfam    PF17    Zinc-binding dehydrogenase  
pfam    PF18    major facilitator superfamily   
g5  pfam    PF19    short chain dehydrogenase
g6  pfam    PF20    ABC transporter
I want to rearrange this file so that every line beginning with pfam also includes the gene id (g3, g4, etc.) from the nearest preceding line that has one. The output file should also be tab delimited and look like this:
g1  pfam    PF12    ABC transporter
g2  pfam    PF13    transcription factor
g3  pfam    PF14    glycosyl hydrolase
g3  pfam    PF15    FAD binding domain
g4  pfam    PF16    RTA1 like protein
g4  pfam    PF17    Zinc-binding dehydrogenase
g4  pfam    PF18    major facilitator superfamily
g5  pfam    PF19    short chain dehydrogenase
g6  pfam    PF20    ABC transporter
Many thanks in advance.
Ambika
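For reference, here is a minimal Python sketch of the fill-down described in the question. It is not the awk one-liner discussed in the replies below. The function name fill_gene_ids is made up for illustration, and the sketch assumes the fields really are tab separated, that no gene id is literally "pfam", and that surrounding whitespace on a line can be discarded.

```python
def fill_gene_ids(lines):
    """Yield each line with the most recent gene id prepended when it is missing."""
    last_gene = None  # no gene id seen yet
    for raw in lines:
        line = raw.strip()  # drop the newline and any stray leading/trailing whitespace
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if fields[0] == "pfam" and last_gene is not None:
            # The gene id column is missing: reuse the one from the last complete line.
            fields.insert(0, last_gene)
        elif fields[0] != "pfam":
            # A complete line: remember its gene id for the lines that follow.
            last_gene = fields[0]
        yield "\t".join(fields)


# Small demonstration on three lines taken from the question.
sample = [
    "g3\tpfam\tPF14\tglycosyl hydrolase",
    "pfam\tPF15\tFAD binding domain  ",
    "g4\tpfam\tPF16\tRTA1 like protein",
]
for fixed in fill_gene_ids(sample):
    print(fixed)
```

To process the real file, the same function can be fed an open file handle for file1 and its results written to the output file. A pfam line that appears before any gene id has been seen is passed through unchanged.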
Nice one-liner. One question: you don't really need first="" in the BEGIN statement?
Thank you. I will try that.