Data management in R, Perl or Python
12 months ago
MSRS ▴ 510

Hi, thank you for answering my problem.

My data is in the format below:

A1P
D4M
N6G
A1F
D4S
N6L
A1C


The output should be:

A1P/F/C
D4M/S
N6G/L


Is there an R package or code available for this? Perl or Python code would also be great. Thank you very much. Sorry for wasting your valuable time.

R perl python • 436 views

They want the suffix after the first 2 characters collected and appended. This is a basic programming question, for which they should show at least some effort in some language.


English alphabet! Basically, it will be used for formatting amino acid (single-letter code) and nucleotide data.


I'd look out for residue positions >9 - that will result in total length being >3, and scripts below that don't account for it will fail. JC's solution will work best in that case.


yeah, I was thinking the OP could have a position >9 in the inputs
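
To illustrate the point about positions >9, here is a minimal Python sketch that splits each record with a regular expression instead of fixed slicing. It assumes every record is one letter, then one or more digits, then one letter; the names `RECORD` and `collapse`, and the extra `K12` records, are made up for this example and are not from the original data.

```python
import re
from collections import defaultdict

# Assumed record layout: one leading letter, one or more position digits,
# one trailing letter (e.g. A1P or K12R).
RECORD = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def collapse(lines):
    """Group trailing letters by their letter+position prefix, in first-seen order."""
    groups = defaultdict(list)
    for line in lines:
        match = RECORD.match(line.strip())
        if match is None:
            continue  # skip blank or malformed lines
        ref, pos, alt = match.groups()
        groups[ref + pos].append(alt)
    return [key + "/".join(alts) for key, alts in groups.items()]


# The K12 records are hypothetical, added only to show a two-digit position.
records = ["A1P", "D4M", "N6G", "A1F", "D4S", "N6L", "A1C", "K12R", "K12Q"]
print("\n".join(collapse(records)))
```

Because the digits are matched with `\d+`, the key length adapts to the position, so `K12R` and `K12Q` group under `K12` rather than being split at a fixed offset.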

12 months ago
JC 12k

Perl:

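#!/usr/bin/perl
use strict;
use warnings;

my %data;
while (<>) {
    chomp;
    # $1 = letter plus position digits (the key), $2 = trailing letter
    if (m/(\w\d+)(\w)/) {
        my $key = $1;
        my $new = $2;
        if (defined $data{$key}) {
            $data{$key} .= "/$new";
        }
        else {
            $data{$key} = $new;
        }
    }
}
# Note: each() returns hash entries in no guaranteed order
while (my ($key, $aa) = each %data) {
    print "$key$aa\n";
}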


Test:

$ perl comb.pl < list.txt
A1P/F/C
D4M/S
N6G/L

Thank you, JC. Excellent!

12 months ago

python solution

from collections import defaultdict
result = defaultdict(str)
for line in open("input.txt").readlines():
    line = line.strip()
    result[line[:2]] = "/".join([result[line[:2]],line[-1]])
with open("output.txt","a") as file:
    for first,second in result.items():
        file.write(first+second[1:]+"\n")

Thank you for sharing your scripts.

By the way, you don't need to bookmark every answer. You can bookmark the top level post, and that way you'll have access to all the answers.

Sorry for that. I will follow your instruction. Thank you very much for the correction.

Don't worry about it - it's not a "Don't do this", it's just "you don't need to". Our bookmarks section can get cluttered easily.

12 months ago

sed 's/^\(..\)/\1\t/' input.txt | datamash -t$'\t' -s -g 1 collapse 2
A1  P,F,C
D4  M,S
N6  G,L