I have the following two datasets:
ECM Proteomics Dataset
ECM Isoform Dataset
I would like to merge the TGE amino acid sequence with the peptide sequence, but the two datasets do not share a unique identifier for each row.
How can I create a unique identifier for each row in each dataset so that every amino acid sequence is merged with the correct peptide sequence?
For the ECM Proteomics Dataset, I did the following in pandas. This created an ID based on all of the listed columns (rows with identical values in those columns get the same ID):
ECM['id'] = ECM.groupby(['Gene.Symbol','Division','Category','PI','Protein.Name..name.of.reference.protein.', 'Protein.description','Sequence..TGE.amino.acid.seq.']).ngroup()
How can I assign exactly the same IDs to the matching rows in the other dataset?
Examples in R or Python would be greatly appreciated.
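To illustrate what I am trying to achieve, here is a minimal sketch on toy data. Everything in it is hypothetical: the shortened column names, the values, and above all the assumption that both datasets share a set of key columns (`key_cols`) that identifies a row. It numbers each key combination once across both tables, so the same combination gets the same `id` in each.

```python
import pandas as pd

# Assumption: these columns exist in BOTH datasets and together identify a row.
key_cols = ["Gene.Symbol", "Protein.Name"]

# Hypothetical stand-in for the ECM Proteomics Dataset.
ecm = pd.DataFrame({
    "Gene.Symbol": ["COL1A1", "FN1", "COL1A1"],
    "Protein.Name": ["P02452", "P02751", "P02452"],
    "Sequence": ["MFSFVD", "MLRGPG", "MFSFVD"],
})

# Hypothetical stand-in for the ECM Isoform Dataset.
iso = pd.DataFrame({
    "Gene.Symbol": ["FN1", "COL1A1", "LAMA1"],
    "Protein.Name": ["P02751", "P02452", "P25391"],
    "Peptide": ["GPGPGL", "FVDLR", "AAAK"],
})

# Collect every key combination seen in either table and number each one once.
keys = (
    pd.concat([ecm[key_cols], iso[key_cols]])
    .drop_duplicates()
    .reset_index(drop=True)
)
keys["id"] = keys.index

# Attach the shared id to both tables; a left merge keeps the original row order.
ecm = ecm.merge(keys, on=key_cols, how="left")
iso = iso.merge(keys, on=key_cols, how="left")

# Rows with the same id can now be joined.
merged = ecm.merge(iso[["id", "Peptide"]], on="id", how="inner")
print(merged)
```

If the key columns really are shared, the intermediate `id` is arguably optional, since `ecm.merge(iso, on=key_cols)` would join on the same columns directly.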