9 months ago
deniselavezzari • 0
I'm working with a table like this:
and I need to access each of the "GenBank Accessions" numbers and compare it to the second table, "Info".
Then, for each row in the first table, I want to add the count of each species. For example, in the first row of table one I would have
Enterovirus A: 4 Enterovirus G: 2
How can I do that?
Thanks a lot!
What have you tried? Have you Googled anything?
I'm trying for-loops, but only the first element of each row actually gets compared to the second table.
Maybe I need to do the opposite, it might be easier...
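If only the first element of each row is being compared, the row probably needs to be split into individual accessions before the lookup. Here is a minimal sketch of that idea. Since the tables aren't shown, everything about the data is an assumption: that each row of the first table holds several comma-separated accessions in one cell, and that "Info" can be read as an accession-to-species mapping. The accession IDs, the `info` dict, and the `count_species` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the "Info" table: accession -> species.
info = {
    "ACC001": "Enterovirus A",
    "ACC002": "Enterovirus A",
    "ACC003": "Enterovirus A",
    "ACC004": "Enterovirus A",
    "ACC005": "Enterovirus G",
    "ACC006": "Enterovirus G",
}

# Hypothetical stand-in for the "GenBank Accessions" column of the first
# table: one string per row (assumed to be comma-separated).
rows = [
    "ACC001, ACC002, ACC003, ACC004, ACC005, ACC006",
    "ACC002, ACC999",
]


def count_species(accession_cell, info, sep=","):
    """Count species for every accession in one cell, not just the first."""
    accessions = [a.strip() for a in accession_cell.split(sep) if a.strip()]
    # Accessions missing from the Info table are counted as "unknown".
    return Counter(info.get(acc, "unknown") for acc in accessions)


counts_per_row = [count_species(cell, info) for cell in rows]

for i, counts in enumerate(counts_per_row, start=1):
    summary = " ".join(f"{species}: {n}" for species, n in counts.items())
    print(f"row {i} -> {summary}")
```

If your accessions are separated by something other than a comma, change `sep`. The same logic should carry over to data frames once the real column layout is known.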
I've undeleted your question as it has already received some feedback from a couple of users.
Your description of the data is minimal and confusing, so we can't really help with the code. Can you explain what you wish to achieve in plain terms such as :
Are these real tables or CSV data files? You can use the Python pandas library for this data merge. Create a dataframe for each table, loop through them, locate the required value in table 1, find the matching value in table 2, and so on. At the end, build a new CSV file with the values you found. In Machine Learning this is called data preprocessing or Exploratory Data Analysis (EDA). We do that all the time in any Machine Learning project. I hope this clarifies your task.
This does not answer OP's question - it's just a detour into how the genre of OP's question is common among ML projects. As such, I've moved it to a comment.
Data cleaning is common in any data analysis project, not just ML projects. Please stop pushing ML everywhere.