I recently released a 'no code' data manipulation tool for Windows and Mac. I am from a software engineering/physics background, so I didn't really think about bioinformatics as a possible use for the tool. However, I now have several customers using it to manipulate DNA and protein sequences, so it would be interesting to find out a bit more about this application area. A few questions, if you don't mind.
Do you use 'no-code tools' for data manipulation, such as joining, filtering, sorting, pivots etc? Or do you prefer a programming approach?
If you do use 'no code' tools, which ones and what do you like and dislike about them?
What file formats do you mainly use for storing and exchanging bioinformatics data? (Currently I support CSV, TSV, Excel, JSON and XML.)
Excel mangles gene names like Sept3 and Mar2. Smart bioinformaticists will never put anything into Excel if they can help it, for that reason.
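To make the gene-name problem concrete, here is a minimal sketch that flags symbols a spreadsheet might reinterpret as dates before a file is ever opened in one. The pattern and the function name `date_like_symbols` are my own illustration, not part of any tool mentioned here, and the month list is an assumption rather than an exhaustive rule for what Excel converts.

```python
import re

# Assumed heuristic: a month name or abbreviation followed by one or two digits
# (e.g. "Sept3", "Mar2") is at risk of being read as a date by a spreadsheet.
# This is an illustrative pattern, not a complete description of Excel's behavior.
DATE_LIKE = re.compile(
    r"^(jan|feb|mar|march|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)-?\d{1,2}$",
    re.IGNORECASE,
)

def date_like_symbols(symbols):
    """Return the gene symbols that look like a month followed by a day number."""
    return [s for s in symbols if DATE_LIKE.match(s.strip())]

genes = ["Sept3", "Mar2", "TP53", "BRCA1", "Dec1"]
flagged = date_like_symbols(genes)
print(flagged)  # ['Sept3', 'Mar2', 'Dec1']
```

A check like this only warns about risky symbols; the safer route is still to keep such columns as plain text end to end.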
I am all too familiar with how Excel mangles dates and numbers! ;0)
Never. It is not reproducible and typically does not scale well with large amounts of data in the gigabyte range.
Yep, something scripted which does not mess with gene names like Excel and company.
Excel is a horrible tool for manipulating data. No disagreement there!
How big are your typical datasets in terms of rows x columns (assuming it is tabular data)?
In fact, I do not even know, since I never store my single-cell data (sparse matrix formats) as plain text. For other, more standard datasets (where the raw data are gigabytes in size) it is something like 15,000 rows by < 100 columns; for other genomic applications it can also be 150,000 rows by < 100 columns. I personally would never edit any of it using an editor: a single tab or whitespace character being messed up can cause issues.
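As an illustration of how a single damaged delimiter shows up, here is a small sketch that streams a tab-separated file and reports rows whose field count differs from the header. The function name `inconsistent_rows` and the sample data are hypothetical; it uses only Python's standard `csv` module and assumes the first row is a header.

```python
import csv
import io

def inconsistent_rows(handle, delimiter="\t"):
    """Return (line_number, field_count) for rows that do not match the header width."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the number of physical lines read so far
            bad.append((reader.line_num, len(row)))
    return bad

# Hypothetical sample: the BRCA1 row has a space where a tab should be.
sample = "gene\tsample1\tsample2\nTP53\t1.0\t2.0\nBRCA1\t3.0 4.0\nMYC\t5.0\t6.0\n"
problems = inconsistent_rows(io.StringIO(sample))
print(problems)  # [(3, 2)]
```

Because the file is read row by row, the same check works on files far larger than memory, which matters at the row counts quoted above.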