A couple of practical questions about 'Machine Learning'.
My new self-assignment.
A couple of weeks ago I started researching in and around a service called kibernetika.ai. It lets you
organize and log the ‘training’ process of your models in a relatively straightforward way, then deploy a
microservice based on the newly trained model and make it available through an API (a bare-bones sketch
of that pattern follows this paragraph). I got a demo Kubernetes cluster there where I can (for the first
time) experiment simultaneously with multiple models on several frameworks, namely Tensorflow, PyTorch,
Spark/Pyspark/Toree, Intel’s BigDL and OpenVINO… and I can probably try to create my own Kubernetes pod
from my own Docker image too (haven’t tried it yet), but…! Here’s the problem: just as with ‘clouds’, it
takes a lot of time and effort to learn or re-learn (for the hundredth time) one of these so-called
‘frameworks’, if you know what I mean. And it’s even harder to experiment with several of them in
parallel, trying to figure out which one is best for your particular problem. The main tragedy, as always,
is that all these groups try to appropriate words and combinations of words and make them part of their
‘brand’. Each ‘company’ invents its own ‘monkey language’ and writes voluminous ‘documentation’ in it.
I find this, probably, the most annoying aspect of everyday work Today. Not that I hadn’t seen this
technique of ‘privatisation’ of words by the so-called ‘schools’ in Physics when I was young, but the
amount of this ‘inventiveness’ Today surpasses everything I’ve seen before by an order of magnitude
at least.
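To make the ‘train, then deploy as a microservice behind an API’ pattern concrete, here is a minimal
sketch of it in plain PyTorch plus the Python standard library. This is not kibernetika.ai’s actual API:
the toy model, the port and the JSON schema are my own assumptions here, only the general shape of the
workflow is the same.

# A minimal sketch of the 'train, then serve behind an API' workflow.
# Nothing below is kibernetika.ai-specific: the toy model, the port and
# the JSON schema are invented purely for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

import torch
import torch.nn as nn

# 1. 'Training': fit a tiny linear model on synthetic data (y = 2x + 1 plus noise).
torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 100).unsqueeze(1)
y = 2.0 * x + 1.0 + 0.05 * torch.randn_like(x)

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

model.eval()

# 2. 'Deployment': expose the trained model over a bare-bones JSON/HTTP API.
class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expects a body like {"x": 0.5}; the request path is ignored here.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        with torch.no_grad():
            value = model(torch.tensor([[float(payload["x"])]])).item()
        body = json.dumps({"y": value}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), PredictHandler).serve_forever()

Once this is running, a POST with a body like {"x": 0.5} to localhost:8080 returns a JSON prediction;
what a service like kibernetika.ai adds on top is the logging of the training runs and the packaging of
this serving step into a Kubernetes pod.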
The real problem of the day
The real problem of Today is, of course, the analytic generalization of the (purely experimental)
empirical knowledge obtained from all these massive amounts of data that are readily available and
transferable at almost no cost and with very little overhead. With physics it was simpler: the behavior
of relatively wide classes of systems could be described by a relatively simple set of equations for an
idealized (mathematical) model of the physical system, built out of components that could be idealized
separately too. Most aspects of these idealizations were/are pure postulates, like, for instance, the
‘material point’, or the point-sized ‘photon’ of Einstein’s speculation, or the ‘field’ that exerts its
action upon a point-sized ‘particle’, etc. etc.
This will not work with the ‘objects’ we have finally found a way to describe: language, the game of
chess as a whole, vision. It’s more or less obvious that the ‘trained’ models are exactly this kind of
empirical description. In a certain sense such a description is ‘meta’ with respect to any particular
instance of a process: a phrase from a book, or a particular game of chess with a particular position of
pieces on the board. The complexity of this meta-description is by now too high to express it in any
number of analytical formulas that could be further manipulated into other tautologies. Formulas worked
for particular instances of systems that were decomposable; these phenomena are ‘measured’ as a whole.
So, what do we do now?
Later.