PyCon CZ 23
15–17 September

Let’s not bother researchers with the infrastructure
A talk by Jiří Bajer

Friday 15 September 13:20 (30 minutes)

Nowadays, startups implement machine learning code in haste, focusing on the research part and using as little engineering as possible, just to ship an MVP. When successful, those MVPs quickly evolve into services where infrastructure code is mixed with ML algorithms, use cases are buried deep in implementation details, and concerns like consuming from a message broker, liveness probes or shutdown signal handling are re-implemented several times in slightly different ways.

Keeping such a service healthy in production costs the researchers a lot of time that would be better spent on the machine learning part. Let's see where to draw the boundary between domain-specific machine learning code and use cases on one side and domain-agnostic boilerplate on the other, using the Actor model to hide the infrastructure concerns from fellow researchers.
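As a rough illustration of that boundary, here is a minimal sketch that is not taken from the talk. It assumes a thread-based actor with a queue as its mailbox; the names Actor, ScoringActor and demo are hypothetical, and the "model" is just an average standing in for real ML code. The base class owns the mailbox, the worker thread and the shutdown, while the subclass contains only the domain logic.

```python
import queue
import threading


class Actor:
    """Domain-agnostic boilerplate: owns the mailbox, the thread and shutdown."""

    _STOP = object()  # sentinel message that asks the actor to finish

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def send(self, message):
        # Callers never touch the actor's state, they only enqueue messages.
        self._mailbox.put(message)

    def stop(self):
        # Messages queued before the sentinel are still processed.
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self.handle(message)

    def handle(self, message):
        raise NotImplementedError


class ScoringActor(Actor):
    """Domain-specific part: the only code a researcher would touch."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle(self, message):
        # Hypothetical stand-in for a model call: the mean of the features.
        self.results.append(sum(message) / len(message))


def demo():
    actor = ScoringActor()
    actor.start()
    actor.send([1.0, 2.0, 3.0])
    actor.send([10.0, 20.0])
    actor.stop()  # drains the mailbox, then joins the worker thread
    return actor.results
```

In a real service the send calls would come from a message broker consumer and stop would be triggered by a shutdown signal handler; both would live next to the base class, so the subclass stays free of infrastructure code.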

Since it is not all roses, we'll mention where the Actor model requires a bit of wrestling with existing libraries and frameworks (stdlib's HTTPServer, Prometheus client, gRPC, Alembic and Gunicorn).

What do you need to know to enjoy this talk

Python level

Medium knowledge: You use frameworks and third-party libraries.

About the topic

You use it or do it on a regular basis.

Jiří Bajer

I have spent the last 7 years working on machine learning SaaS products as a backend/platform Python engineer, helping startups pay off technical debt accumulated in the initial gold rush phase.

I am currently migrating my employer's production codebase to the third generation of a service boilerplate based on the Actor model. Each generation was used by a different company and included a different set of server and client technologies, with its own gotchas and headaches.

