Practical guide to designing implants for pandas
A talk by Jan Pipek
Friday, 14 June, 15:40 in Club
Since version 0.23, the pandas library has allowed custom user types to be used for internal representation in series and data frames, by introducing the ExtensionArray and ExtensionDtype interfaces (in places where a NumPy array would otherwise be used). Version 0.24 takes this further by implementing all of its “exotic” types in terms of these interfaces.
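As a rough illustration of the point above, the following sketch checks that two of the built-in "exotic" types (categorical and nullable integer) are exposed through the extension interfaces, while a plain float column still reports a NumPy dtype. It assumes pandas 0.24 or newer; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype

# A categorical column: its dtype and backing array implement the interfaces.
categories = pd.Series(["a", "b", "a"], dtype="category")
cat_is_extension = (
    isinstance(categories.dtype, ExtensionDtype)
    and isinstance(categories.array, ExtensionArray)
)

# A nullable integer column (note the capital "I" in "Int64").
nullable_ints = pd.Series(pd.array([1, 2, None], dtype="Int64"))
int_is_extension = isinstance(nullable_ints.dtype, ExtensionDtype)

# A plain float column keeps an ordinary NumPy dtype.
floats = pd.Series([1.0, 2.0, 3.0])
float_is_numpy = isinstance(floats.dtype, np.dtype)
```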
This has two main use cases: to make effective use of a data storage library or proxy (like Apache Arrow in the fletcher project), and to capture more complex objects seamlessly in pandas columns (like IP addresses in cyberpandas or geometric objects in geopandas).
The talk will explore the possibilities of extension arrays and will gradually build towards a simple proof-of-concept custom column.
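To give a flavour of what such a proof of concept might look like, here is a minimal sketch of a custom column type. It is not the example built in the talk: the names CelsiusDtype and CelsiusArray are hypothetical, the values are simply stored in a float64 NumPy array, and only the core methods required by the interface are filled in, assuming a reasonably recent pandas.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype, take


class CelsiusDtype(ExtensionDtype):
    """Hypothetical dtype for temperatures in degrees Celsius."""

    name = "celsius"
    type = float        # Python type of a single element
    na_value = np.nan   # marker used for missing values

    @classmethod
    def construct_array_type(cls):
        return CelsiusArray


class CelsiusArray(ExtensionArray):
    """Hypothetical extension array backed by a float64 NumPy array."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype="float64")

    @classmethod
    def _from_sequence(cls, scalars, dtype=None, copy=False):
        return cls(scalars)

    @classmethod
    def _from_factorized(cls, values, original):
        return cls(values)

    @classmethod
    def _concat_same_type(cls, to_concat):
        return cls(np.concatenate([arr._data for arr in to_concat]))

    @property
    def dtype(self):
        return CelsiusDtype()

    @property
    def nbytes(self):
        return self._data.nbytes

    def __len__(self):
        return len(self._data)

    def __getitem__(self, item):
        # A single integer yields a scalar; anything else yields a new array.
        if isinstance(item, (int, np.integer)):
            return self._data[item]
        return type(self)(self._data[item])

    def isna(self):
        return np.isnan(self._data)

    def take(self, indices, allow_fill=False, fill_value=None):
        if allow_fill and fill_value is None:
            fill_value = self.dtype.na_value
        result = take(self._data, indices,
                      allow_fill=allow_fill, fill_value=fill_value)
        return type(self)(result)

    def copy(self):
        return type(self)(self._data.copy())


# The custom array can be used as the backing store of a Series.
temperatures = pd.Series(CelsiusArray([20.0, np.nan, 21.5]))
missing = temperatures.isna().tolist()
```

A fuller implementation would also add arithmetic, comparisons, and dtype registration; these are left out to keep the sketch short.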
Jan Pipek
I am a data scientist and engineer at Showmax, helping neural networks understand what happens in movies and building a video streaming platform for Africa. I only recently converted from Monte Carlo simulations in medical physics.
I've been using Python for more than ten years, with a strong inclination for data analysis and visualization (having written several useless and hopefully at least one useful library – physt), but also trying to enjoy the language in the broader sense.
I am both happy and fortunate to be one of the PyData Prague meetup organizers.
I also lead the workshop "The Data Trinity – Practical NumPy, pandas and Matplotlib".