Thank you for participating in Big Data Tech 2018! Access conference presentation decks here.
Tuesday, June 5 • 10:00am - 10:30am
Optimizing Pandas for Beginners

The Pandas library for Python has gained a lot of traction in recent years as a powerful alternative to R for analytics and data science needs. However, it continues to have something of a reputation for being "slow" - an issue that, most of the time, is a result of insufficient thought being put into optimizing the code appropriately. This talk will review some of the most common beginner pitfalls that can cause otherwise perfectly good Pandas code to grind to a screeching halt, and walk through a set of tips and tricks to avoid them. Using a series of examples, we will review the process for identifying the elements of the code that may be causing a slowdown, and discuss a series of optimizations: avoiding inefficient iterations, vectorizing functions on Pandas dataframes, and taking advantage of speed-ups offered by Cython.
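
As a rough illustration of the first two optimizations named above, the sketch below computes the same column three ways: a row-by-row loop with iterrows, a row-wise apply, and a vectorized expression. It is not taken from the talk itself. The dataframe, the column names (price, quantity), and the function names are hypothetical, and the timings depend on the machine and the Pandas version. Cython is not shown here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data for illustration only: 5,000 rows, two numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.uniform(1.0, 100.0, size=5_000),
    "quantity": rng.integers(1, 10, size=5_000),
})


def total_with_iterrows(frame):
    """Row-by-row loop: iterrows builds a new Series object for every row."""
    totals = []
    for _, row in frame.iterrows():
        totals.append(row["price"] * row["quantity"])
    return pd.Series(totals, index=frame.index)


def total_with_apply(frame):
    """Row-wise apply: no explicit loop, but still one Python call per row."""
    return frame.apply(lambda row: row["price"] * row["quantity"], axis=1)


def total_vectorized(frame):
    """Vectorized: one multiplication over whole columns, run in compiled code."""
    return frame["price"] * frame["quantity"]


# Time each approach to see which part of the code is the bottleneck.
for func in (total_with_iterrows, total_with_apply, total_vectorized):
    seconds = timeit.timeit(lambda: func(df), number=3) / 3
    print(f"{func.__name__}: {seconds:.5f} s per call")
```

All three functions return the same values; only the amount of per-row Python overhead differs. Timing each candidate on a realistic sample, as the loop at the end does, is one simple way to confirm where a slowdown comes from before rewriting anything.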

This talk is aimed at beginner and intermediate Pandas users who have some experience with basic Pandas functionality, but have an interest in making their data analysis scripts cleaner and faster. The audience should expect to walk away from the talk with a set of practical tips, tricks, and examples for writing highly efficient code for Pandas dataframes.

Sofia Heisler

Data Scientist, Mapbox
Sofia is a Data Scientist at Mapbox, where she helps develop algorithms for the company's navigation product. Before Mapbox, Sofia was a Lead Data Scientist at Upside Travel, where she built systems for optimizing inventory presentation and pricing, worked on predicting demand for...

Tuesday June 5, 2018 10:00am - 10:30am
P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431