An Open Road to Swift Dataframe Scaling

September 15th, 2020

Data scientists spend 60% of their time cleaning and preprocessing data, turning dirty data into clear insights. Dataframe libraries such as pandas provide excellent tooling for data wrangling, yet pandas workflows become slower and harder to manage as the data grows. Alex Baden, Technical Director at OmniSci, and Devin Petersohn, Machine Learning Engineer at Intel, dive into the challenges and considerations of dataframe scaling. They explore how the Intel Modin / OmniSci solution, part of the Intel AI Analytics Toolkit, offers an open road to quick, transparent scaling across heterogeneous architectures. They also explain how the solution's integration with the rest of the Python ecosystem lets data scientists focus on extracting value from data rather than on provisioning and orchestrating resources.
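For readers who want to see what "transparent scaling" can look like in practice, here is a minimal sketch, not taken from the episode. It relies on the fact that Modin exposes a pandas-compatible API under `modin.pandas`, so the wrangling code itself does not change. The helper `load_dataframe_library`, the fallback to plain pandas, and the sample data are all assumptions made for illustration; the OmniSci backend discussed in the episode is not configured here.

```python
import importlib


def load_dataframe_library():
    """Hypothetical helper: return modin.pandas if usable, else plain pandas."""
    try:
        mpd = importlib.import_module("modin.pandas")
        # Build a tiny frame so a missing execution engine fails here, not later.
        mpd.DataFrame({"probe": [0]})
        return mpd
    except Exception:
        # Modin (or an engine for it) is not available; fall back to pandas.
        return importlib.import_module("pandas")


pd = load_dataframe_library()

# Invented sample data: inconsistent city names and missing values.
raw = pd.DataFrame(
    {
        "city": ["Austin", "austin ", None, "Boston"],
        "sales": [100.0, None, 50.0, 75.0],
    }
)

# Typical cleaning steps: drop rows without a city, normalize the text,
# and treat missing sales as zero.
cleaned = raw.dropna(subset=["city"]).copy()
cleaned["city"] = cleaned["city"].str.strip().str.title()
cleaned["sales"] = cleaned["sales"].fillna(0.0)

# Aggregate per city; the same call works with either library.
totals = cleaned.groupby("city")["sales"].sum()
print(totals)
```

The only line that depends on which library is in use is the import; everything after `pd = ...` is ordinary pandas-style code, which is the sense in which the scaling is meant to be transparent.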

Guests:
Alex Baden, Technical Director, OmniSci
Devin Petersohn, Machine Learning Engineer, Intel

To learn more:
Intel AI Analytics Toolkit (Beta)
Installing Intel AI Analytics Toolkit (Beta)
Modin Discourse
OmniSciDB
OmniSciDB Repo
oneAPI

Transcript: Read/Download the transcript.
 

Posted in: Audio Podcast, Code Together, Intel