Big Data Tech 2018
Tuesday, June 5 • 10:00am - 10:30am
Speeding up R with Multiple Cores and Clusters

There are many workloads in R that can be accelerated with the use of additional computing power. Some, like group-by analyses, simulations, and cross-validation of models, are "embarrassingly parallel" and lend themselves easily to the simultaneous use of multiple cores or machines in a cluster. Others, like statistical modeling and machine learning routines, require the use of specialized algorithms designed for distributed systems.

In this talk, I'll describe packages available in R to accelerate workloads with the use of multiple cores and clusters. In particular, I'll discuss the use of the "foreach" package for embarrassingly parallel problems, and the "sparklyr" package for more complex problems with large data sets stored in Spark.
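
As a rough illustration of the embarrassingly parallel case described above, here is a minimal sketch of a bootstrap simulation run with "foreach". It is not taken from the talk. It assumes the "doParallel" package as the parallel backend, a local cluster of two worker processes, and the built-in mtcars data set as stand-in data; all of these are choices made for this example only.

```r
library(foreach)
library(doParallel)

# Assumption: two local worker processes; adjust to the cores available.
cl <- makeCluster(2)
registerDoParallel(cl)

# Each iteration is independent of the others: draw a bootstrap resample
# of mtcars, fit a linear model, and return the slope for weight.
# .combine = c collects the per-iteration results into one numeric vector.
slopes <- foreach(i = 1:100, .combine = c) %dopar% {
  idx <- sample(nrow(mtcars), replace = TRUE)
  fit <- lm(mpg ~ wt, data = mtcars[idx, ])
  unname(coef(fit)["wt"])
}

# Always release the worker processes when done.
stopCluster(cl)

# Spread of the bootstrap slope estimates.
summary(slopes)
```

Because no iteration depends on another, the same loop runs sequentially if %dopar% is replaced with %do%, which is a convenient way to check results before scaling out. The resamples here are not seeded, so the estimates will differ from run to run.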

Speakers

David M Smith

Developer Advocate, Microsoft
David is a Cloud Developer Advocate at Microsoft, specializing in artificial intelligence and machine learning. Since 2009 he has been the editor of the Revolutions blog, where he writes regularly about applications of data science with a focus on the R programming language. He is also a founding member of the R Consortium. Follow David on Twitter as @revodavid...


Tuesday June 5, 2018 10:00am - 10:30am
Garden Room (Kopp A-B-C) Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431