IPython Notebook, Scala and Spark

10/31/2014 09:12:00 AM
I attended Big Data Tech Con this past week. Hadoop is still king of the Big Data world, but Spark is quickly becoming the tool of choice. I attended several very good classes on Spark at the conference. Some of them used Scala (via sbt) for the Spark exercises. One class focused on using Spark for machine learning, with the exercises in IPython notebooks. This was pretty cool for learning purposes. However, I much prefer working in Scala to Python. Fortunately, there's the IScala project, which allows you to run Scala code in IPython notebooks. For those interested in trying Spark/Scala notebooks, I put together an example project on GitHub at https://github.com/hohonuuli/sparknotebook. Here's a screenshot of the same code in Python and Scala running side by side:


