On The Growth Of Apache Spark


Editor’s Note: Vaibhav Nivargi is the founder and chief architect of ClearStory Data, a data analytics service provider.

This week the fast-growing Apache Spark community is gathering in New York City to celebrate and collaborate on one of the most popular open source projects today.

Launched in U.C. Berkeley’s AMPLab in 2009, Apache Spark has caught on like wildfire over the last year and a half. Spark had more than 465 contributors in 2014, making it the most active project both within the Apache Software Foundation and among big data open source projects globally.

Early on, we at ClearStory Data bet on the cluster-computing platform rather than building our own software from scratch.

Spark’s in-memory, parallel processing runs programs up to 100X faster than Hadoop MapReduce in memory and up to 10X faster on disk. This allows dozens of data sources to be blended and harmonized at once.

According to Gartner, 73 percent of organizations will invest in big data by…
