Tag: Apache Spark
Google Brings Kubernetes Operator for Spark to GCP
Those looking to run Apache Spark on clusters managed with Kubernetes will be interested in the new Spark operator for Kubernetes unveiled by Google today. The software, which is in beta, will be supported on the Google Read more…
Google Updates Cloud Database, Developer Tools
Google unleashed a batch of updated tools this week aimed at cloud-based big data and storage options along with the beta release of a developer tool designed to ease use of Apache Spark with the R programming language. Read more…
Databricks Upgrades Spark Support, Adds ML Runtime
Databricks announced support this week for the latest version of Spark, integrating it into its enterprise analytics platform. Along with support for version 2.4 of the stream processing framework integrated as part of D Read more…
Databricks, Talend Expand Cloud Access to Spark
Databricks and Talend, the cloud data integration vendor, are joining forces to help data jockeys scale their integration efforts using the Apache Spark analytics engine hosted on Talend’s cloud. Databricks, the cre Read more…
Hot DataRobot Raises a Bundle
The bucks keep rolling in from technology investors pouring cash into machine learning and data science startups. The latest beneficiary is DataRobot, the machine learning automation vendor. Formed in 2012, the compan Read more…
New Open-Source Projects Emerge for Machine Learning
Two open-source projects contributed by Chinese tech giants Baidu and Tencent will focus on machine and deep learning advances with the long-term goal of making the AI technologies easier to use while advancing cloud ser Read more…
Anaconda: Data Science Exiting Hadoop for the Cloud
Data scientists are embracing cloud-native frameworks as they move on from on-premises data infrastructure previously dominated by Hadoop, concludes a survey on the state of data science. The shift is driven in part b Read more…
Databricks Open Sources MLflow to Simplify Machine Learning Lifecycle
Databricks today unveiled MLflow, a new open source project that aims to provide some standardization to the complex processes that data scientists oversee during the course of building, testing, and deploying machine le Read more…
Project Hydrogen Unites Apache Spark with DL Frameworks
The folks behind Apache Spark today unveiled Project Hydrogen, a new endeavor that aims to eliminate barriers preventing organizations from using Spark with deep learning frameworks like TensorFlow and MXNet. It's tou Read more…
Google Cloud Adds Cask Data
Leading cloud providers continue to snap up analytics startups with an eye toward expanding access to big data technologies. Cask Data, developers of an application platform that among other things integrates Hadoop and Read more…
Apache Zeppelin Launches Latest Data Science Notebook
ZEPL, the startup founded by the creators of interactive data analytics tool Apache Zeppelin, has moved its multi-tenant analytics platform out of beta, announcing its general availability this week. The platform is a Read more…
Top 3 New Features in Apache Spark 2.3
It's tough to find a big data project that's had as much impact as Apache Spark over the past five years. The folks at Databricks, who contribute heavily to Spark (along with the wider Spark community), are keeping the pr Read more…
Data Lakes Crest In Drive to Boost Quality
As more data moves to the cloud, the composition of data lakes is shifting to new sources such as NoSQL databases while cloud data repositories emerge amid hybrid deployments, according to a big data survey. The year- Read more…
The Data Science Behind Dollar Shave Club
Dollar Shave Club burst onto the men's hygiene scene in 2011 with a hilarious video and preposterous business plan: selling subscriptions for razor blades at a ridiculously low price. Six years later, the company keeps g Read more…
Databricks, Flush With Cash, Steers Spark at AI
Momentum around the Apache Spark cluster computing framework continues to build with the announcement of a hefty late-stage funding round that will help push the analytics platform and related artificial intelligence appli Read more…
Open Source Tool Emerges For Cyber Defense
As banks, hospitals and retailers continue to lose ground to hackers, the open source community has stepped into the fray with a cyber security project designed to bring advanced analytics to IT monitoring data. The incu Read more…
GigaSpaces Closes Analytics-App Gap With Spark
Data analytics and cloud vendors are rushing to support enhancements to the latest version of Apache Spark that boost streaming performance while adding new features such as data set APIs and support for continuous, real Read more…
IBM Bolsters Spark Ties with Latest SQL Engine
IBM is extending its commitment to Apache Spark as a key component of in-memory analytics with the latest release of its SQL engine for Hadoop. The new version of IBM Big SQL released last week also solidifies the com Read more…
NEC Claims Vector CPU Outperforms Spark
An arms race is shaping up in the machine-learning sector with the claim by NEC Corp. that its approach based on its vector processor accelerates data processing by more than a factor of 50 compared to the Apache Spark c Read more…
Spark’s New Deep Learning Tricks
Imagine being able to use your Apache Spark skills to build and execute deep learning workflows to analyze images or otherwise crunch vast reams of unstructured data. That's the gist behind Deep Learning Pipelines, a new Read more…