Tag: Yahoo!

This Week in Graph and Entity Analytics

Yahoo, Cambridge Semantics, and Ravel Law are among the companies making news this week in the burgeoning field of graph and entity analytics. Wouldn't it be nice to automatically extract meaningful information from t Read more…

Yahoo Shares Algorithm for Identifying ‘NSFW’ Images

Yahoo is releasing the deep learning algorithm that it uses to detect "not safe for work" (NSFW) images to the open source community, the Web giant announced last week. Anywhere from 4% to 30% of the Internet is compo Read more…

Yahoo’s New Pulsar: A Kafka Competitor?

Yahoo today announced that it's open sourcing Pulsar, a new distributed "publish and subscribe" messaging system designed to be highly scalable while maintaining low levels of latency. The bus already backs some of Yaho Read more…

Happy Birthday, Hadoop: Celebrating 10 Years of Improbable Growth

It's hard to believe, but the first Hadoop cluster went into production at Yahoo 10 years ago today. What began as an experiment in distributed computing for an Internet search engine has turned into a global phenomenon Read more…

Inside Yahoo’s Super-Sized Deep Learning Cluster

As the ancestral home of Hadoop, Yahoo is a big user of the open source software. In fact, its 32,000-node cluster is still the largest in the world. Now the Web giant is souping up its massive investment in Hadoop t Read more…

Yahoo Casts Real-Time OLAP Queries with Druid

Yahoo is in the process of implementing a big data tool called Druid to power high-speed real-time queries against its massive Hadoop-based data lake. Engineers at the Web giant say the open source database's combination Read more…

Yahoo! Spinning Continuous Computing with YARN

YARN was the big news this week, with the announcement that the Hadoop resource manager is finally hitting the streets as part of the Hortonworks Data Platform (HDP) “Community Preview.” According to Bruno Fernandez-Ruiz, who spoke at Hadoop Summit this week, Yahoo! has been able to leverage YARN to transform the processing in its Hadoop cluster from simple, stodgy MapReduce to a nimble micro-batch processing engine, a change the company refers to as “continuous computing.” Read more…

Baldeschwieler: Looking at the Future of Hadoop

Hadoop has come a long way, and with projects currently underway, it’s got plenty of fuel to drive enterprise innovation for years to come, said Hortonworks co-founder and CTO Eric Baldeschwieler in his recent Hadoop Summit keynote in Amsterdam, Netherlands. Read more…

Putting Some Real Time Sting into Hive

A coalition of Hive community enthusiasts reports that it has achieved a 45x performance increase for Apache Hive through an effort it has branded “The Stinger Initiative.” The group says it is aiming for a 100x improvement. Read more…

Apache Hadoop 2.0.3-Alpha Released With Future Outlook

The next generation of the Apache Hadoop open-source software framework has been given an alpha release and set free in the wild, delivering the next major milestone for the Apache Hadoop community. Read more…

Yahoo’s Genome Brings Data as a Service

Yahoo has lifted the lid on its BigQuery rival, Genome, which provides a data-as-a-service model for companies to comb through terabytes of Yahoo and partner network data in real time to look for trends for more targeted advertising and, hopefully, other purposes as it.... Read more…

BigDATAwire