Hadoop

Cloudera, the Hadoop-centric big data company that IPO’d in 2017 and then went private again in a $5.3 billion deal in 2021, is now putting its emphasis on becoming the…

Cloudera launches its all-in-one SaaS data lakehouse

Change is afoot in the non-stop world of data collection and application, and if you’re a data-driven startup — or on your way to becoming one — TechCrunch and Cloudera…

Learn how the changing business culture helps data-driven early stage startups at Data and the Culture Transformation

At its Cloud Next event, Google today announced the launch of Spark on Google Cloud as a fully managed service. With this, the popular open source data processing engine will…

Google Cloud launches a managed Spark service

When Cloudera announced its sale to a pair of private equity firms yesterday for $5.3 billion, along with a couple of acquisitions of its own, the company detailed a new…

With buyout, Cloudera hunts for relevance in a changing market

Cloudera was once one of the hottest Hadoop startups, but over time the shine has come off that market, and today it went private as KKR and Clayton, Dubilier &…

Cloudera to go private as KKR & CD&R grab it for $5.3B

Skymind Global Ventures (SGV) appeared last year in Asia/UK as a vehicle for the previous founders of a YC-backed open-source AI platform to invest in companies that used the platform.…

Skymind Global Ventures launches $800M fund and London office to back AI startups

Starburst, the company that’s looking to monetize the open-source Presto distributed query engine, today announced that it has raised a $22 million funding round led by Index Ventures, with the…

Starburst raises $22M to modernize data analytics with Presto

Datameer, the company that was born as a data prep startup on top of the open-source Hadoop project, announced a $40 million investment and a big pivot away from Hadoop,…

Datameer announces $40M investment as it pivots away from Hadoop roots

Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to…

Databricks brings its Delta Lake project to the Linux Foundation

Cloud Dataproc is probably one of the lesser-known products in Google Cloud’s portfolio, but it’s a powerful tool for data wranglers who are looking for a fully managed cloud service…

Google brings Cloud Dataproc to Kubernetes

If you go back about a decade, Hadoop was hot and getting hotter. It was a platform for processing big data, just as big data was emerging from the domain…

With MapR fire sale, Hadoop’s promise has fallen on hard times

Qubole, the data platform founded by Apache Hive creator and former head of Facebook’s Data Infrastructure team Ashish Thusoo, today announced the launch of Quantum, its first serverless offering. Qubole…

Qubole launches Quantum, its serverless database engine

Cloudera and Hortonworks, two of the biggest players in the Hadoop big data space, today announced that they have finalized their all-stock merger. The new company will use the Cloudera…

Cloudera and Hortonworks finalize their merger

Over the years, Hadoop, the once high-flying open-source platform, gave rise to many companies and an ecosystem of vendors emerged. It was long believed that some major companies would emerge…

Cloudera and Hortonworks announce $5.2 billion merger

AtScale, a four-year-old startup that helps companies get a big-picture view of their big data inside their BI tools, announced a $25 million Series C investment today. The round…

Investors place $25M on AtScale to get the big picture of big data

To kick off Spark Summit today, Databricks announced a Serverless Platform for Apache Spark, welcome news for developers looking to reduce time spent on cluster management. The move to simplify developer…

Databricks releases serverless platform for Apache Spark along with new library supporting deep learning

TL;DR: Cloudera’s recent IPO filing shows a company with steep losses and rapid revenue growth. Today we’ll examine Cloudera’s finances and where it fits into the current IPO universe. Why…

Cloudera’s IPO will test unicorn valuations

When I first met Cloudera CEO Tom Reilly in 2015 at the Intel Capital Summit, we were about to go onstage for a fireside chat to discuss, among other things,…

Cloudera finally ready for the public stage

Yahoo, model Apache Spark citizen and developer of CaffeOnSpark, which made it easier for developers building deep learning models in Caffe to scale with parallel processing, is open sourcing a new project…

Yahoo supercharges TensorFlow with Apache Spark

MXNet, Amazon Web Services’ preferred deep learning framework, was accepted to the Apache Incubator today. Admission to the incubator is the first step necessary for the open-source initiative to officially become part…

MXNet accepted to the Apache Incubator

Talend, the big data integration vendor that went public last July, announced its winter release today with new tools to help automate data preparation, a sticky problem for enterprise customers. Surely, there…

Talend looks to ease big data prep with latest release

While the gears of research are turning fast developing new methods of machine intelligence, another, perhaps more impactful, trend is brewing in the field. Open source frameworks like Apache Spark are hitting their…

IBM releases DataWorks to give enterprise data a home and a brain

Amazon announced the release of Elastic MapReduce (EMR) 5.0.0 today, which includes, among other things, support for 16 open source Hadoop projects. As AWS continues to hone its various tools to…

Latest Amazon Elastic MapReduce release supports 16 Hadoop projects

Today the Hadoop distribution war comes down to a final battle between Cloudera’s CDH and Hortonworks’ HDP. That wasn’t always the case. At the peak of the market’s fragmentation, numerous…

Spark fragmentation undermines community

Microsoft today announced that it is making a serious commitment to the open source Apache Spark cluster computing framework. After dipping its toes into the Spark ecosystem last year, the company…

Microsoft bets on Apache Spark to power its big data and analytics services

Cray has always been associated with speed and power, and its latest computing beast, the Cray Urika-GX system, has been designed specifically for big data workloads. What’s more, it runs…

Cray’s latest supercomputer runs OpenStack and open source big data tools

With Strata + Hadoop World kicking off, it’s always fascinating to step back and look at the contents of the sessions as a way of understanding what’s happening in the…

Seven things to watch for at Strata + Hadoop World 2016 in San Jose

A new company with a cool name, Galactic Exchange, came out of stealth today with a great idea. It claims it can spin up a Hadoop cluster for you in five…

Newcomer Galactic Exchange can spin up a Hadoop cluster in five minutes

Altiscale, a company that has always been about reducing the complexity related to using Hadoop, has taken that to the next level today with the release of Altiscale Insight Cloud, a cloud…

Altiscale’s latest cloud service brings Hadoop to business users

LinkedIn today open-sourced WhereHows, a metadata-centric tool the company has long used internally to make it easier for its employees to discover data the company generates and to track the…

LinkedIn open-sources its WhereHows data discovery and lineage portal