Hadoop
Cloudera, the Hadoop-centric big data company that IPO’d in 2017 and then went private again in a $5.3 billion deal in 2021, is now putting its emphasis on becoming the…
Learn how the changing business culture helps data-driven early-stage startups at Data and the Culture Transformation
Change is afoot in the non-stop world of data collection and application, and if you’re a data-driven startup — or on your way to becoming one — TechCrunch and Cloudera…
At its Cloud Next event, Google today announced the launch of Spark on Google Cloud as a fully managed service. With this, the popular open source data processing engine will…
With buyout, Cloudera hunts for relevance in a changing market
When Cloudera announced its sale to a pair of private equity firms yesterday for $5.3 billion, along with a couple of acquisitions of its own, the company detailed a new…
Cloudera was once one of the hottest Hadoop startups, but over time the shine has come off that market, and today it went private as KKR and Clayton, Dubilier &…
Skymind Global Ventures launches $800M fund and London office to back AI startups
Skymind Global Ventures (SGV) appeared last year in Asia/UK as a vehicle for the previous founders of a YC-backed open-source AI platform to invest in companies that used the platform.…
Starburst raises $22M to modernize data analytics with Presto
Starburst, the company that’s looking to monetize the open-source Presto distributed query engine, today announced that it has raised a $22 million funding round led by Index Ventures, with the…
Datameer announces $40M investment as it pivots away from Hadoop roots
Datameer, the company that was born as a data prep startup on top of the open-source Hadoop project, announced a $40 million investment and a big pivot away from Hadoop,…
Databricks brings its Delta Lake project to the Linux Foundation
Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to…
Google brings Cloud Dataproc to Kubernetes
Cloud Dataproc is probably one of the lesser-known products in Google Cloud’s portfolio, but it’s a powerful tool for data wranglers who are looking for a fully managed cloud service…
With MapR fire sale, Hadoop’s promise has fallen on hard times
If you go back about a decade, Hadoop was hot and getting hotter. It was a platform for processing big data, just as big data was emerging from the domain…
Qubole, the data platform founded by Apache Hive creator and former head of Facebook’s Data Infrastructure team Ashish Thusoo, today announced the launch of Quantum, its first serverless offering. Qubole…
Cloudera and Hortonworks, two of the biggest players in the Hadoop big data space, today announced that they have finalized their all-stock merger. The new company will use the Cloudera…
Over the years, Hadoop, the once high-flying open-source platform, gave rise to many companies and an ecosystem of vendors. It was long believed that some major companies would emerge…
Investors place $25M on AtScale to get the big picture of big data
AtScale, a four-year-old startup that helps companies get a big-picture view of their big data inside their BI tools, announced a $25 million Series C investment today. The round…
Databricks releases serverless platform for Apache Spark along with new library supporting deep learning
Today to kick off Spark Summit, Databricks announced a Serverless Platform for Apache Spark — welcome news for developers looking to reduce time spent on cluster management. The move to simplify developer…
TL;DR: Cloudera’s recent IPO filing shows a company with steep losses and rapid revenue growth. Today we’ll examine Cloudera’s finances and where it fits into the current IPO universe. Why…
When I first met Cloudera CEO Tom Reilly in 2015 at the Intel Capital Summit, we were about to go onstage for a fireside chat to discuss, among other things,…
Yahoo, model Apache Spark citizen and developer of CaffeOnSpark, which made it easier for developers building deep learning models in Caffe to scale with parallel processing, is open sourcing a new project…
MXNet, Amazon Web Services’ preferred deep learning framework, was accepted to the Apache Incubator today. Admission to the incubator is the first step necessary for the open-source initiative to officially become part…
Talend, the big data integration vendor that went public last July, announced its winter release today with new tools to help automate data preparation, a sticky problem for enterprise customers. Surely, there…
IBM releases DataWorks to give enterprise data a home and a brain
While the gears of research are turning fast to develop new methods of machine intelligence, another, perhaps more impactful, trend is brewing in the field. Open source frameworks like Apache Spark are hitting their…
Latest Amazon Elastic MapReduce release supports 16 Hadoop projects
Amazon announced the release of Elastic MapReduce (EMR) 5.0.0 today, which includes, among other things, support for 16 open source Hadoop projects. As AWS continues to hone its various tools to…
Spark fragmentation undermines community
Today the Hadoop distribution war comes down to a final battle between Cloudera’s CDH and Hortonworks’ HDP. That wasn’t always the case. At the peak of the market’s fragmentation, numerous…
Microsoft bets on Apache Spark to power its big data and analytics services
Microsoft today announced that it is making a serious commitment to the open source Apache Spark cluster computing framework. After dipping its toes into the Spark ecosystem last year, the company…
Cray’s latest supercomputer runs OpenStack and open source big data tools
Cray has always been associated with speed and power, and its latest computing beast, the Cray Urika-GX system, has been designed specifically for big data workloads. What’s more, it runs…
Seven things to watch for at Strata + Hadoop World 2016 in San Jose
With Strata + Hadoop World kicking off, it’s always fascinating to step back and look at the contents of the sessions as a way of understanding what’s happening in the…
Newcomer Galactic Exchange can spin up a Hadoop cluster in five minutes
A new company with a cool name, Galactic Exchange, came out of stealth today with a great idea. It claims it can spin up a Hadoop cluster for you in five…
Altiscale’s latest cloud service brings Hadoop to business users
Altiscale, a company that has always been about reducing the complexity related to using Hadoop, has taken that to the next level today with the release of Altiscale Insight Cloud, a cloud…
LinkedIn open-sources its WhereHows data discovery and lineage portal
LinkedIn today open-sourced WhereHows, a metadata-centric tool the company has long used internally to make it easier for its employees to discover data the company generates and to track the…