Tag: data pipeline
Building Continuous Data Observability at the Infrastructure Layer
Data is the lifeblood of business today, but getting it where it needs to go is hard, especially as data volumes grow. Data pipelines become the repeatable method for moving this digital crude, but monitoring the flows f Read more…
Monte Carlo Hits the Circuit Breaker on Bad Data
Data pipelines are critical conduits of information for data-driven companies. But what happens when the data in the pipeline becomes corrupted? In some situations, you want to immediately stop the flow of data, which is Read more…
Databricks Ships New ETL Data Pipeline Solution
Databricks today announced the general availability (GA) of Delta Live Tables (DLT), a new offering designed to simplify the building and maintenance of data pipelines for extract, transform, and load (ETL) processes usi Read more…
ETL Tool Apache Hop Graduates Incubator
Apache Hop, a metadata-driven data orchestration tool used to design and build pipelines, today emerged from incubator status and was named a Top-Level Project at the Apache Software Foundation, clearing the way for more Read more…
Inside AutoTrader UK’s Data Observability Pipeline
In the course of shifting its analytics estate to the cloud, AutoTrader UK has adopted many new tools and technologies, including BigQuery, Looker, and dbt, which have helped to democratize data access among users. Along Read more…
Bad Data Pipelines Costing Companies Big, Fivetran Finds
Stop us if you’ve heard this one before: Overworked data engineer builds faulty data pipeline, which leads to bad data, which leads to a bad outcome. It may be the same old song, but it’s also the current state of af Read more…
Hands-Off: Manual Data Integration Tasks Plummeting, Gartner Says
While the need to integrate data has never been greater, the addition of machine learning and other forms of automation is driving a large reduction in the number of manual data management tasks that human workers are re Read more…
The Data Mesh Emerges In Pursuit of Data Harmony
The data mesh is a new concept that’s emerging in big data circles. Similar in some respects to data fabrics, the data mesh provides a way to reconcile and hopefully overcome the challenges posed by previous data archi Read more…
Data Pros Are Maxed Out: Survey
The demands of the data-driven life are wearing on data professionals, according to survey results unveiled this week by data pipeline automation solution developer Ascend.io, which found that 96% of data professionals a Read more…
50 Years Of ETL: Can SQL For ETL Be Replaced?
It’s hard to imagine data warehousing without ETL (extract, transform, and load). For decades, analysts and engineers have embraced no-code ETL solutions for increased maintainability. Does this mean that Struct Read more…
Future Proofing Data Pipelines
Data pipelines are critical structures for moving data from its source to a destination. For decades, companies have used data pipelines to move information, such as from transactional to analytic systems. However, as Read more…
Prophecy Spins Up Low-Code Data Pipeline Tool
In recent years, the shortage of data engineers has at times exceeded the shortage of data scientists. To help close the gap, a Silicon Valley startup called Prophecy today unveiled a low-code data engineering tool that Read more…
What Is a Data Cloud? And 11 Other Snowflake Enhancements
Snowflake has changed how the industry thinks about data warehouses with its cloud-native offering, which has been adopted by 4,000 organizations, including 2,000 in the last year alone. Now the company is taking the con Read more…
Data Pipelines of a Higher Order
Data pipelines are being constructed everywhere these days, moving huge swaths of data for a wide variety of operational and analytical needs. There’s no doubt that all these data pipelines are keeping data engineers b Read more…
StreamSets Eases Spark-ETL Pipeline Development
Apache Spark gives developers a powerful tool for creating data pipelines for ETL workflows, but the framework is complex and can be difficult to troubleshoot. StreamSets is aiming to simplify Spark pipeline development Read more…
Databricks Donates Delta Code to Open Source
Databricks today announced that it's open sourcing the code behind Databricks Delta, the Apache Spark-based product it designed to help keep data neat and clean as it flows from sources into its cloud-based analytics env Read more…
Data Pipeline Automation: The Next Step Forward in DataOps
The industry has largely settled on the notion of a data pipeline as a means of encapsulating the engineering work that goes into collecting, transforming, and preparing data for downstream advanced analytics and machine Read more…
From Big Beer to Big Data: Inside AB InBev’s Digital Transformation
With more than 500 beer brands and $55 billion in sales, Anheuser-Busch InBev is already the world's biggest beer company. And if all goes as planned with its digital transformation project, it will be the best beer comp Read more…
Google Doubles Down on Cloud Data Migration
Data integration startups have become prime acquisition targets as cloud analytics vendors look to beef up their migration capabilities. With that in mind, Google Cloud announced this week it intends to acquire data m Read more…
Streamsets Gets $35M for DataOps
StreamSets, which bills itself as the "air traffic control" tasked with preventing collisions from occurring with big data, today announced that it raised $35 million, which it will use to continue building its data oper Read more…