Tag: Data engineering
Overcoming the Financial Implications of Poor Data Quality
Approaches to data quality vary from company to company. Some organisations put a lot of effort into curating their data sets, ensuring there are validation rules and proper descriptions next to each attribute. Others co…
Denodo Brings Data Management to Academia, the Cloud
Data management is hard, and it’s not getting easier, thanks to the continued tsunami of big data (a data-nami, as it were). One company that’s aiming to help with the continued digital onslaught is data virtualizati…
Astronomer’s High Hopes for New DataOps Platform
Astronomer last month rolled out a new observability product called Astro Observe that’s aimed at giving customers the full picture of how their data is flowing using Apache Airflow, the open source data orchestration…
Databricks Unveils LakeFlow: A Unified and Intelligent Tool for Data Engineering
Data engineering is a cornerstone for the democratization of data and AI. However, it faces significant challenges in the form of complex and brittle connectors, difficulty in integrating data from disparate and often pr…
Operational Data Warehouse: Power Real-Time Business Processes
Have you ever tried driving real-time updates to customer-facing applications? Or optimized a conversion funnel using streaming data? If so, you’ve probably suffered the pain of pushing your data warehouse past its l…
Data Observability in 2024: A Guide
In today's data-driven world, data observability is a critical concept for organizations aiming to effectively manage their data. Simply put, it means having the ability to constantly monitor and understand the status of…
Data Engineering in 2024: Predictions For Data Lakes and The Serving Layer
The data landscape experienced significant changes in 2023, presenting new opportunities (and potential challenges) for data engineering teams. I believe we will see the following this year in the areas of analytics, …
How Airflow 2.8 Makes Building and Running Data Pipelines Easier
Apache Airflow is one of the world’s most popular open source tools for building and managing data pipelines, with around 16 million downloads per month. Those users will see several compelling new features that help t…
Mastering Data Modeling: Insights from a Data Product Developer
Let's not deny it; we've all been captivated by the elegant symmetry of Data Products. If you haven't encountered them yet, you might be living under a rock. But don't worry, before we delve into advanced solutions, let'…
Snowflake Gives Everybody a Little Something at Summit
Whether you’re a data engineer building data pipelines, a data scientist creating AI models, or a CFO trying to minimize cloud spending, Snowflake gave you something today at its annual user conference in Las Vegas, Ne…
Numbers Station Sees Big Potential In Using Foundation Models for Data Wrangling
A startup called Numbers Station is applying the generative power of pre-trained foundation models such as GPT-4 to help with data wrangling. The company, which is based on research conducted at the Stanford AI Lab, has…
Meet Maxime Beauchemin, a 2023 Person to Watch
When it comes to prolific contributors to open source projects in the big data space, Maxime Beauchemin is definitely somebody you should know. As a data engineer at Airbnb, Beauchemin created multiple tools that he subs…
Top 12 Datanami Stories of 2022
With another year almost behind us, it’s time to sit back and consider what we’ve just been through. It’s been another active 12 months in the big data space, with plenty of news for the intrepid big data reader.
Dataiku 11.1 Update Boosts Data Science and MLOps
Dataiku has unveiled the latest update to its data science and machine learning platform, Dataiku 11.1. This update includes improvements to existing capabilities as well as new features for data scientists, ML engineers…
What’s Holding Up Progress in Machine Learning and AI? It’s the Data, Stupid
The lack of a solid data foundation and solid data workflows is preventing companies from making more progress with machine learning and AI, according to a new Forrester Consulting survey conducted on behalf of Capital O…
No-code Analytics System Redbird Nabs $7.6M Seed
Redbird, a New York-based enterprise analytics operating system, announced it has raised $7.6 million in an oversubscribed seed round. The Redbird platform allows non-technical users to automate and unify analytics wo…
Snowflake Reflects on 10 Years Passed, Ponders 10 Years Ahead
When Snowflake Computing was founded 10 years ago, the big data market looked much different than it does today. Momentum was building behind something called Hadoop, while cloud computing was viewed with suspicion. Desp…
Why DataOps-Centered Engineering is the Future of Data
DataOps will soon become integral to data engineering, influencing the future of data. Many organizations today still struggle to harness data and analytics to gain actionable insights. By centering DataOps in their proc…
Data Automation Poised to Explode in Popularity, Ascend.io Says
The amount of data automation deployed in the wild is quite small at the moment, but it’s set to grow significantly in the months to come as overworked data teams seek respites from grueling manual data tasks. That’s…
The Modernization of Data Engineering at Capital One
Like many enterprises, Capital One Financial Corp. is in the process of democratizing employee access to data to improve profitability, lower risk, and increase customer satisfaction. But lowering the barriers to data ac…