
September 12, 2024

Legacy Data Architectures Holding GenAI Back, WEKA Report Finds

(Gennady-Grechishkin/Shutterstock)

While large language models (LLMs) have kickstarted an exciting new phase in AI, companies are not able to satisfy their GenAI goals due to several factors, with poor quality data and legacy data architectures chief among them, a new report from WEKA says.

The “2024 Global Trends in AI” report found that 88% of organizations are investigating GenAI technology, which echoes the widespread interest in GenAI found in other surveys. The report, which WEKA commissioned S&P Global Market Intelligence to put together, found 24% of organizations have GenAI applications actively deployed, which is also in line with data from other surveys.

The adoption of GenAI technology “is exploding” and the deployment of GenAI applications is spreading fast, WEKA found, adding that it detected a “radical shift” from 2023 in the maturity levels of AI projects. A majority of the 1,500 global AI decision makers surveyed by S&P Global Market Intelligence indicate that AI is “currently widely implemented” and “driving critical value” for their organizations.

Where the positive narrative gets tripped up, however, is with scaling GenAI deployments. “The average organization has 10 projects in the pilot phase and 16 in limited deployment,” WEKA says in the report, “but only six deployed at scale.”

Data quality is the top impediment to AI success (Source: WEKA Global Trends in AI 2024)

WEKA identified several reasons for this situation. GPU availability is still constrained, for starters, and customers are concerned about the environmental footprint of AI infrastructure. Ensuring data privacy is another factor. But the biggest impediment to the full rollout of GenAI, WEKA says, is a lack of high-quality data.

“The challenge for project teams is not so much about identifying relevant data, but its availability,” WEKA says in its report. “Organizations are struggling to build a consistent, integrated data foundation for projects.”

Survey respondents identified the lack of modern data architectures as a big reason for the GenAI shortfall. More than one-third (35%) said storage and data management were the primary infrastructure issues hindering AI deployments, which exceeds concerns about compute (26%), security (23%) and networking (15%).

The data quality challenge is not due to a lack of data to build performant models, WEKA says, but to the data not being set up in a way that lets teams take full advantage of it. Data quality and data privacy were bigger concerns than the availability of data, it says.

Issues with data management and storage are impacting AI project lifecycles by making it more difficult for organizations to prepare data for training and deployment, WEKA says. Specifically, the data preprocessing stage is an area of big concern for organizations taking WEKA’s survey.

Legacy data management and storage practices are holding back AI, WEKA says 

What’s more, the data preprocessing situation has not improved over the past 12 months, which doesn’t bode well for future AI work, WEKA says. “Bringing AI projects live but limiting their value or extensibility with weak data foundations sets a poor precedent for the next wave of initiatives in the early stages of exploration,” it says in the report.

The company quotes anonymous IT leaders about the state of their data estates and how it’s impacting their AI work.

A CIO at a midsize American company in the trucking and warehousing space said his or her company still has challenges with master data management. “Branches had different SKUs for inventory; if I take that siloed data and put it into a model, we’ll get the wrong results. Cleaning up this data is our focus,” the CIO wrote.

Another CIO at a midsize food and beverage manufacturing company in the UK said that the first thing he or she did was “double down on data strategy, effectively building a data platform and governance capabilities around that,” according to the report. That helped the organization avoid the fate of other companies that have tried to bolt data management and governance on top of disparate data estates obtained through acquisition, the CIO wrote.

Organizations that have invested in data management and storage are more likely to have better outcomes with GenAI, the WEKA report says. “By building a solid data foundation at the outset, AI leaders have ensured that valuable pilots have a clear path to deliver at scale,” it says.

AI deployments are growing (Source: WEKA Global Trends in AI 2024)

For instance, just 28% of respondents at organizations with wide AI implementations say storage and data management challenges are their greatest inhibitors, compared to 42% of respondents with more limited AI implementations who say storage and data management are top issues. The former group says getting access to compute and networking resources is a greater impediment than data management and storage.

That suggests they have already invested in addressing those concerns, WEKA says. “Organizations that are delivering AI at scale appear to have focused on investing in upgrading the systems and technologies used to store or manage data,” it says.

There are a lot of factors that go into succeeding with GenAI. But considering that, at the end of the day, AI is a data-driven exercise, it makes sense that having one’s data house in order increases the odds of a good experience with AI.

“Organizations must establish a clear pathway for scaling AI projects into production, ensuring efficient data management and storage,” WEKA says. “It is crucial to invest in a strong data foundation before committing to high volumes of pilot projects. This will help enable seamless AI value delivery.”

You can download WEKA’s report here.
