AtScale Launches Public Leaderboard for Evaluating Text-to-SQL Solutions
As the demand for natural language data queries continues to grow, so does the need for a standardized way to evaluate Text-to-SQL (T2SQL) solutions.
Despite rapid advancements in T2SQL technologies, the industry has struggled with inconsistent benchmarks. This lack of uniform standards has made it challenging for stakeholders to accurately assess and compare solution performance.
AtScale, a semantic layer platform, has announced an open, public leaderboard for T2SQL solutions, addressing the need for standardized and transparent evaluation of natural language query (NLQ) capabilities.
The launch of AtScale’s Text-to-SQL leaderboard comes at a time when the industry is experiencing a surge in T2SQL solutions, driven by advancements in GenAI. These improvements have made it easier for users to interact with databases using natural language. However, there is a lack of tools that can effectively evaluate and compare the performance of T2SQL solutions in handling various queries.
AtScale claims that the Text-to-SQL leaderboard offers developers, vendors, researchers, and other stakeholders a reliable tool to measure and compare T2SQL performance. The leaderboard is based on an industry-standard dataset, schema, and evaluation methods.
“AtScale’s leaderboard sets a new standard for transparency in Text-to-SQL evaluation,” said John Langton, Head of Engineering at AtScale. “By creating an open, objective framework, we’re enabling the industry to validate and improve solutions that make natural language data queries more accessible and reliable for everyone.”
A key feature of the Text-to-SQL leaderboard is its open benchmarking environment, which makes the benchmarking process transparent and reproducible.
AtScale has also provided a public GitHub repository that contains all the necessary resources for evaluating T2SQL systems, including a TPC-DS dataset, KPI definitions, evaluation questions, and scoring methods.
Additionally, the Text-to-SQL leaderboard uses evaluation metrics that account for the complexity of both the questions and the database schema, giving a clearer picture of how a solution performs as queries and data structures get harder.
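To illustrate how complexity-aware metrics differ from a single accuracy figure, the short Python sketch below groups evaluation results by question and schema complexity. It is an assumption-laden illustration only: the record layout, the "low"/"high" labels, and the function names are hypothetical and are not taken from AtScale's repository or its scoring method.

```python
from collections import defaultdict

# Hypothetical result records (not AtScale data): each evaluated question
# carries a question-complexity label, a schema-complexity label, and a flag
# saying whether the generated SQL produced the expected answer.
results = [
    {"question": "low", "schema": "low", "correct": True},
    {"question": "low", "schema": "high", "correct": True},
    {"question": "high", "schema": "low", "correct": False},
    {"question": "high", "schema": "high", "correct": False},
    {"question": "high", "schema": "high", "correct": True},
]


def accuracy_by_complexity(records):
    """Return {(question_complexity, schema_complexity): accuracy}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        key = (record["question"], record["schema"])
        totals[key] += 1
        hits[key] += int(record["correct"])
    return {key: hits[key] / totals[key] for key in totals}


def overall_accuracy(records):
    """Return the single headline accuracy across all records."""
    return sum(int(r["correct"]) for r in records) / len(records)


if __name__ == "__main__":
    print("overall:", overall_accuracy(results))
    for bucket, score in sorted(accuracy_by_complexity(results).items()):
        print(bucket, score)
```

In this made-up sample, the headline accuracy hides the fact that every miss falls on a high-complexity question, which is the kind of detail a per-bucket breakdown is meant to surface.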
Users also get access to a real-time performance tracker, which AtScale claims is an industry first. The tracker displays the scores of T2SQL solutions and each model's current standing, with the aim of encouraging developers to improve their solutions through healthy competition.
The leaderboard also promotes community collaboration by serving as a shared resource that welcomes feedback, insights, and collective efforts to improve T2SQL evaluations.
Transparency is a core theme of the leaderboard. Unlike many vendors that claim high accuracy without sharing their data or evaluation methods, AtScale's open-sourced benchmark and Text-to-SQL leaderboard provide a standardized and transparent framework.
Explaining the challenges of comparing Text-to-SQL solutions, AtScale shared in a blog post, “Vendors often publish results for Text-to-SQL systems without disclosing the data, schema, questions, or evaluation criteria used. While 90% accuracy sounds impressive, it is impossible to validate without this information.”
“Furthermore, it is not possible to compare one system to another without using the same inputs and evaluation criteria. To address this issue, we attempted to create an objective, quantitative method for evaluating and comparing Text-to-SQL systems.”
The launch of the leaderboard aligns with AtScale's broader offerings. The company's semantic layer platform simplifies data access and ensures consistency across various data sources. This expertise directly supports T2SQL solutions, as the semantic layer helps connect complex data with the natural language queries that T2SQL tools are designed to process.
Earlier this year, AtScale announced a major upgrade to its platform with the introduction of a Universal Semantic Hub. The addition of the Text-to-SQL leaderboard brings AtScale closer to its goal of improving how organizations interact with and leverage data across various tools and stakeholders.
The AtScale team said it plans to keep improving the benchmark and to make it "a robust source of truth for Text-to-SQL solutions." The company added that as its own T2SQL solutions mature, it will post its new results to the same leaderboard.