Birst Kicks Data Scientists to the Curb
If you ask Brad Peters, CEO of business intelligence vendor Birst, about data scientists in the enterprise, he will tell you there are just a few of them out there. Seven, to be exact, he says facetiously.
He contends that while many users of advanced business analytics need in-depth analytical talent, that need can be met with a platform that emphasizes ease of use, so that non-IT users can run and manage reporting themselves.
Peters noted that his company is focused on making that elusive data scientist unnecessary. “The problem with big data scientists is that there are seven of them. Most business users can’t do complex SQL or all that sort of stuff.” Now, clearly there exist more than seven data scientists in the world. But there is certainly a relative shortage of people who can translate cloud data to non-cloud business environments.
When putting together a report, it is difficult for business analysts to augment their own data with data that may be hosted by the company’s cloud service. Birst hopes to alleviate this problem with its Distributed Business Analytics platform, currently in beta testing. We caught up this week with Birst CEO Brad Peters to discuss the new system’s objective, method, and tentative results.
The San Francisco-based company, which was founded in 2004, has traditionally emphasized removing hardware and management complexities for these ease-of-use seekers by delivering its platform in the cloud. The latest release, version 5, also offers a software virtual appliance option. The company has been beating the big data drum this year in particular, adding in-memory capabilities that go a bit farther than its traditional database approach, as well as a connector for big data workloads.
Ease of use and cloud/non-cloud compatibility were the themes during our conversation with the CEO. Peters frequently brought up the plight of the non-IT end-user, discussing how difficult it was for someone who was not familiar with SQL to access data that was in the cloud. Birst is designed to sit atop the existing BI cloud infrastructure and directly connect the cloud to the users.
Distributed Business Analytics, which Peters says started its development six years ago, will provide ‘virtual sandboxes’ in which business analysts will, from their desktop, be able to mesh cloud data and non-cloud data, a task usually reserved for highly talented and trained data science professionals.
What makes the relationship between cloud and non-cloud so tenuous is the fact that cloud data can be written and stored in any number of languages, and that there can be almost no control over the data. Further, the cloud is of course ever-expanding, as anyone can put all sorts of datasets in there. Mucking about in that vast data warehouse has forced many to consider the much cleaner prospect of in-memory processing.
However, Peters does not feel the cloud should be disregarded and believes that Birst has solved this problem. The key, he says, is not in extracting or copying the data but in the ability to read the data and sift through it. According to Peters, Birst will be able to operate in any programming language, essentially being able to read and ask questions of other databases.
“Distributed Analytics,” says Peters, “is able to directly connect to those existing databases and query them directly. It has the ability to generate SQL queries, relational database queries and use their language to the fullest to do the calculation.” While Peters notes that the ability to do this is powerful, it is also something that he admits most vendors are capable of.
What separates Birst from other vendors, according to Peters, is that end users can write data into those databases and run calculations inside the databases that sit in the cloud. Again, this is achieved through recognizing and operating in the languages that non-company cloud databases are using.
Ultimately, and this is the aspect that Peters and Birst most want to stress, this would allow the data to be used without extracting or copying it. Peters stresses that point over and over again, as do the slides that accompany Birst’s press release. There is good reason for that: one of the advantages of the cloud was supposed to be giving everyone access to a vast amount of data. That exercise becomes pointless if the data has to be copied and stored within one’s own data warehouse before it can be used.
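To make the idea of generating SQL and letting the source database do the calculation concrete, here is a minimal sketch in Python. It is not Birst’s code: the table, the column names and both helper functions are hypothetical, and an in-memory SQLite database stands in for a remote cloud database. The point is only that the aggregation runs where the data lives and just the summarized rows come back.

```python
import sqlite3


def build_total_query(table: str, group_col: str, measure_col: str) -> str:
    """Generate an aggregate SQL statement for the source database to run.

    Hypothetical helper: identifiers cannot be bound as SQL parameters,
    so they are checked before being placed into the statement.
    """
    for name in (table, group_col, measure_col):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT {group_col}, SUM({measure_col}) AS total "
        f"FROM {table} GROUP BY {group_col} ORDER BY {group_col}"
    )


def run_in_place(conn: sqlite3.Connection, sql: str) -> list:
    """Execute the generated SQL in the database and return only the result rows."""
    return conn.execute(sql).fetchall()


# Stand-in for an existing cloud database (assumed sample data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

sql = build_total_query("sales", "region", "amount")
totals = run_in_place(conn, sql)  # one row per region, not the raw table
print(totals)
```

The raw sales rows never leave the database; only one summary row per region is returned to the caller.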
Peters claims that while most vendors can read all the data in the cloud, no other vendor can use it without extracting or copying it and no other vendor can allow business users to add to the non-company databases without going through great pains to do so. He may be right and if he is, that may well be a big success for Birst. But it is still unclear whether or not Birst can do what it claims. As Peters admits, the technology is still in beta testing and has yet to be released.
The beta tests, according to Peters, work as follows: a major piece of corporate data (finance, for example) is made accessible to about 50 business end-users. The end-users then work with that data alongside their own data in the sandbox, which Birst creates virtually by presenting cloud database data in the same environment as non-cloud data, and then distribute the results to their fellow analysts.
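The sandbox workflow described in the beta test can be illustrated with a small Python sketch. This is an assumption-laden stand-in, not how Birst builds its sandboxes: SQLite’s ATTACH DATABASE plays the role of the virtual sandbox, and the finance and budget tables with their values are invented for the example. The corporate table is queried in place while the analyst’s own figures live in a separate attached database.

```python
import sqlite3

# Stand-in for the shared corporate database (assumed sample data).
corp = sqlite3.connect(":memory:")

# Attach a separate in-memory database to act as the analyst's sandbox.
corp.execute("ATTACH DATABASE ':memory:' AS sandbox")

# Corporate finance data, owned by the company.
corp.execute("CREATE TABLE finance (dept TEXT, spend REAL)")
corp.executemany(
    "INSERT INTO finance VALUES (?, ?)",
    [("ops", 120.0), ("sales", 80.0)],
)

# The analyst's own data goes into the sandbox, not the corporate schema.
corp.execute("CREATE TABLE sandbox.budget (dept TEXT, budget REAL)")
corp.executemany(
    "INSERT INTO sandbox.budget VALUES (?, ?)",
    [("ops", 100.0), ("sales", 90.0)],
)

# Blend both sources in one query without copying the finance table.
variance = corp.execute(
    "SELECT f.dept, f.spend - b.budget AS variance "
    "FROM finance AS f JOIN sandbox.budget AS b ON b.dept = f.dept "
    "ORDER BY f.dept"
).fetchall()
print(variance)
```

Keeping the analyst’s table in its own attached database means the shared finance table stays untouched, which mirrors the separation between corporate data and personal data that the sandbox idea relies on.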
If these tests prove fruitful, Birst will be looking to release the product to the larger market later in the year. Peters, unsurprisingly, believes his company’s product will change the world of Business Intelligence. Analysts Michael Lock, Peter Ostrow, and David White of Aberdeen Group have lauded Birst’s efforts so far. So too has Hyoun Park of Nucleus Research.
While it remains to be seen whether Birst and its Distributed Business Analytics can do everything the company claims, it would at least be a minor victory if it can accomplish something no other vendor can.