Can Open Source bring Business Intelligence for the Masses?

Date: Saturday, April 01, 2006

Business Intelligence (BI) software products, open source or not, mainly focus on the developers and system integrators who build BI applications, and not much on the end users. The end user’s problem is pretty straightforward: “I am sitting on a mountain of data accumulated in various data silos in my organization, and I need help making sense of it; I need a tool that helps me derive actionable insights.” End users don’t want to write code, SQL, or MDX queries. They don’t want to mess with complex regression models or machine-learning data mining models. They want a tool that has some notion of the business questions they ask and helps them analyze their data in the context of those questions; hence the term business intelligence.

The “I” in BI should be about intelligence, not infrastructure. Most current BI technology focuses on the “infrastructure” components: database, OLAP, workflow, data mining, and reporting. But just by having these components in one place, you don’t become intelligent. You still need data models optimized for your specific types of analyses, you still need to find the appropriate data mining and statistical models for your problem space, and you still need the right visualizations to interactively analyze and publish your data in a way that mere mortals can understand. This is the real “I” in BI.

Open source BI software makes this difference especially visible, because open source by its very nature commoditizes the infrastructure of BI, pushing the proverbial “value up the stack” to the components that actually provide intelligence. Lately, there has been a lot of activity in the open source BI space; JasperReports, Pentaho, and BIRT are a few of the popular projects. What is critical, however, is to understand how all this activity relates to the common end users of BI.
End users want to use BI software to analyze different data assets spread across their organization, and to easily share the results of their analysis to facilitate better decision-making. To connect this need with all the activity in open source BI, one still needs an application development team to assemble the necessary components into a coherent BI application. This is a much better scenario than a few years ago, when you needed to purchase an expensive, proprietary enterprise software product just to get the process started. In that sense, open source is reducing the cost of application development, and hence making BI more accessible to the masses. However, given how fragmented the open source BI components are, the cost of putting all these components together can still be significant. Add to that the need to understand business requirements and the applicability of various models and algorithms to specific business questions, and this can still be a daunting task.

If we look at the typical BI software infrastructure, it usually boils down to four key technologies:

1) Database and ETL technology to load and build data marts.
2) Data mining/statistics technology to discover patterns in data.
3) OLAP technology to slice/dice and drill up and down through large datasets.
4) Data visualization technology to interact with each of the abovementioned technologies and easily publish and share the analyses.

Most of the current open source BI projects provide strong support for one or more of these technologies, but not all. Their main users are application developers who integrate the various components and software development kits (SDKs) to build their respective BI applications. But what if you could bypass this whole effort and have just one application that enabled easy integration of all these components? What if analysis and publishing were possible without having to write any code?
Then the process of analysis becomes significantly more accessible, and accessibility is the single most important factor in the adoption of any BI software. For instance, our open source project OpenI is essentially a data visualization technology, but to address usability, we made sure that with OpenI, end users can point at any OLAP cubes, RDBMS tables, or data mining models and start analyzing and publishing their results right away, all over the web and without writing any code or queries. With this approach, we see a much higher end user adoption rate than with approaches that require end users to work with an IT team to develop their reports and analyses. It is simple empowerment, but very powerful nevertheless.

If there is one BI software trend to look out for, it is usability. Not just BI, but open source projects in general have had a bad reputation for being hard to install, requiring too much configuration and integration, and above all, focusing too much on the builders rather than the actual end users. To bring BI to the masses, this must change.

What’s promising is that open source doesn’t work like enterprise software, where one company needs to own the entire space. Successful open source is all about building a community that spreads far beyond any one company. We are fortunate to have several noteworthy open source BI projects. It is time for these projects to collaborate, make open source BI more accessible, and start growing the community. That’s how open source BI is going to become real for the masses.

Sandeep Giri is Project Lead, openi.org and CEO, Loyalty Matrix, Inc. He can be contacted at sandeep@openi.org