In my previous posts on the subject of cloud computing and business intelligence, I've discussed building a dimensional model to make reporting easy and I've identified the type of data extraction necessary to populate the reporting database. I've also decided that automating the extraction and load jobs is the way to go. What's left is the choice of vendors who can help make this happen as smoothly as possible.
One solution is a single-vendor model for your business intelligence needs. If your transactional system is in the cloud, you would depend on that same vendor to provide the reporting database and the ETL jobs that fit your requirements. With BI in the cloud, the provider has to create a dimensional model that meets most of its subscribers' needs. However, that data model may not meet all of your reporting requirements, and the data is still not in-house for easy analysis. To make a custom BI system in the cloud more practical, the cloud vendor might rely on partners; for instance, Informatica offers an ETL solution for the Salesforce.com CRM to solve this very problem. Leaving BI in the cloud also means using the reporting tool the cloud vendor picked for you. Additionally, you have to assume that most, if not all, of your reporting needs will be met with the data in the cloud and that you won't ever need to enrich that information with locally gathered (i.e., non-cloud) data.
Since that kind of data uniformity is rare, a more practical choice is to treat the cloud as just another data source. This frees you from a single-vendor software stack and lets you pick the best vendors for the job. In this model, the tools you choose should:
- Communicate with cloud and non-cloud data sources through an open API;
- Offer built-in scripting and automation-friendly functionality;
- Be data agnostic, so multiple data sources can be combined;
- Provide a single, elegant design tool that requires minimal coding.
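To make the "data agnostic" requirement concrete, here is a minimal sketch of treating the cloud as just another data source: rows pulled from a cloud API are joined with locally gathered attributes before they ever reach a report. The function and field names are illustrative only; a real job would call the cloud vendor's published web-services API instead of the stub shown here.

```python
import sqlite3

# Hypothetical stand-in for a cloud extract; a real implementation
# would call the vendor's web-services API (names are illustrative).
def fetch_cloud_opportunities():
    return [
        {"account_id": "A1", "amount": 5000},
        {"account_id": "A2", "amount": 12000},
    ]

# Local (non-cloud) reference data, here in an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id TEXT, region TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("A1", "East"), ("A2", "West")])

# Treat the cloud as just another source: enrich each cloud row with
# a locally stored attribute before loading the reporting database.
regions = dict(conn.execute("SELECT account_id, region FROM accounts"))
enriched = [dict(row, region=regions[row["account_id"]])
            for row in fetch_cloud_opportunities()]
print(enriched)
```

The point of the sketch is the shape of the flow, not the specific tools: once both sources are reduced to plain rows, combining them is trivial, which is exactly what a data-agnostic toolset buys you.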
Does 'best of breed' mean all these functionalities have to be decoupled? Not necessarily. Of course, one can get the best ETL tool and the best dashboarding product and write the code to tie them all to the best databases, but there is another (and less expensive) way: pick a BI vendor that understands the above requirements and provides the plumbing in addition to a strong analytic product that can handle the heavy lifting of a BI system.
What I'm suggesting is to look at the BI toolset and make sure it can not only access data via traditional adapters such as ODBC, SAP, and Essbase, but also produce and consume Web Services through a published API (#1). Make sure the BI product you consider allows for automation and the triggering of scripted jobs. This lets it invoke the batch jobs necessary to extract data from any source (including the cloud) and hand off the answer sets to the load procedures that populate the reporting databases, regardless of data source (#2 and #3). Finally, let the designers of these systems stay within the same GUI toolset as much as possible, to minimize the learning curve and maximize uniformity (#4).
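The extract-and-load handoff described above can be sketched as a small scripted job, the kind a scheduler (or the BI tool's own automation feature) would trigger nightly. Everything here is an assumption for illustration: the extract is a stub standing in for a web-service call, and the table name and schema are invented.

```python
import sqlite3

# Hypothetical web-service extract; a real job would call the
# published API and receive an answer set (payload is illustrative).
def extract_answer_set():
    return [("2023-01", 150), ("2023-02", 175)]

def load_reporting_db(conn, rows):
    """Idempotent load step, safe to re-run if a scheduled job retries."""
    conn.execute("CREATE TABLE IF NOT EXISTS monthly_sales "
                 "(month TEXT PRIMARY KEY, units INTEGER)")
    conn.executemany(
        "INSERT OR REPLACE INTO monthly_sales VALUES (?, ?)", rows)
    conn.commit()

# The whole job is one script: extract, then load. That is what makes
# it automatable from a scheduler with no human in the loop.
conn = sqlite3.connect(":memory:")
load_reporting_db(conn, extract_answer_set())
print(conn.execute("SELECT COUNT(*) FROM monthly_sales").fetchone()[0])
# prints 2
```

Making the load idempotent (here via `INSERT OR REPLACE` on a primary key) is the design choice that keeps unattended, scheduled re-runs from duplicating rows in the reporting database.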
We've all seen fancy dashboard demos and heard plenty of BI vendors advocate end-user empowerment through ad-hoc reporting. While it's important for BI tools to have a pretty front end, the discussion above highlights the importance of the plumbing in a robust end-to-end product, and the role BI vendors play in empowering their customers to build reporting systems that make all data accessible regardless of its location and type. Cloud computing doesn't have to mean expensive or daunting reporting issues down the road, as long as the BI infrastructure is based on open standards and built-in features that simplify the extract and load processes.
Have a question about BI in the cloud? Leave me a comment!