“Cloud computing” is a term that’s thrown around a lot today, but it simply means accessing your data and applications without on-site infrastructure, i.e., in the cloud. Data processing, storage and backup, maintenance, administration, and even troubleshooting are all taken care of by the service provider.
Some of us (like me) were skeptical when everything started being labeled as “cloud.” The thought of not having a trusted IT department maintain control of data and hardware was a little unsettling at first. But after considering the pros and cons of cloud computing (and also realizing that I use cloud services like Gmail and Salesforce.com every day without hesitation), the advantages became clear, even for business intelligence.
Implementing BI in the cloud is a dilemma for a lot of organizations we work with. They’re (rightly) concerned about data security, hardware failure, and anything that could take their reports offline, slow employees’ decision-making, or expose valuable information to the wrong people. Those are all concerns that have been, and continue to be, addressed by cloud providers. Data security and backups have certainly become paramount for vendors offering cloud services. And as massive cloud failures fade from the news and executives and IT managers grow more comfortable with the cloud, we’ve seen a shift: the cloud is becoming acceptable for business intelligence deployments. Here’s why:
The cloud offers access to data, applications and other resources without the need for program installation. This is a major convenience when working on a portable device like a laptop, tablet PC or smartphone. Not only are your devices free from the clutter of numerous installs (which makes better use of their resources), but your company’s IT team isn’t bogged down with installing, reinstalling, and troubleshooting software on every employee’s devices. And since many of us work remotely occasionally, if not exclusively, a lightweight approach to accessing data is truly beneficial.
In my previous posts on the subject of cloud computing and business intelligence, I’ve discussed building a dimensional model to make reporting easy, and I’ve identified the type of data extraction necessary to populate the reporting database. I’ve also concluded that automating the extraction and load jobs is the way to go. What’s left is the choice of vendors who can help make this happen as smoothly as possible.
One solution is to go with a single-vendor model for your business intelligence needs. If your transactional system is in the cloud, you would depend on that same vendor to provide the reporting database and the ETL jobs that fit your requirements. With BI in the cloud, the provider has to create a dimensional model that meets the needs of most of its subscribers. However, that data model may not meet all of your reporting requirements, and the data is still not in-house for easy analysis. To address this, the cloud vendor might have partners who can make a custom BI system in the cloud more practical; Informatica, for instance, offers an ETL solution for the Salesforce.com CRM to solve this very problem. Keeping BI in the cloud also means you’ll have to use the reporting tool picked for you by the cloud vendor. Additionally, you’ll have to assume that most, if not all, of your reporting needs will be met with the data in the cloud, and that you won’t ever need to enrich that information with locally gathered (i.e., non-cloud) data.
Since that kind of data uniformity is rare, a more practical choice is to treat the cloud as just another data source. This frees you from a single-vendor software stack and lets you pick the best vendor for each task in the pipeline.
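To make the “just another data source” idea concrete, here’s a minimal, self-contained sketch: records from a cloud system are extracted and staged in a local database, where they can later be joined with non-cloud data. The API payload is simulated inline, and every field and table name here is hypothetical, not any particular vendor’s schema.

```python
import json
import sqlite3

# Simulated extract: in practice this JSON would come from the cloud
# vendor's API; it is inlined here so the sketch is self-contained.
cloud_payload = json.loads("""
{"records": [
  {"Id": "001", "Name": "Acme",   "AnnualRevenue": 1200000},
  {"Id": "002", "Name": "Globex", "AnnualRevenue": 800000}
]}
""")

# Load the extracted records into a local staging table, where the
# transform step can join them with locally gathered (non-cloud) data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_accounts (id TEXT, name TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO stg_accounts VALUES (?, ?, ?)",
    [(r["Id"], r["Name"], r["AnnualRevenue"]) for r in cloud_payload["records"]],
)

# Once staged, the cloud data is queryable like any other local source.
total = conn.execute("SELECT SUM(revenue) FROM stg_accounts").fetchone()[0]
print(total)  # 2000000.0
```

The point of the staging table is that it decouples you from the cloud vendor: any ETL or reporting tool that can read a local database can now work with the cloud data.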
In my last post on the subject of cloud computing, I mentioned two ways to slice and dice data in the cloud: rely on query tools to extract data to a local database, or use a data warehouse to support the transactional system in the cloud. Today, I’ll delve deeper into these two choices for culling meaningful trends and KPIs from data in the cloud.
Whether or not a transactional system is moved to the cloud, the data it collects is still needed for analytical processing. A transaction processing system is optimized to capture specific transactions as efficiently as possible. Analytical data, on the other hand, has to be optimized for detecting trends in key performance indicators (KPIs). Business intelligence (BI) systems are usually built on the latter. When the transaction system is in-house, an Extract, Transform, and Load (ETL) process can be written to automate the transformation of data from its highly normalized transactional form into a denormalized analytical form.
Business Intelligence applications are often based on a denormalized version of transactional data. This is done mainly to:
- keep analytical processing from slowing down the transaction systems
- create “reporting friendly” databases that lend themselves to analysis
Traditionally, both transactional and analytical databases reside on hardware inside the company’s firewall, and when necessary, a BI report or chart can drill down from one system to the other transparently.
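As a rough illustration of the normalized-to-denormalized ETL step described above, here’s a minimal sketch using SQLite. The schema and table names are invented for the example; the idea is simply that the joins are resolved once, up front, into a “reporting friendly” table so that analytical queries never touch the transactional tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized transactional schema (hypothetical)
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id INTEGER, quantity INTEGER, amount REAL);
""")
cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Acme", "West"), (2, "Globex", "East")])
cur.executemany("INSERT INTO products VALUES (?, ?, ?)",
                [(1, "Widget", "Hardware")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
                [(1, 1, 1, 5, 50.0), (2, 2, 1, 3, 30.0)])

# The "T" and "L" of ETL: flatten the joins into one denormalized table
cur.executescript("""
CREATE TABLE sales_fact AS
SELECT o.id AS order_id, c.name AS customer, c.region,
       p.name AS product, p.category, o.quantity, o.amount
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN products  p ON p.id = o.product_id;
""")

# Analysts can now compute KPIs without slowing the transactional tables
rows = cur.execute(
    "SELECT region, SUM(amount) FROM sales_fact GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('East', 30.0), ('West', 50.0)]
```

In a real system the denormalized table would be a proper star-schema fact table with surrogate keys and conformed dimensions, and the load would run on a schedule, but the division of labor is the same.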
With cloud computing, this model gets more complicated. The current trend of moving to the Software as a Service (SaaS) model is centered on transaction processing. For example, Salesforce.com is a transactional system that gives users access to a Customer Relationship Management system in the cloud. In the past, the total cost of ownership meant smaller organizations could ill afford such systems, and instead resorted to maintaining their data in homegrown and/or Excel-based databases. The SaaS model allows an organization of any size to access and benefit from very sophisticated systems by subscribing on a named-user basis. Whether an organization has 10 or 1,000 sales reps, it can maintain a robust set of metrics at a very reasonable cost.