Augmenting the Enterprise Data Warehouse: The Pros and Cons of OLAP Cubes
IT leaders in the enterprise are juggling two conflicting demands around their organization’s data architecture: the need for stability and security of business-critical data, traditionally stored in the data warehouse and delivered via cubes, and the demand to support new data sources coming into the business from emerging technologies or third-party applications. How can these two different but equally important demands be balanced?
The data warehouse is well established and business users rely on that data. The IT leader’s responsibility is to ensure that the data is available, secure, and meets compliance requirements. But as more and more business processes move online and into the cloud, the importance of external data sources outside of the data warehouse is increasing for functions like marketing, sales, operations, HR, and finance.
Third-party data sources are growing in number, and they don’t conform to the data warehouse model as easily as internal business data does: they can be unstructured, arrive in very high volumes, and change composition often depending on the context in which they are needed. The services themselves also evolve and add features, which adds to the complexity.
Whether because IT lacks the resources to bring these feeds into the data warehouse, or because their format and structure raise technical issues, much of this important data cannot be integrated and becomes “dark data”: data that is not managed or stored within the approved data architecture and is instead handled directly by departments using offline files or direct downloads.
The risks of this are clear
The problem is not just that this potentially business-critical dark data sits outside the management scope of IT. To work with dark data, business users also need to take data out of the warehouse and into offline systems to bring the sources together, and once it is there that data can no longer be treated as certified or accurate. For instance, spreadsheets are constantly exported for custom reporting and insights.
IT leaders are left with two challenges to solve: how to bring this dark data into the light so it is available for decision making, and how to do this in a way that complements existing warehouse architecture and policies. The answer is an augmented data warehouse using a system like Domo, which can run two architectures in parallel and knit them together through common interfaces.
Augmenting the data warehouse
Data warehouses that pipe data into OLAP cubes can provide analysis of well-structured data by pre-calculating common views and dimensions of the data. This frees up business intelligence and analytics capacity otherwise spent on common tasks. Automating these dimensions also reduces errors and makes managing and tracking hundreds or thousands of data sources feasible.
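To make the idea of pre-calculation concrete, here is a minimal Python sketch of what a cube does conceptually: it computes the totals for every combination of dimensions once, so later queries become simple lookups. The sales rows, the dimension names, and the build_cube helper are all hypothetical and chosen for illustration; this is not how any particular OLAP engine or Domo implements it.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows; names and values are illustrative only.
SALES = [
    {"region": "EMEA", "product": "A", "quarter": "Q1", "revenue": 100},
    {"region": "EMEA", "product": "B", "quarter": "Q1", "revenue": 50},
    {"region": "APAC", "product": "A", "quarter": "Q1", "revenue": 70},
    {"region": "APAC", "product": "A", "quarter": "Q2", "revenue": 30},
]
DIMENSIONS = ("region", "product", "quarter")


def build_cube(rows, dimensions, measure):
    """Pre-compute the sum of `measure` for every subset of `dimensions`."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                # The key is the row's values for the chosen dimensions.
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


cube = build_cube(SALES, DIMENSIONS, "revenue")

# Queries are now dictionary lookups rather than scans of the fact rows.
revenue_by_region = cube[("region",)]
apac_q2 = cube[("region", "quarter")][("APAC", "Q2")]
grand_total = cube[()][()]
```

The trade-off is visible even at this scale: every new dimension doubles the number of pre-computed views, which is why this approach suits stable, well-understood data better than feeds whose structure changes often.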
However, there are areas where cubes aren’t as beneficial. One is third-party data that needs to stay flexible because it is used only occasionally or ad hoc. Another is data whose applications are not yet known and that is better suited to exploration and agile problem-solving: for instance, when an organization needs to collect large volumes that may not be needed often, and it is not realistic to engineer this data into the cube or enterprise data warehouse format.
The resolution to this problem is to augment the existing data warehouse with a solution like Domo, which covers both major needs: the flexibility to handle data that isn’t suitable for the enterprise data warehouse, and all the controls (row-level access, usage reporting, and so on) that the enterprise data warehouse gives IT leadership over its data.
How this works in practice
In new architectures, Domo can replace OLAP cubes altogether while maintaining the same level or better data governance, security, and data lineage so IT can sleep better at night. Alternatively, it can run alongside an existing data warehouse and focus on different data sources that aren’t well suited to the existing systems.
Domo can then integrate with existing data management and cataloguing, with users able to discover this data in a universal way, through consistent tooling. From a security and compliance perspective, Domo provides row-level visibility of the data it handles and familiar reporting on usage, security, and user and group access to datasets, so certification and compliance can be achieved.
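As a rough illustration of what row-level access with usage reporting means, the Python sketch below filters a dataset by the requesting group and records each access. The policy table, group names, and visible_rows helper are hypothetical; they stand in for the general technique and do not represent Domo’s API or configuration.

```python
from collections import Counter

# Hypothetical dataset and policies; none of these names come from Domo.
ROWS = [
    {"region": "EMEA", "account": "Acme", "revenue": 100},
    {"region": "APAC", "account": "Globex", "revenue": 70},
    {"region": "EMEA", "account": "Initech", "revenue": 50},
]

# Each group maps column names to the set of values it is allowed to see.
# An empty policy means no restriction; a missing group means no access.
ROW_POLICIES = {
    "emea_sales": {"region": {"EMEA"}},
    "global_finance": {},
}

usage_log = Counter()  # counts dataset reads per group, for usage reporting


def visible_rows(rows, group, policies, log):
    """Return only the rows `group` may see, and record the access."""
    log[group] += 1
    if group not in policies:
        return []  # unknown groups see nothing by default
    policy = policies[group]
    return [
        row
        for row in rows
        if all(row.get(col) in allowed for col, allowed in policy.items())
    ]


emea_view = visible_rows(ROWS, "emea_sales", ROW_POLICIES, usage_log)
finance_view = visible_rows(ROWS, "global_finance", ROW_POLICIES, usage_log)
unknown_view = visible_rows(ROWS, "contractors", ROW_POLICIES, usage_log)
```

Denying access to groups without a policy keeps the default safe, and logging the attempt regardless of outcome gives IT a record of who tried to read which data.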
When it comes to accessing these data sources together, enterprises can bring data into their existing BI tool of choice. For customers already using a visualization tool from another vendor, Domo can use an ODBC driver to connect to their SQL server, and its write-back connectors allow bi-directional semantic transformations across databases. Domo provides the connected backend architecture, which can be further configured via a Java command line interface, or analogous tools, that engineers can use for deeper customization to fit more restrictive business logic and processes.
Summary
By solving the dark data problem in a way that works alongside an existing data warehouse, both IT and business users can achieve their goals without compromising.
Dark data represents a large percentage of existing data needs and is only likely to grow, so for any existing enterprise architecture where wholesale replacement of a cube-based warehouse is not an option, the priority should be to augment this architecture with a dedicated solution that handles dark data by design and can integrate with existing policies and workflows.