数据仓库辅导 | Data Warehousing and Business Intelligence (H9DW)
Note: All questions in the paper refer to the application of a data warehouse and/or business intelligence use cases within the broad scenario of restaurant sales. You should answer ALL questions in reference to the scenario described below.
An existing data warehouse is in place, with the following data model:
Presently, all food and beverage products sold are provided by a series of external suppliers. They company offer both cheap fast food as well as high-end gourmet food options. However, they have noticed from consumer feedback surveys that offering both food types within one establishment is confusing their clientele. Thus, they are now only offering fast food at half their restaurants and gourmet meals at the other half of their restaurants. This requires a redesign of their data warehouse, which is now no longer fit for purpose. In addition, they also wish to include additional sources of data:
Source 1: Trip Advisor Reviews linked specific to restaurant location(s) – they are interested in feedback in customer response(s) to differentiating their business lines into fast food and gourmet Source 2: Details about customers in purchasing food
Source 3: Socio-economic data from the Central Statistics Office pertaining to the wealth and poverty present at different restaurant locations.
They provide 3 additional requirements:
R1: The DW should differentiate between fast food and gourmet restaurants.
R2: The DW should be able to capture entire orders, not just individual line items. R3: They have a limited budget for this project, and require a timely implementation.
1. Foundations [24 Marks]
a) Which of Kimball’s requirements for a Data Warehouse would be violated through this
redesign, and why? [8 Marks]
b) Aside from violating core requirements, what are the 3 biggest technical or organisational
challenges you would expect to encounter? [6 Marks] Page 2 of 3
NCI DOWNLOAD 03.07.19 00:06:59
c) The company are willing to also consider changing their DW implementation; currently a federated architecture stemming from a recent merger. Compare and evaluate their current implementation with the “best” alternative option, noting which they should use and why. [10 Marks]
2. Modelling [33 Marks]
You are to extend the current data model to encapsulate the new change in business operations, and
include the additional sources of data. You may add, remove and modify dimensions as you see fit.
a) Highlight the 3 most critical issues to be resolved with the current star schema. [6 Marks]
b) Sketch the new star-join. [15 Marks]
c) Defend how your redesign meets requirements R1 and R2 with respect to affording decision
support for the company. [8 Marks]
d) Provide 2 sample reporting BI queries that the new schema can answer [4 marks]
3. ETL [28 Marks]
Propose the ETL strategy to populate the data warehouse data model proposed in question 2. Accordingly, answer the following questions:
a) Provide 1 sample row of the fact table and 2 dimensions in your star-join and note the source data used to populate each column of each table. [6 Marks]
b) Describe how you will extract relevant data from the 3 new sources. [6 Marks]
c) Describe (with explicit examples in reference to your sample rows) how you will transform
the data for the data warehouse. [8 Marks]
d) What methods of cleaning and consistency checking will you perform? [4 Marks]
e) How will you load the data warehouse for the first and subsequent monthly loads? [4
4. Ethics [15 Marks]
The data warehouse has now been suitably extended. It provides data for a predictive analytics application that seeks to predict the probability that a customer will complain. On the basis of this probability, the company would like staff to decide whether or not to serve a customer. Discuss the ethics of such data-driven decision making in the context of moral agency. You should identify and discuss different types of employee in your answer.