Set up |
Materialized View |
CREATE MATERIALIZED VIEW sales_view AS
SELECT country_name country, prod_name prod,
calendar_year year,
SUM(amount_sold) sale, COUNT(amount_sold) cnt
FROM sales, times, customers, countries, products
WHERE sales.time_id = times.time_id
AND sales.prod_id = products.prod_id
AND sales.cust_id = customers.cust_id
AND customers.country_id = countries.country_id
GROUP BY country_name, prod_name, calendar_year; |
|
Basic Model
Clause |
model [main]
[ reference models ]
[ partition by (<cols>)]
dimension by (<cols>)
measures (<cols>)
[ ignore nav ] | [ keep nav ]
[ rules
[ upsert | update]
[ automatic order | sequential order ]
[ iterate (n) [ until <condition>]
]
(<cell_assignment> = <expression> ...) |
This statement partitions data by country, so the
formulas are applied to data of one country at a time. Our sales fact data ends with 2001,
so any rules defining values for 2002 or later will insert new cells. The first rule
defines the sales of a video games called "Bounce" in 2002 as the sum of its
sales in 2000 and 2001. The second rule defines the sales for Y Box in 2002 to be the same
value they were for 2001. The third rule defines a product called "2_Products,"
which is simply the sum of the Bounce and Y Box values for 2002. Since the values for
2_Products are derived from the results of the two prior formulas, the rules for Bounce
and Y Box must be executed before the 2_Products rule. |
col country format a20
col prod format a20
SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country IN ('Italy','Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000],
sales['Y Box', 2002] = sales['Y Box', 2001],
sales['2_Products', 2002] = sales['Bounce', 2002] + sales['Y Box',2002])
ORDER BY country, prod, year; |
Positional Cell Reference Single cell access and upserts |
The value for the cell reference is matched to the
appropriate dimension based on its position in the expression. The DIMENSION BY clause of
the model determines the position assigned to each dimension: in this case, the first
position is product ("prod") and the second position is year. |
SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales['Bounce', 2000] = 10)
ORDER BY country, prod, year; |
Add a new cell |
The formula in the query above sets the year value to 2005 and thus creates a new cell in the array. |
SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Bounce', 2005] = 20 )
ORDER BY country, prod, year; |
Symbolic Cell Reference: MultiI-Cell Access
And Updates
|
The query, below, uses "symbolic cell
reference." With symbolic cell references, the standard SQL conditions are used
to determine the cells which are part of a formula. You can use conditions such as
<,>, IN, and BETWEEN. In this example the formula applies to any cell which has
product value equal to Bounce and a year value greater than 1999. The example shows how a
single formula can access multiple cells. |
SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales[prod='Bounce', year>1999] = 10)
ORDER BY country, prod, year; |
Positional And Symbolic Cell References In A Single Query |
Since our example data has no values beyond the year
2001, any rule involving the year 2002 or later requires insertion of a new cell. The same
applies to any new product name we define here. In the third formula we define a new
product '2_Products' for 2005, so a cell will be inserted for it. The first rule, for
Bounce in 2002, inserts new cells since it is positional notation. The second rule, for Y
Box, uses symbolic notation, but since there are already values for 'Y Box' in the year
2001, it updates those values. The third rule, for '2_Products' in 2005, is positional, so
it can insert new cells, and we see them in the output. |
SELECT SUBSTR(country,1,20)
country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view WHERE country IN ('Italy','Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Bounce', 2002] = sales['Bounce', year = 2001] ,
--positional notation: can insert new cell
sales['Y Box', year>2000] = sales['Y Box', 1999],
--symbolic notation: can update existing cell
sales['2_Products', 2005] = sales['Bounce', 2001] +
sales['Y Box', 2000] )
--positional notation: permits creation of new cell
--for new product
ORDER BY country, prod, year; |
Multi-Cell References On The Right Side Of
A Formula |
In the query above we use a BETWEEN condition to
specify multiple cells on the right side of the formula, and these are aggregated to a
single value with the MAX() function. |
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod,
year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales['Bounce', 2005] = 100 + MAX(sales)['Bounce',
year BETWEEN 1998 AND 2002])
ORDER BY country, prod, year; |
|
CV Function |
Use Left-Side Values In Right Side
Calculations |
The two CV() functions used in the formula return the
year dimension value of the cell currently referenced on the left side. When the left side
of the formula above references the cell 'Bounce' and 1999, the right side expression would resolve to:
sales['Mouse Pad', 1999] + 0.2 * sales['Y Box',
1999].
Similarly, when the left side references the cell 'Bounce' and 2000, the right side expression we would evaluate is:
sales['Mouse Pad', 2000] + 0.2 * sales['Y Box',
2000].
CV() function takes a dimension key as its argument. It is also possible to use CV()
without any argument as in CV() which causes positional referencing. Therefore the formula above can be written as:
s['Bounce', year BETWEEN 1995 AND 2002] = s['Mouse Pad', CV()] + 0.2
* s['Y Box', CV()] |
SELECT SUBSTR(country,1,20)
country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country ='Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales['Bounce', year BETWEEN 1995 AND 2002] =
sales['Mouse Pad', CV(year)] + 0.2 * sales['Y Box', CV(year)])
ORDER BY country, prod, year; |
Used In Expressions For Inter-Row
Calculations |
Note that the blank cells in the results are NULLs.
The formula results in a null if there is no value for the product two years earlier. None
of the products has a value for 1998, so in each case the 1999 growth calculation is NULL. |
SELECT SUBSTR(country,1,10)
country, SUBSTR(prod,1,10) prod, year, sales, growth_pct
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales, 0 growth_pct)
RULES (growth_pct[prod IN ('Bounce','Y Box', 'Mouse Pad'),
year BETWEEN 1998 and 2001] = 100 * (sales[CV(prod), CV(year)] -
sales[CV(prod), CV(year)-1]) / sales[CV(prod), CV(year)-1])
ORDER BY country, prod, year; |
|
Wildcard With ANY
Keyword |
Any Keyword Demo |
Note that in the MEASURES clause above, we use the
placeholder value of 0 when specifying the new measure growth_pct. Other numbers would
also work as placeholder values. This query gives the same results as the prior query
because the full data set ranges from 1998 to 2001, and that is the range specified in the
prior query.
ANY can be used in cell references to include all dimension values including NULLs. In
symbolic reference notation, we use the phrase "IS ANY". Note that the ANY
wildcard prevents cell insertion when used with either positional or symbolic notation. |
SELECT SUBSTR(country,1,10)
country, SUBSTR(prod,1,10) prod, year, sales, growth_pct
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales, 0 growth_pct)
RULES (growth_pct[prod IN ('Bounce','Y Box','Mouse Pad'),
ANY] = 100 * (sales[CV(prod), CV(year)] - sales[CV(prod), CV(year)-1]) /
sales[CV(prod), CV(year) -1])
ORDER BY country, prod, year; |
|
FOR Loops |
FOR LOOP To Create Specify New Cells |
By using positional notation on the left side of the
formulas, we ensure that cells for these products in the year 2005 will be inserted if
they are not already present in the array. This technique is bulky since it requires as
many formulas as there are products. If we have to work with dozens of products, it
becomes an unwieldy approach. With FOR we can reword this computation so it is concise yet
has exactly the same behavior.
The FOR construct can be thought of as a tool to make a single formula generate multiple
formulas with positional references, thus enabling creation of new cells (UPSERT
behavior). |
SELECT SUBSTR(country,1,20)
country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales[FOR prod IN ('Mouse Pad', 'Bounce', 'Y Box'),
2005] = 1.3 * sales[CV(prod), 2001])
ORDER BY country, prod, year; |
FOR LOOP To Range Over A Value Sequence |
Project sales values of Mouse Pad for the years 2005
to 2012 so that they are equal to 120% of the value in 2001 using the sequence generation
of the FOR construct.
This kind of FOR construct can be used for dimensions of numeric, date and datetime
datatypes. The increment/decrement expression of the FOR loop should be numeric for
numeric dimensions and can be numeric or interval for dimensions of date or datetime
types. |
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales['Mouse Pad', FOR year FROM 2005 TO 2012 INCREMENT 1]
= 1.2 * sales[CV(prod), 2001])
ORDER BY country, prod, year; |
|
Order Of Evaluation |
Automatic Ordering Of Rule Application |
By default, formulas are evaluated in the order they
appear in the MODEL clause.
The following query uses Automatic Order, so it recognizes that Bounce and Y Box sales
must be calculated before 2_Products sales. |
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country IN ('Italy', 'Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES AUTOMATIC ORDER (sales['2_Products', 2002] =
sales['Bounce', 2002] + sales['Y Box', 2002],
sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000], sales['Y Box',
2002] = sales['Y Box', 2001])
ORDER BY country, prod, year; |
Sequential Ordering Of Rule Application |
Same as the above but with sequential ordering. This
query uses Sequential Order, so it does not calculate the values for Bounce and Y Box
before 2_Products, and 2_Products is assigned as NULL. |
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country IN ('Italy','Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES SEQUENTIAL ORDER (sales['2_Products', 2002] =
sales['Bounce', 2002] + sales['Y Box', 2002],
sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000], sales['Y Box',
2002] = sales['Y Box', 2001])
ORDER BY country, prod, year; |
|
Null Measures And
Missing Cells |
|
By default, NULL cell measure values are treated the
same way as NULLs are treated elsewhere in SQL. Missing cells are treated as cells with
NULL measure value. |
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (sales['Mouse Pad', 2005] =
sales['Mouse Pad', 1999] + sales['Mouse Pad', 2004])
ORDER BY country, prod, year; |
|
Once a NULL has been found in a set of values, it can
cause many dependent calculations to result in NULLs. Therefore it can be more useful to
treat NULLS and missing values as non-NULL values. In this way, NULLs will not be
propagated through a set of calculations. You can use the IGNORE NAV option (NAV stands
for non-available values) to default NULLs and missing cells to the following values:
- 0 for numeric data
- empty string for character/string data
- 01-JAN-2001 for data type data
- NULL for other data types
Note that the default behavior is KEEP NAV which treats NULLs in the standard manner and
treats missing values as NULLs. When we add the IGNORE NAV option to the query above, it
returns a numeric value for sales even though the value for 2004 is missing.
|
SELECT SUBSTR(country,1,20)
country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country = 'Italy'
MODEL IGNORE NAV RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Mouse Pad', 2005] =
sales['Mouse Pad', 1999] + sales['Mouse Pad', 2004])
ORDER BY country, prod, year; |
|
Reference Models |
Calculate projected sales figures for two
different countries and show these projections in both the country currency and US dollars |
REFERENCE model_name ON (query)
DIMENSION BY (cols)
MEASURES (cols) [reference options] |
CREATE TABLE dollar_conv(country
VARCHAR2(30),
exchange_rate NUMBER);
INSERT INTO dollar_conv VALUES('Canada', 0.75);
INSERT INTO dollar_conv VALUES('Brazil', 0.34);
COMMIT;
col localsales format 9,999,999.99
col dollarsales format 999,999.99
SELECT SUBSTR(country,1,20) country, year,
localsales, dollarsales
FROM sales_view
WHERE country IN ( 'Canada', 'Brazil')
GROUP BY country, year
MODEL RETURN UPDATED ROWS
REFERENCE conv_refmodel ON (
SELECT country, exchange_rate AS er FROM dollar_conv)
DIMENSION BY (country) MEASURES (er) IGNORE NAV
MAIN main_model
DIMENSION BY (country, year)
MEASURES (SUM(sale) sales, 0 localsales, 0 dollarsales)
IGNORE NAV RULES (
/* assuming that sales in Canada grow by 22% */
localsales['Canada', 2005] = sales[CV(country), 2001] * 1.22,
dollarsales['Canada', 2005] = sales[CV(country), 2001] * 1.22 *
conv_refmodel.er['Canada'],
/* assuming that sales in Brazil grow by 28% */
localsales['Brazil', 2005] = sales[CV(country), 2001] * 1.28,
dollarsales['Brazil', 2005] =
sales[CV(country), 2001] * 1.28 * er['Brazil']); |
Working With Arrays Of Different
Dimensionality |
Calculate sales for Brazil and Canada, applying the
2005 growth figures and converting the values to US dollars. |
CREATE TABLE growth_rate(
country VARCHAR2(30),
year NUMBER, growth_rate NUMBER);
INSERT INTO growth_rate VALUES('Brazil', 2004, 17);
INSERT INTO growth_rate VALUES('Brazil', 2005, 28);
INSERT INTO growth_rate VALUES('Canada', 2004, 13);
INSERT INTO growth_rate VALUES('Canada', 2005, 22);
COMMIT;
SELECT SUBSTR(country,1,20) country, year, localsales, dollarsales
FROM sales_view
WHERE country IN ('Canada','Brazil')
GROUP BY country, year
MODEL RETURN UPDATED ROWS
REFERENCE conv_refmodel ON (
SELECT country, exchange_rate FROM dollar_conv)
DIMENSION BY (country c)
MEASURES (exchange_rate er) IGNORE NAV
REFERENCE growth_refmodel ON (
SELECT country, year, growth_rate FROM growth_rate)
DIMENSION BY (country c, year y)
MEASURES (growth_rate gr) IGNORE NAV
MAIN main_model
DIMENSION BY (country, year)
MEASURES (SUM(sale) sales, 0 localsales, 0 dollarsales) IGNORE NAV
RULES (
localsales[FOR country IN ('Brazil', 'Canada'), 2005] =
sales[CV(country), 2001] * (100+gr[CV(country), CV(year)])/100 ,
dollarsales[FOR country IN ('Brazil', 'Canada'),2005] =
sales[CV(country), 2001] * (100 + gr[CV(country), CV(year)])/100 * er[CV(country)])
ORDER BY country; |
|
ITERATE Models |
Iterate through and evaluate formulas a
specified number of times |
ITERATE (<number_of_iterations>) |
CREATE TABLE ledger (
account VARCHAR2(20),
balance NUMBER(10,2));
INSERT INTO ledger VALUES ('Salary', 100000);
INSERT INTO ledger VALUES ('Capital_gains', 15000);
INSERT INTO ledger VALUES ('Net', 0);
INSERT INTO ledger VALUES ('Tax', 0);
INSERT INTO ledger VALUES ('Interest', 0);
COMMIT;
SELECT account, bal
FROM ledger
MODEL IGNORE NAV
DIMENSION BY (account)
MEASURES (balance bal)
RULES SEQUENTIAL ORDER
ITERATE (100) (
bal['Net'] = bal['Salary'] + bal['Capital_gains']
- bal['Interest'] - bal['Tax'],
bal['Tax'] = (bal['Salary'] - bal['Interest']) * 0.38
+ bal['Capital_gains'] * 0.28,
bal['Interest'] = bal['Net'] * 0.30); |
|
ITERATE With An UNTIL Condition |
Until Condition Demo |
ITERATE (<number_of_iterations>) UNTIL
(some_condition_is_met>) |
col bal format 999,999.99
SELECT account, bal
FROM ledger
MODEL IGNORE NAV
DIMENSION BY (account)
MEASURES (balance bal)
RULES SEQUENTIAL ORDER
ITERATE (100) UNTIL (ABS(
(PREVIOUS(bal['Net'])-bal['Net'])) < 0.01)
(bal['Net'] = bal['Salary'] + bal['Capital_gains'] -
bal['Interest'] - bal['Tax'],
bal['Tax'] = (bal['Salary'] - bal['Interest']) * 0.38 + bal['Capital_gains'] *0.28,
bal['Interest'] = bal['Net'] * 0.30,
bal['Iteration Count'] = ITERATION_NUMBER + 1); |
|
ITERATION_NUMBER |
Returns an integer representing the completed iteration through the model rules. The ITERATION_NUMBER function returns 0 during the first iteration. For each subsequent iteration,
it returns the equivalent of iteration_number plus one. |
ITERATION_NUMBER( |
CREATE OR REPLACE VIEW sales_view_ref AS
SELECT country_name country, prod_name prod, calendar_year year,
SUM(amount_sold) sale, COUNT(amount_sold) cnt
FROM sales, times, customers, countries, products
WHERE sales.time_id = times.time_id
AND sales.prod_id = products.prod_id
AND sales.cust_id = customers.cust_id
AND customers.country_id = countries.country_id
AND (customers.country_id = 52779 OR customers.country_id = 52776)
AND (prod_name = 'Standard Mouse' OR prod_name = 'Mouse Pad')
GROUP BY country_name,prod_name,calendar_year;
set linesize 121
col country format a20
col prod format a20
SELECT country, prod, year, s
FROM sales_view_ref
MODEL
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale s)
IGNORE NAV
UNIQUE DIMENSION
RULES UPSERT SEQUENTIAL ORDER ITERATE(2)
(s['Mouse Pad', 2001 + ITERATION_NUMBER] =
s['Mouse Pad', 1998 + ITERATION_NUMBER])
ORDER BY country, prod, year; |
|
PRESENTNNV |
Returns expr1 when, prior to the execution of the model_clause, cell_reference exists and is not null. Otherwise it returns expr2. |
PRESENTNNV(<cell_reference>,
<expression1>, <expression2>) |
conn sh/sh
set linesize 121
col country format a20
col prod format a20
SELECT country, prod, year, s
FROM sales_view_ref
MODEL
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale s)
IGNORE NAV
UNIQUE DIMENSION
RULES UPSERT SEQUENTIAL ORDER (
s['Mouse Pad', 2002] =
PRESENTNNV(s['Mouse Pad', 2002], s['Mouse Pad', 2002], 10))
ORDER BY country, prod, year; |
|
PRESENTV |
Returns expr1 when, prior to the execution of the model_clause, cell_reference exists. Otherwise it returns expr2. |
PRESENTV(<cell_reference>,
<expression1>, <expression2>) |
conn sh/sh
set linesize 121
col country format a20
col prod format a20
SELECT country, prod, year, s
FROM sales_view_ref
MODEL
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale s)
IGNORE NAV
UNIQUE DIMENSION
RULES UPSERT SEQUENTIAL ORDER (
s['Mouse Pad', 2001] =
PRESENTV(s['Mouse Pad', 2000], s['Mouse Pad', 2000], 0))
ORDER BY country, prod, year; |
|