Key Concepts Refresher
Structured Query Language (SQL) is a widely used database programming language standardized by the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO). The standard is often referred to as ANSI SQL.
Despite the existence of these standards, most SQL vendors extend ANSI SQL to provide enhanced capabilities for their individual database products. Microsoft uses an extended version called Transact-SQL (T-SQL), and Oracle uses PL/SQL.
SAP HANA provides SQLScript, which is an extension of ANSI SQL, to provide extra capabilities for the unique in-memory components that SAP deliver. We have already seen many of these features—for example, column tables, parameterized information views, delta buffers, working with multiple result sets in parallel, and built-in currency conversions. The ANSI SQL standard does not provide these features. However, SAP HANA supports ANSI SQL statements.
In this chapter, we provide only a short introduction to SQL and SQLScript, and we do assume some knowledge of SQL.
You can find the full SAP HANA SQL and SQLScript reference guides at http://help.sap.com/hana_platform.
SQL
SQL uses a set-oriented approach for working with data.
[+] Tip
Rather than working with individual records, you should focus on working with the entire result set that consists of many data records.
You specify what should happen to all the data records in the result set at once. This allows databases to be efficient when processing data, because they can parallelize the processing of these records. Databases are faster at processing data than application servers, becacause SQL is a set-oriented declarative language, whereas application servers typically use procedural or object-oriented languages that process records one at a time.
SQL Language
SQL is used to define, manipulate, and control database objects. These actions are specified by various subsets of the language:
- Data Definition Language (DDL)
This is used, for example, toCREATE
,ALTER
(modify),RENAME
, orDROP
(delete) table definitions. - Data Manipulation Language (DML)
This is used to manipulate data—for example, insert new data into a table or update the data in a table. Examples in SQL includeINSERT
,UPDATE
, andDELETE
. - Data Query Language (DQL)
This is used toSELECT
(access) data. It does not modify any data. - Data Control Language (DCL)
This is used for data security—for example, toGRANT
privileges to a database user. - Transaction Control Language (TCL)
This is used to control database transactions—for example, toCOMMIT
orROLLBACK
database transactions.
SQL is also subdivided into several language elements, as listed in Table 8.1.
Element | Short Description |
Names of database objects—for example, a table called |
|
Specify the characteristics of the data stored in the database; for example, a string
can be stored as a |
|
Use to assign, compare, or calculate values—for example, |
|
Clauses that produce result sets consisting of either scalar values or tables consisting
of columns and rows of data. Examples include aggregations like |
|
Specify conditions that can be evaluated to |
|
Retrieve data from the database based on specific criteria. Use the |
|
Commands to create persistent effects on schemas or data and control transactions, program flow, connections, and sessions. |
|
Provide methods to evaluate expressions. Can be used anywhere an expression is allowed.
Various functions are built into SQL—for example, |
Table 8.1 Elements of SQL Language
There are a few important points to remember about SQL when working with SAP HANA:
- Identifiers can be delimited or undelimited, meaning the names of database objects are enclosed in double quotes or not. If you
do not enclose identifiers with quotes, they are treated as uppercase. Therefore,
BOOKS
,Books
, andbooks
all reference the same table name.Delimited identifiers are treated as case-sensitive and can include characters in the name that would be forbidden by undelimited identifiers. SAP HANA uses Unicode, so the range of characters in a name can be wide. In this case, table
"BOOKS"
and table"books"
are treated as two separate tables. NULL
is a special indicator used by SQL to show that there is no value.- The
LIKE
predicate can be used for SQL searches.SELECT author FROM books WHERE title LIKE ‘%HANA%’;
will search for all books with the word HANA in the title. We will look at this in more detail in Chapter 10. - Statements in SQL include the semicolon (
;
) statement terminator. This is a standard part of SQL grammar. Insignificant whitespace is generally ignored in SQL statements and queries and can be used to make such statements and queries more readable.
We’ll now look at some SQL statements that correspond to SAP HANA modeling constructs that you’re already familiar with.
Creating Tables
The short version of the SQL syntax to a create a table is as follows:
CREATE [<table_type>] TABLE <table_name>;
<table_type> ::= COLUMN | ROW | HISTORY COLUMN | GLOBAL TEMPORARY |
GLOBAL TEMPORARY COLUMN | LOCAL TEMPORARY | LOCAL TEMPORARY COLUMN |
VIRTUAL
This means that you have to type “CREATE TABLE BOOKS;” to create an empty table called
BOOKS
. In this example, <tablename>
is BOOKS
. You can choose what to type in place of the <tablename>
placeholder.
Anything in square brackets, like [<table_type>]
, is optional. If you want to type something there, choose from the following list.
If you want your table to be a column table, type “CREATE COLUMN TABLE BOOKS;”, for
example. If you do not specify the type of table, the system will create a row table,
because that’s what ANSI SQL expects. The following are important table types to keep in mind:
HISTORY COLUMN
TheHISTORY COLUMN
table is a special SAP HANA table that stores historical database states. You can execute specialSELECT
statements with a “time-travel clause” on these types of tables to restore to a certain point in time. We can therefore read a table as it was at any point in time. This can be useful for tables that refer to financial transactions and can completely change the way that systems perform period closings.GLOBAL TEMPORARY
andGLOBAL TEMPORARY COLUMN
These are temporary tables that are persisted in the database, even between sessions. Any user with the right privileges can read the table definition (metadata). The data in theGLOBAL TEMPORARY
table can only be inserted and read by the owner of the table. When the session ends, the table is truncated (emptied).LOCAL TEMPORARY
andLOCAL TEMPORARY COLUMN
These are temporary tables that are only available for the current session. No other user can read the table definition (metadata) or the data in this table. The data in the local temporary table can only be inserted and read by the owner of the table. When the session ends, the table is dropped (deleted).
Reading Data
The query—the most common operation in SQL—uses the SELECT
statement—for example:
SELECT [TOP <unsigned_integer>] [ ALL | DISTINCT ]
<select_list> <from_clause> [(<subquery>)] [<where_clause>]
[<group_by_clause>] [<having_clause>] [<order_by_clause>]
[<limit_clause>] [<for_update_clause>] [<time_travel_clause>];
One of the shortest statements that can be used to read data from table BOOKS
is SELECT * FROM BOOKS;
. This returns all the fields and data from table BOOKS
, in effect putting the entire table in the result set. However, we recommend never
using SELECT *
especially not with column tables. Because SAP HANA uses primarily column tables,
this type of SELECT
statement should be avoided.
A better way is to specify the <select_list>
—for example, SELECT author, title, price FROM BOOKS;
. In this case, you only read the author, title, and price columns for all the books.
Note the [<time_travel_clause>]
for use with HISTORY COLUMN
tables.
Filtering Data
In SAP HANA, we sometimes use a filter, by using the WHERE
clause as part of a SELECT
statement—for example:
SELECT author, title, price FROM BOOKS WHERE title LIKE ‘%HANA%’;
Creating Projections
Inside a projection node in SAP HANA views, you can rename fields. This is easily
achieved via the AS
keyword in SQL—for example:
SELECT author, title AS bookname, price FROM BOOKS;
The title
field is now renamed to bookname
in the result set.
Creating Calculated Columns
Creating calculated columns looks almost the same as renaming fields, but there is an added formula, as shown:
SELECT author, title, price * discount AS discount_price FROM BOOKS;
In this case, we added a new calculated column called discount_price
.
Created Nested Reads and Aggregates
In SAP HANA views, we often have many layers; in SQL, we use nested SELECT
statements. These nested queries are also known as subqueries. An example would be as follows:
SELECT author, title, price FROM BOOKS
WHERE price > (SELECT AVG(price) FROM BOOKS);
The SELECT AVG(price) FROM BOOKS
statement returns the average price of all the books in table BOOKS
. This helps return the books that are more expensive than the average price.
You can create aggregates, like with an aggregation node in a calculation view, using standard aggregation functions in SQL. We discussed aggregation in Chapter 4.
Creating Joins
SAP HANA views normally have many joins. Here is an example of how to create an inner join between two tables:
SELECT BOOKS.author, BOOKS.title, BOOKS.price,
BOOK_SALES.total_sold FROM BOOKS
INNER JOIN BOOK_SALES
ON BOOKS.book_number = BOOK_SALES.booknumber;
The field on which both tables are joined is the book_number
field.
Creating Unions
Union nodes in calculation views can be performed by using the UNION
operator—for example:
SELECT author, title, price FROM BOOKS
UNION
SELECT author, title, price FROM BOOKS_OUT_OF_PRINT;
You can also use the UNION ALL
operator instead of UNION
. The UNION
operator returns only distinct values, whereas UNION ALL
will return all records, including duplicate records.
Conditional Statements
You can use the if()
function in an expression in which you create a calculated column in SAP HANA. You
can also handle if-then logic in SQL via the CASE
statement—for example:
SELECT author, title, price,
CASE WHEN title LIKE '%HANA'%' then 'Yes' ELSE 'No' END
AS interested_in_book FROM BOOKS
Here, we have added a calculated column called interested_in_book
to our query. If the book’s title contains the word HANA, we indicate that we are interested in it.
SQLScript
SQLScript exposes many of the memory features of SAP HANA to SQL developers. Column tables, parameterized information views, delta buffers, working with multiple result sets in parallel, built-in currency conversions at the database level, fuzzy text searching, spatial data types, and predictive analysis libraries makes SAP HANA unique. The only way to make this functionality available via SQL queries, even from other applications, is to provide it via SQL extensions.
We’ve come across many of these features in earlier chapters. In the next chapter (Chapter 9), we will look in more detail at the fuzzy text search, spatial functions, and predictive libraries in SAP HANA. We also will learn more about the Application Function Library (AFL) that include predictive and business functions that can be used with SQLScript.
In Chapter 4, we saw the spatial join type, used between a ST_POINT
data type (like a location) and an ST_POLYGON
(like a suburb). The spatial functions can be called from SQLScript to access and
manipulate spatial data.
SAP HANA also has to cater for SAP-specific requirements—for example, limiting data
to a specific “client” (e.g., 100) in the SAP business systems. In SAP HANA, you can
use a SESSION_CONTEXT(CLIENT)
function in the WHERE
clause of a query to limit data in a session to a specific client.
In this section, we will look at the various features of SQLScript.
[»] Note
In traditional client-server approaches, business logic is executed in the application server. With SAP HANA, much of this logic and the executions are pushed down into the SAP HANA database. This approach is different than the standard SQL way of working. SQLScript caters for these requirements.
Separate Statements
One of the unique approaches of SQLScript is the use of variables to break a large, complex SQL statement into smaller, simpler statements. This makes the code much easier to understand and it also helps with SAP HANA’s performance, because many of these smaller statements can be run in parallel.
Let’s look at an example:
books_per_publisher = SELECT publisher, COUNT (*) AS
num_books FROM BOOKS GROUP BY publisher;
publisher_with_most_books = SELECT * FROM
:books_per_publisher WHERE num_books >=
(SELECT MAX (num_books) FROM :books_per_publisher);
We would normally write this as a single SQL statement using a temporary table, or by repeating a subquery multiple times. In our example, we’ve broken this into two smaller SQL statements by using table variables.
The first statement calculates the number of books each publisher has and stores the
entire result set into the table variable called books_per_publisher
. This variable containing the entire result set is used twice in the second statement.
[+] Tip
Notice that the table variable is prepended in SQLScript with a colon (:
) to indicate that this is used as an input variable. All output variables use only
the name, and all input variables have a colon prepended.
The second statement uses :books_per_publisher
as input and uses a nested SELECT
statement.
The SQLScript compiler and optimizer will determine how to best execute these statements, whether by using a common subquery expression with database hints or by combining them into a single complex query. The code becomes easier to write and understand and is more maintainable.
By breaking the SQL statement into smaller statements, we also mirror the way in which we have learned to build graphical information views in SAP HANA. Look at the calculation view in Figure 8.1; if you had to write all of this in SQL, it would be quite difficult to create optimal code.
Figure 8.1 SAP HANA Calculation View in Layers
Just like we start building graphical information views from the bottom up, we do the same with our SQLScript code. We will not write all the SQLScript code here, but just some pseudocode to illustrate the point. We start with a union node in Figure 8.1:
union_output = SELECT * FROM AT_PROD UNION SELECT * FROM PROD;
The next node is a join node. It uses the output from the union node as the source for one side of the join:
join_output = SELECT * FROM :union_output
INNER JOIN SELECT * FROM get_employees_by_name_filter('Jones');
Each time, the output from the lower node becomes the input for the next higher node. As you can see, this is easy to write in SQLScript!
Imperative Logic
Most of the SQL and SQLScript we have worked with to this point has used declarative logic. In other words, we have used the set-oriented paradigm that contributes to SQL’s performance. We have asked the database to hand us back all of the results immediately when the query has completed. Occasionally, you might prefer to receive the answers to a query one record at a time. This is most often the case when running SQL queries from application servers. In these cases, you will use imperative logic, meaning you’ll “loop” through the results one at a time. You can also use conditionals, such as in if-then logic.
Imperative logic is (mostly) not available in ANSI SQL, and therefore SAP HANA provides it as part of SQLScript.
[+] Hint
Remember that imperative logic does not perform as well as declarative logic. If you require the best performance, try to avoid imperative logic.
There are different ways to implement imperative logic in SQLScript:
WHILE
loop
TheWHILE
loop will execute some SQLScript statements as long as the condition of the loop evaluates toTRUE
:WHILE <condition> DO
... some SQLScript statements...
END WHILEFOR
loop
TheFOR
loop will iterate a number of times, beginning with a loop counter starting value and incrementing the counter by one each time, until the counter is greater than the end value:FOR <loop-var> IN [REVERSE] <start_value> .. <end_value> DO
... some SQLScript statements...
END FOR
Dynamic SQL and Security
Another powerful feature in SQLScript is the ability to create and execute dynamic SQL. Let’s illustrate this via an example.
Figure 8.2 shows a web form asking for the user’s mobile telephone number at the top of the
figure. We insert this mobile number into a partially prepared SQL statement, and
then execute this SQL statement using the EXEC
statement.
Figure 8.2 Dynamic SQL with Values from Web Form
Each user gets a unique SQL statement created especially for him or her. The performance is not as good as static SQL statements, but this is more powerful and flexible.
There is one problem: If you’re not careful, hackers can use this setup to break into your system, which has been one of the top security vulnerabilities in web applications for many years now.
Let’s look at the web form again. This time, in Figure 8.3, hackers enter more information than just the mobile telephone number. They add a closing quote and a semicolon. When this is inserted into our prepared SQL statement, it closes the current statement. (Remember that SQL statements are terminated by a semicolon.)
Figure 8.3 Dynamic SQL with SQL Injection from Web Form
Anything after this is executed as another SQL statement—so the hackers can enter
something like DROP DATABASE
that might delete the entire database. Or they can try various SQL statements to
read or modify data values in the database.
SAP HANA SPS 11 includes some new features related to this problem, such as the new IS_SQL_INJECTION_SAFE
function, which helps prevent such attacks.
If you remember to prevent security vulnerabilities like SQL injection, dynamic SQL can enable you to write some powerful applications. You can change data sources at any time, create new tables in real time and link them to applications, and build adaptive user interfaces.
User-Defined (Table and Scalar) Functions
User-defined functions (UDFs) are custom read-only functions in SAP HANA. You write your own functions that can be called in a similar fashion as SQL functions.
There are two types of UDFs: scalar functions and table functions. The main difference is whether you want a function to return values as simple data types (like an integer) or in complete “table-like” result sets. A scalar function can return one or more values, but these values have to be single data types. A table function returns a set of data.
Both types of UDFs are read-only. You may not execute any DDL or DML statements inside these functions.
We create functions as shown in Listing 8.1.
CREATE FUNCTION <func_name> [(<parameter_clause>)]
RETURNS <return_type>
[LANGUAGE SQLSCRIPT]
[SQL SECURITY <DEFINER | INVOKER>]
[DEFAULT SCHEMA <default_schema_name>] AS
{ BEGIN
<function_body>
END }
Listing 8.1 Creating User-Defined Function
Here are a few important points to note with regards to UDFs:
RETURNS
Depending on the type of data we return,RETURNS
will be a scalar or a table function.LANGUAGE
We can only write UDFs in SQLScript.SQL SECURITY
Indicates against which user’s security privileges this function will be executed:- If we choose the
DEFINER
option, this function will be executed by the_SYS_REPO
system user. - If we choose the
INVOKER
option, this function will be executed by the end user calling the function and all security privileges will be limited to that user’s assigned authorizations. - The default for scalar functions is
INVOKER
, and the default for table functions isDEFINER
.
- If we choose the
DEFAULT SCHEMA
When we specify the default schema for this function, we do not have to prepend the schema name before every table name used. We can just writeSELECT FROM table
instead ofSELECT from schema.table
.
In the SAP HANA system, you can create UDFs as discussed in Chapter 5 (see also Figure 5.14).
We create .hdbscalarfunction
objects for scalar functions, and ,hdbtablefunction
objects for table functions. Figure 8.4 shows a newly created table function in SAP HANA studio.
Figure 8.4 Newly Created (Blank) Table Function in SAP HANA Studio
The following two subsections look at examples of both scalar functions and table functions.
Scalar Functions
Scalar UDF supports primitive SQL types as an input. Let’s walk through an example to better understand their usage:
Start by creating a scalar function as in Listing 8.2 that takes the price of a book and a discount percentage as the two inputs and creates two output values. The first output value is the price of the book after the discount is applied, and the second is the discount amount.
FUNCTION apply_discount (book_price decimal(15,2),
book_discount decimal(15,2))
RETURNS discounted_price decimal(15,2)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER AS
BEGIN
discounted_price := :book_price – ( :book_price * :book_discount);
discount = :book_price * :book_discount;
END;
Listing 8.2 Scalar Function that Returns Two Values
You can call the scalar function with the following code:
SELECT apply_discount (59.50, 0.1). discounted_price as
your_book_price, apply_discount (59.50, 0.1).
discount as discount_amount from DUMMY;
In this case, you get two values back: 53.55 and 5.95.
Figure 8.5 shows a similar example of a scalar function from the SHINE demo content, as shown in SAP HANA studio.
Figure 8.5 Example of Scalar Function Found in SHINE Content
Table Functions
Table UDF supports table types as an input. Let’s walk through an example to better under their usage:
Listing 8.3 shows a modified version of a table function found in the SHINE content. This code
performs an inner join between an employee table and an address table to find the
employee’s email address. The neat trick here is in the WHERE
clause: SAP HANA performs a fuzzy text search on the employee’s last name. You will still get an email address, even if you mistyped
the person’s last name.
FUNCTION get_employees_by_name_filter" (lastNameFilter NVARCHAR(40))
RETURNS table (EMPLOYEEID NVARCHAR(10),
"NAME.FIRST" NVARCHAR(40),
"NAME.LAST" NVARCHAR(40),
EMAILADDR NVARCHAR(255))
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER AS
BEGIN
RETURN
SELECT a.EMPLOYEEID, a."NAME.FIRST", a."NAME.LAST", b.EMAILADDR,
FROM Employees AS a INNER JOIN Addresses AS b
ON a.EMPLOYEEID = b. EMPLOYEEID
WHERE contains("NAME.LAST", :lastNameFilter, FUZZY(0.9));
END;
Listing 8.3 Table Function that Performs Fuzzy Text Search on Last Name to Find Email Addresses
Figure 8.6 shows the table function from the SHINE demo content in SAP HANA studio.
Figure 8.6 Example of Table Function Found in SHINE Content
We can use the table function from Figure 8.6 in a graphical calculation view. An example is shown in Figure 8.7, in which this table function is used in a join with the result set from a union node. (This is not really a good example—we should really join on the key field—but is used here purely to illustrate the point.)
Figure 8.7 Using Table Function in Graphical Calculation View
Procedures
A procedure is a subroutine consisting of SQL statements. Procedures can be called or invoked by applications that access the database. Procedures are sometimes also known as stored procedures.
Procedures are more powerful than UDFs. Whereas UDFs are read-only and cannot change any data in the system, procedures can insert, change, and delete data. You can execute any DDL or DML statements inside a procedure. Whereas a table function can only return one result set, procedures can return many result sets.
Listing 8.4 illustrates how to create procedures.
CREATE PROCEDURE <proc_name> [(<parameter_clause>)]
[LANGUAGE <SQLSCRIPT | R>]
[SQL SECURITY <DEFINER | INVOKER>]
[DEFAULT SCHEMA <default_schema_name>]
[READS SQL DATA [WITH RESULTS VIEW <view_name>]] AS
{ BEGIN [SEQUENTIAL EXECUTION]
<procedure_body>
END }
Listing 8.4 Creating Procedures
The following syntax are used in Listing 8.4:
LANGUAGE
We can write procedures in SQLScript or in R. R is a popular open-source programming language and data analysis software environment for statistical computing.SQL SECURITY
Indicates against which user’s security privileges this function will be executed:- If we choose the
DEFINER
option, this function will be executed by the_SYS_REPO
system user. - If we choose the
INVOKER
option, this function will be executed by the end user calling the function, and all security privileges will be limited to that user’s assigned authorizations. - The default for procedures is
DEFINER
(i.e., the_SYS_REPO user
).
- If we choose the
DEFAULT SCHEMA
When we specify the default schema for this function, we do not have to prepend the schema name before every table name you use.READS SQL DATA
Marks the procedure as being read-only. The procedure then cannot change data or database object, and cannot contain DDL or DML statements. If we try to activate a read-only procedure that contains DDL or DML statements, we will get activation errors.WITH RESULTS VIEW
Specifies that a view (that we have to supply) can be used as the output of a read-only procedure. When a result view is defined for a procedure, it can be called by an SQL statement in the same way as a table or view!SEQUENTIAL EXECUTION
Forces sequential execution of the procedure logic; no parallelism is allowed.
Some important points to note with regards to these procedures:
- A procedure does not require any input and output parameters. Input and output parameters must be explicitly typed. The input and output parameters of a procedure can have any of the primitive SQL types or a table type. (Table types will be discussed in the next section.)
- If we require table outputs from a procedure, we have to use table types.
- Any read-only procedure may only call other read-only procedures. The advantage of using a read-only procedure is that certain optimizations are available for read-only procedures, such as not having to check for database locks; this is one of the performance enhancements in SAP HANA.
- Note that
WITH RESULTS VIEW
is a sub-clause of theREADS
SQL clause, so the same restrictions apply to it as those for read-only procedures.
We mentioned that if we require table outputs from a procedure, we have to use table types. In SQLScript, we can also define table types. Table types are similar to database tables, but do not have an instance. They are like a template for future use. They are used to define parameters for a procedure that needs to use tabular results.
Table types are created with the CREATE TYPE
statement. The statement used to be CREATE TABLE TYPE
, but this syntax has been deprecated. The definition is as follows:
CREATE TYPE <type_name> AS TABLE (<column_list_definition>)
The SQL syntax for defining table types is the same as that for defining new tables.
Let’s switch gears. Similar to UDFs, we can create procedures as discussed in Chapter 5, illustrated by Figure 5.14. In this case, we create a .hdbprocedure
object.
Figure 8.8 shows a newly created table function in SAP HANA studio.
Figure 8.8 Example Procedure Found in SHINE Content
Procedures are called with the CALL
statement. You can call the procedure shown in Figure 8.8 with the following statement:
CALL "get_product_sales_price" ('HT-1000', ?);
The results of this call statement, as executed in the SQL Console in SAP HANA studio, are shown in Figure 8.9.
Figure 8.9 Results of Calling a Procedure Directly from SQL Console
You can call procedure with multiple result sets, which reads and writes transactional
data from an SAP HANA XS application using server-side JavaScript. The REST-style web services created will
use HTTP GET
for reading data and HTTP PUT
for writing data.
You will notice that how we use procedures is quite different from how we use UDFs. We can use procedures in a similar way as functions if we specify the read-only and result view clauses.
In Listing 8.5, we create a procedure that uses a precreated result view called View4BookProc
, and a table type called BOOKS_TABLE_TYPE
.
CREATE PROCEDURE ProcWithResultView(IN book_num VARCHAR(13),
OUT output1 BOOKS_TABLE_TYPE)
LANGUAGE SQLSCRIPT
READS SQL DATA WITH RESULT VIEW View4BookProc AS
BEGIN
output1 = SELECT author, title, price FROM BOOKS WHERE
isbn_num = :book_num;
END;
Listing 8.5 Read-Only Procedure Using Result View
We can now call this procedure from a SQL statement, just like a table function:
SELECT * FROM View4BookProc (PLACEHOLDER."$$book_num$$"=>'978-1-4932-1230-9');
Multilevel Aggregation
Sometimes, you may need to return multiple result sets with a single SQL statement. In normal SQL, this might not always be possible—but we can use multilevel aggregation with grouping sets to achieve this in SQLScript.
Let’s start by looking at an example (Listing 8.6) in SQLScript of how we would create two result sets.
-- Create result set 1
books_pub_title_year = SELECT publisher, name, year, SUM(price)
FROM BOOKS WHERE publisher = pubisher_id
GROUP BY publisher, name, year;
-- Create result set 2
books_year = SELECT year, SUM(price) FROM BOOKS
GROUP BY year;
Listing 8.6 Creating Two Result Sets in SQLScript
You can create both of these result sets from a single SQL statement using GROUPING SETS
, as shown in Listing 8.7.
SELECT publisher, name, year, SUM(price)
FROM BOOKS WHERE publisher = pubisher_id
GROUP BY GROUPING SETS ((publisher, name, year), (year))
Listing 8.7 Creating Multiple Result Sets from Single SQL Statement with Grouping Sets
You now have a good understanding of SQLScript and the powerful capabilities it brings to your modeling toolbox. Let’s now look at when you should use this functionality and how to ensure that you use the latest variations of these features.
Views, Functions, and Procedures
Taking a step back, let’s look at how to best address our modeling requirements. In SAP HANA SPS 11 and later we have three main approaches: views, UDFs, and procedures.
Let’s first discuss each of these options, and then we’ll discuss when you should use each one.
Information Views
We have discussed SAP HANA information views in detail in preceding chapters. You have built these views using the graphical environment, and you have now also learned how to also create these view using SQLScript.
Some of the characteristics of views include the following:
- Views are treated like tables. You can use
SELECT
,WHERE
,GROUP BY
, andORDER BY
statements on views. - Views are static. They always return the same columns, and you look at a sets of data in the same way.
- They always return one result set.
- Views are mostly used for reading data.
User-Defined Functions
After learning about UDFs, you should be familiar with their characteristics, such as the following:
- Functions always return results. A table function returns a set, and a scalar function returns one or more values.
- Functions require input parameters.
- The code in UDFs is read-only. Functions cannot change, update, or delete data or
database objects. You cannot insert data with functions, so you cannot use transactions
or
COMMIT
andROLLBACK
in functions. - Functions cannot call
EXEC
statements; otherwise, you could build a dynamic statement to change data and execute it. - You can call the results of functions in
SELECT
statements and inWHERE
andHAVING
clauses. - Functions can be used in a join.
- Table functions cannot call procedures, because procedures can change data. You have to hold to the read-only paradigm of functions.
- Functions use SQLScript code. This can include imperative logic in which you loop through results or apply an if-then logic flow to the results.
Procedures
Procedures are the most powerful of the three approaches. Some of the characteristics of procedures include the following:
- Procedures can use transactions, with
COMMIT
andROLLBACK
. They can modify tables and data, unless they are marked as read-only. Read-only procedures may only call other read-only procedures. - Procedures can call UDFs, but UDFs cannot call procedures.
- Procedures do not have to return any values but may return multiple result sets. Procedures also do not need to specify any input parameters.
- Procedures provide full control, because all code functionality is allowed in procedures.
- Procedures can use
try-catch
blocks for errors. - Procedures can also use dynamic SQL—but be aware of security vulnerabilities such as SQL injection when using dynamic SQL.
When to Use What Approach
After all that detail, the recommendation of when to use each of the three approaches is quite easy to make.
In order of functionality, we have views, then UDFs, and then procedures; that is, procedures have the most functionality.
In order of performance, we have the reverse. Views give the best performance, then UDFs (because they are permanently read-only), and then procedures.
Use the minimal approach to get the results you require.
If you can do it with views, then use views. If you cannot do it with views, then try using user-defined functions. If that still does not meet the requirements, create a procedure.
Catching Up with SAP HANA SPS 11
Figure 8.7 illustrated how to use a table function in a graphical calculation view. This options changes how we approach modeling in SAP HANA projects. The emphasis has moved completely to the graphical modeling environment.
We can work graphically with our SAP HANA views, create UDFs or procedures for the occasional constraint, and bring these back into the graphical SAP HANA modeling environment. We can then continue building the remainder of our SAP HANA information models using the graphical information views.
Many of the SAP HANA modeling constructs and objects in SQLScript have changed to accommodate this shift in emphasis to the graphical environment. In the last few years, almost everything has changed!
Table 8.2 contrasts the old with the new.
Action | The Old Method | The New Method |
Create procedures |
Create run-time objects
|
Create design-time objects
|
Write SQLScript code |
Use SQLScript “CE_” functions
|
Use SQL syntax
|
Use SQLScript |
SQLScript (scripted) calculation views |
Table functions |
Create procedures |
Create a procedure in the Content area or Create a .procedure file |
Create a .hdbprocedure file |
Create table types |
|
|
Table 8.2 Changes in SQL and SQLScript
Using the Latest Modeling Practices
To move from the old SQL and SQLScript modeling approaches to the newer techniques, use the following guidelines:
- Many modelers and developers used the SQL Console to create procedures. This created
runtime objects, which cannot be transported to the production system. The recommended
way is to create design-time objects—for example,
.hdbprocedure
objects—instead of using theCREATE PROCEDURE
statement. - In the early days of SAP HANA, there was a huge emphasis on using certain calculation
engine functions (called CE-functions). The performance of these CE-functions was
much better (back then) than using standard SQL syntax; you would use
CE_UNION_ALL(:A, :B)
instead ofA UNION ALL B
to improve performance. This has now changed, and the new recommendation is to forget about CE-functions and use the SQL syntax. - There were two different ways to create procedures before—namely, in the context menu of a package (where we also create graphical views) and by using a .procedure file. Both of these methods are deprecated. You should now use a .hdbprocedure file to create procedures in SAP HANA.
- The old
CREATE TABLE TYPE
syntax to create a table type is now deprecated and replaced by the newCREATE TYPE
syntax. - When creating calculation views, there is still a dropdown option to create SQLScript calculation views. These are also known as scripted calculation views or script-based calculation views. We recommend not using this type of calculation view; instead, use table functions. In fact, as of SAP HANA SPS 11, the migration tool now migrates script-based calculation views to table functions for you.
We will now see how to use the migration tool to convert script-based calculation views to table functions.
Migrating Scripted Calculation Views
Start the migration tool from the Quick View tab in the Modeler perspective. On the next screen (Figure 8.10), you’ll see the migration tool dialog.
Here, we chose the option to migrate script-based calculation views to table functions. Figure 8.11 shows the next step, in which you select all the calculation views you want to migrate.
Figure 8.11 Selecting Scripted Calculation Views to Convert to Table Functions
Figure 8.12 shows a script-based calculation view before migration.
Figure 8.12 Script-Based Calculation View before Migration
The same script-based calculation view was converted to a table function. In Figure 8.13, you can see that both a table function (top part of the figure) and a graphical calculation view that uses this table function (bottom part of the figure) were created.
Figure 8.13 Scripted Calculation View Migrated to Table Function and Graphical Calculation View
In this case, we did not have to create the table function manually; the migration tool created it for us.