5.3    Introduction to Open SQL Statements

SAP supports different relational database management systems (RDBMSs) like Oracle, Microsoft SQL, and SAP HANA. Structured Query Language (SQL) is a programming language designed for working with data in a relational database management system.

ABAP supports two types of SQL:

SQL statements are categorized into the following languages:

With Open SQL, you can use DML statements. With Native SQL, you can use DDL statements. We generally use ABAP Data Dictionary to perform tasks that require using DDL statements with Native SQL.

In this section, we’ll discuss various key concepts of an RDBMS, such as database design, tables and keys, foreign key relationships, and normalization. We’ll also provide an introduction to Open SQL statements used to fetch data from database tables.

5.3.1    Database Overview

As discussed in Chapter 2, the database is managed by an RDBMS. A relational database consists of multiple tables that store particular sets of data, and an RDBMS standardizes how the data is stored and processed. In a relational database, data is stored in multiple tables, with a relationship between the tables.

The relational database model was first conceived by E. F. Codd in 1969. The model is based on set theory and predicate logic (branches of mathematics). The idea behind the relational model is that the database consists of a series of tables that can be manipulated using a declarative approach (nonprocedural operations).

It’s commonly thought that the word relational in a relational database pertains to the fact that you relate the tables in the database to one another. However, a table is also known as a relation, and the word relational takes its root from this. In fact, Codd and other database theorists use the terms relations, attributes, and tuples to refer to what developers call tables, columns (fields), and rows (or what we’d call files, fields, and records in a physical sense).

In the following subsections, we’ll explain how a database is modeled. We’ll start by discussing the use of database tables and the significance of table keys. We’ll then explore the concept of foreign key relationships to establish relationships between different tables of the database. We’ll also discuss the concept of normalization, which helps to simplify database design and avoid redundancy. We’ll conclude this section with an introduction to SAP HANA and the many benefits this database provides.

Relational Database Design

A database model to reflect a real-world system takes time and effort to conceive, build, and refine. You need to make decisions about the tables you need to create, what fields they’ll contain, and how the tables can be related to each other.

A good database that’s designed according to the relational model provides many advantages. A good design makes the data entries, updates, and deletions efficient, and it also helps make retrieval and summarization of data easy. Because most data is stored in the database and not in the application, the database should be sufficiently self-documenting. A good database design should also consider future changes in real-world requirements and should be flexible to allow for changes to the database schema.

Tables and Keys

A table is the basic entity in a relational model. Each table should represent an entity in the real world, and these entities can be real-world objects or events. For example, a customer is a real-world object, and the order placed by a customer is an event.

A table consists of rows and columns. The relational model dictates that each row in a table should be unique so that it can be uniquely addressed through programming. The uniqueness of a row is guaranteed by defining a primary key for the table. The primary key of a table consists of one or more fields that uniquely identify a row. Each table can have only one primary key. All fields from which a primary key is drawn are called candidate keys. Multiple secondary keys can be drawn from the other fields of the table.

Secondary keys can be unique or non-unique. Secondary keys are generally used to define secondary indexes for the table in order to speed up data retrieval. Keys can be simple or composite. If a key is made up of one field, it’s called a simple key; if the key is made up of two or more fields, it’s called a composite key.

When modeling a table, you should decide which candidate keys form the primary key of the table based on business requirements; there are no hard and fast rules to define the primary key of a table. In his book titled SQL and Relational Basics, Fabian Pascal notes that such a decision should be based upon the principles of minimalism (choose the fewest fields necessary), stability (choose a key that seldom changes), and simplicity/familiarity (choose a key that is both simple and familiar to users).

For example, if you have a table to store employee details—such as employee ID, first name, last name, address, telephone number, and zip code—which of these fields will make for a good primary key? Following Pascal’s guidelines, the address and telephone number can be ruled out because they change frequently. There’s also a chance of the employee’s name changing due to marriage or other reasons; plus, names can be misspelled, and there’s a high likelihood that two employees will have the same first or last name.

Because the employee ID is unique across the organization, it makes the best candidate for the primary key. Even though there are no fixed rules for choosing a primary key, most developers prefer choosing a numeric primary key, because searches on numeric fields are more efficient than those on text fields.

Tip

In many situations, it’s recommended to use static numbers, such as employee ID, customer ID, document number, and so on, as primary keys and to avoid descriptive texts.

Foreign Keys and Domains

Even though primary keys are defined in individual tables, there may be a need to define a relationship with different tables of the database using foreign key relations. A foreign key is a field in a table that is used to reference a primary key in another table.

Let’s look at an example. Table 5.2 shows a customer table in which customer details are stored. The primary key of this table is the Cust ID column, which uniquely identifies each row.

Cust ID Name Address ZIP Phone
1 Joey 171 CE 110099 99889900
2 James 345 DE 118899 88997788
3 John 563 WE 442299 88993354

Table 5.2    Customer Table

Table 5.3 shows an orders table that stores information about customer orders. The primary key of this table is the Order ID column, which uniquely identifies each order.

Order ID Cust ID Item Qty UoM
2230 1 Pen 10 Pieces
2231 2 Pencil 10 Pieces

Table 5.3    Orders Table

In this example, the Cust ID field of Table 5.3 is considered the foreign key of the table, because it can be used to refer to the customer table (see Table 5.2).

It’s important that both the primary keys and foreign keys that are used to form a relationship share the same meaning and domain. A domain defines a possible set of values for a field.

For example, if a valid value for a customer ID can be a number between 1 and 10,000, then the Cust ID fields in both tables (Table 5.2 and Table 5.3) should adhere to this range. By implementing a foreign key check, we can also ensure that the Cust ID field in the orders table (Table 5.3) consists of only the values available in the Cust ID field of the customers table (Table 5.2).

Relationship

Foreign keys model relationships between real-world entities. Such real-world entities can have complex relationships—for example, one entity may have multiple relations with other entities. In a relational database, relationships are defined between a pair of tables.

These tables can be related in one of three ways:

Normalization

Normalization is the process of simplifying the design of a database. Normalization answers core questions regarding the number of tables required, what each table represents, the number of fields in each table, and the relationship between tables. Normalization also helps avoid redundancy and anomalies in database design.

Anomalies can occur while inserting, updating, or deleting records from a database table (if the database is poorly designed). For example, if you have a database table as shown Table 5.4, when a record is updated to change the customer address, all the rows where the customer record exists should be updated. If any rows are missed, then it will lead to data inconsistency. This is called an update anomaly.

Item_ID Customer_ID Customer_name Customer_address
1 921 Ryan 12E West End
2 921 Ryan 12E West End
3 728 John 44 Avenue Road

Table 5.4    Sample to Show Update Anomaly

If you try to insert a record for a customer that isn’t available in the customer master table, it causes an insertion anomaly. If you delete item 3 in Table 5.4, it will also delete the record of the customer John, because his details are linked to this item only. This is called a delete anomaly.

Normalization helps to overcome these anomalies and model a database that is consistent and predictable. Normalizations use the concept of normal forms to assist in designing an optimal structure. There are three available normal forms: first, second, and third.

The following subsections look at each of these normal forms.

First Normal Form

The first normal form (1NF) dictates that the field of a table should contain atomic values. The field should contain only one value. For example, the data in Table 5.5 does not conform to 1NF, because the phone number field contains multiple values for the Emp_id 2290.

Emp_id Emp_name Phone_num
2289 John 99889988
2290 Mark 88998899
99887755

Table 5.5    Employee Not Conforming to 1NF

To make Table 5.5 conform to 1NF, it should be adjusted as shown in Table 5.6 by creating a new record for the second phone number in Emp_id 2290.

Emp_id Emp_name Phone_num
2289 John 99889988
2290 Mark 88998899
2290 Mark 99887755

Table 5.6    Employee Conforming to 1NF

Second Normal Form

For a table to conform to the second normal form (2NF), it should meet the following criteria:

For example, in Table 5.7, let’s assume that Employee_ID and Department_ID are the primary key fields. According to 2NF, the non-primary fields Employee_name and Department_name should be fully dependent on the complete primary key fields Employee_ID and Department_ID, not on the subset of the primary key fields.

That’s not the case here; the Employee_name field can be identified by the Employee_ID field, and the Department_name field can be identified fully by the Department_ID field. Therefore, the non-primary fields of this table are only partially dependent on the primary key fields of the table, which is not allowed in 2NF.

Employee_ID Department_ID Employee_name Department_name
1122 01 John Sales
1123 02 Mark Finance

Table 5.7    Not Conforming to 2NF

To make Table 5.7 conform to 2NF, it should be split into two tables, as shown in Table 5.8 and Table 5.9. This ensures that there is no partial dependency on primary key fields.

Employee_ID Department_ID Employee_name
1122 01 John
1123 02 Mark

Table 5.8    Employee Table

Department_ID Department_name
01 Sales
02 Finance

Table 5.9    Department Table

Third Normal Form

A table conforms to the third normal form (3NF) if it conforms to 2NF and all non-key fields of the table are mutually independent.

For example, Table 5.10 does not conform to third normal form, because the field Total is dependent on the information in the nonkey fields Quantity and Price. To make this table conform to 3NF, it’s best to remove the Total field from the table and handle the calculations in a query or report. This ensures that updating one of the fields of the table does not result in any anomalies.

Item_id Quantity Price Total
1 10 10 100
2 20 15 300

Table 5.10    Conforming to 3NF

Every higher normal form is a superset of all lower normal forms. For example, if you design for 3NF, then by default it conforms to 1NF and 2NF. Normalization helps achieve optimal database design. A good database design takes time and effort and requires a good understanding of business requirements. A good design not only makes your applications efficient, but also prevents headaches down the line.

SAP HANA

There has been a growing need for real-time analytics over the years, which requires processing huge sets of data. Traditional database systems form data processing speed bottlenecks for large data volumes. SAP addressed this problem with its in-memory database system called SAP HANA.

SAP HANA has transformed the relational database industry with its innovations in hardware and software technology to process massive datasets in real time using in-memory computing. SAP HANA combines database, application processing, and integration services on a single platform.

SAP HANA stores compressed data in memory in a columnar format that enables faster scanning and quicker processing. Compared to traditional databases, programs take minutes to process the data from SAP HANA, as opposed to hours when using a traditional database.

The following are some important features to consider with SAP HANA when it comes to ABAP:

5.3.2    Selecting Data from Database Tables

You can use internal tables to process data from a database table, allowing you to store multiple rows of data at once. In this section, we’ll explore a few Open SQL statements that can be used to process data from a database.

The SELECT keyword fetches data from a database table. You can select a single record or multiple records, and you can select the data from the database into a structure or an internal table of your ABAP program. You can either select the complete row (all fields) of a database table or select only specific fields.

UNION Addition

As of SAP NetWeaver 7.5, the UNION addition can be used to create a union between the result sets of two SELECT statements.

When selecting data from a database, the row of the data object should be identical to the row of the database table. To ensure this, refer your data object to the table when selecting the complete row. If you need to select only certain fields from the row, make sure to define the fields in the same order in which you plan to select them. The SELECT statement supports an INTO CORRESPONDING FIELDS addition if the order of fields in your data object doesn’t match the order of the selection.

Listing 5.23 shows the usage of a SELECT SINGLE statement to select a single row from the database table. If more than one row satisfies the WHERE clause, the first matching row is fetched.

TYPES: BEGIN OF ty_vbrk, 
VBELN TYPE VBELN_VF, "Document Number
FKDAT TYPE FKDAT, "Bill Date
NETWR TYPE NETWR, "Net Value
KUNRG TYPE KUNRG, "Payer
END OF ty_vbrk.

DATA : it_vbrk TYPE STANDARD TABLE OF ty_vbrk,
wa_vbrk TYPE ty_vbrk.

*Selecting a single row of specific fields from table vbrk.
SELECT SINGLE vbeln fkdat netwr kunrg FROM vbrk
INTO wa_vbrk
WHERE vbeln EQ '9000'.

Listing 5.23    Using SELECT SINGLE

Listing 5.24 uses SELECT…ENDSELECT to fetch all matching rows, one row at a time. The system loops between SELECT…ENDSELECT, transferring one row at a time to the structure wa_vbrk. You can process the contents of wa_vbrk in each loop by writing the statements before ENDSELECT. This is an efficient way to process the records from the database table if you plan to use a database table only once during program execution.

However, if you plan to reuse the data multiple times during the life of the program execution, we recommend using internal tables, which allows you to fetch the data once into your program and access it subsequently multiple times, as opposed to accessing the database every time you need the same data.

*Selecting all matching rows of specific fields from the table 
*vbrk using a structure.
*Here the system fetches one row at a time and loops between
*select…endselect. The row can be processed between SELECT…ENDSELECT.
SELECT vbeln fkdat netwr kunrg FROM vbrk
INTO wa_vbrk
WHERE vbeln EQ '9000'.

ENDSELECT.

Listing 5.24    Using SELECT…ENDSELECT

Listing 5.25 shows using an internal table in a SELECT statement, also known as performing an array fetch. Once the SELECT statement is executed, you can access the data from the internal table as discussed previously.

*Selecting all matching rows of specific fields from the table 
*vbrk using an internal table.
SELECT vbeln fkdat netwr kunrg FROM vbrk
INTO TABLE it_vbrk
WHERE vbeln EQ '9000'.

Listing 5.25    Selecting Data to Internal Table

Listing 5.26 shows selecting the fields of the complete table row. Notice the difference in data object declaration and the usage of * as the SELECT statement.

DATA : it_vbrk TYPE STANDARD TABLE OF vbrk,
wa_vbrk TYPE vbrk.

*Selecting a single row of all fields from table vbrk.
SELECT SINGLE * FROM vbrk
INTO wa_vbrk
WHERE vbeln EQ '9000'.
*Selecting all matching rows of all fields from table vbrk
*using a structure.
SELECT * FROM vbrk
INTO wa_vbrk
WHERE vbeln EQ '9000'.
ENDSELECT.
*Selecting all matching rows of all fields from table vbrk
*using an internal table.
SELECT * FROM vbrk
INTO TABLE it_vbrk
WHERE vbeln EQ '9000'.

Listing 5.26    Selecting All Fields of a Row

5.3.3    Selecting Data from Multiple Tables

The syntax we’ve discussed so far allows you to select data from a single table. However, in real-world applications, data is spread across multiple tables with foreign keys.

To select data from tables that are in a foreign key relationship, you can use joins or use the FOR ALL ENTRIES addition with the SELECT statement. The FOR ALL ENTRIES addition allows you to select the data from the database table and compare it to the data in an internal table that was previously selected from another database table. This is efficient if you’re processing data from two tables.

Listing 5.27 shows the syntax to use for all entries. Here, the data is first selected from database table VBRK into an internal table IT_VBRK; then the data in IT_VBRK is compared to select the data from database table VBRP.

When using FOR ALL ENTRIES, make sure the internal table used for comparison isn’t empty. Otherwise, the system will execute the SELECT statement without restriction. Notice the IF condition used in the listing.

SELECT * FROM vbrk
INTO TABLE it_vbrk
WHERE vbeln EQ p_vbeln.
IF sy-subrc IS INITIAL.
SELECT * FROM vbrp
INTO TABLE it_vbrp
FOR ALL ENTRIES IN it_vbrk
WHERE vbeln EQ it_vbrk-vbeln.
ENDIF.

Listing 5.27    Using FOR ALL ENTRIES

Alternately, joins allow you to select data from up to twenty-five database tables. There are two types of joins:

Listing 5.28 highlights the difference between an inner and outer join. If the internal table has three fields—FLD1, FLD2, and FLD3—the result for an inner join and outer join are shown in Listing 5.28.

Table A
Table B
Field 1
Field 2
Field 1
Field 3
X
100
X
ABC
Y
200
X
DEF
Z
300
Z
WXY


Inner Join:
-----------
FLD1 FLD2 FLD3
--------------------
X 100 ABC
X 100 DEF
Z 300 WXY


Outer Join:
-----------
FLD1 FLD2 FLD3
--------------------
X 100 ABC
X 100 DEF
Y 200 "No entry for Key Y in table B so FLD3 is blank
Z 300 GHI

Listing 5.28    Inner Join vs. Outer Join

Listing 5.29 shows an example using an inner join. Here, we join the columns carrname, connid, and fldate of the database tables scarr, spfli, and sflight using two inner joins. This creates a list of flights from p_cityfr to p_cityto. An alias name is assigned to each table.

PARAMETERS: p_cityfr TYPE spfli-cityfrom,
p_cityto TYPE spfli-cityto.

TYPES: BEGIN OF wa,
fldate TYPE sflight-fldate,
carrname TYPE scarr-carrname,
connid TYPE spfli-connid,
END OF wa.

DATA itab TYPE SORTED TABLE OF wa
WITH UNIQUE KEY fldate carrname connid.

SELECT c~carrname p~connid f~fldate
FROM ( ( scarr AS c
INNER JOIN spfli AS p ON p~carrid = c~carrid
AND p~cityfrom = p_cityfr
AND p~cityto = p_cityto )
INNER JOIN sflight AS f ON f~carrid = p~carrid
AND f~connid = p~connid )
INTO CORRESPONDING FIELDS OF TABLE itab.

Listing 5.29    Inner Join Example

Listing 5.30 shows an example using an outer join.

PARAMETERS p_cityfr TYPE spfli-cityfrom.

TYPES: BEGIN OF wa,
carrid TYPE scarr-carrid,
carrname TYPE scarr-carrname,
connid TYPE spfli-connid,
END OF wa.
DATA itab TYPE SORTED TABLE OF wa
WITH NON-UNIQUE KEY carrid.
START-OF-SELECTION.
SELECT s~carrid s~carrname p~connid
FROM scarr AS s
LEFT OUTER JOIN spfli AS p ON s~carrid = p~carrid
AND p~cityfrom = p_cityfr
INTO CORRESPONDING FIELDS OF TABLE itab.

Listing 5.30    Outer Join Example

Using joins may seem a bit overwhelming initially, so we suggest you practice using FOR ALL ENTRIES and use joins only if you need data from many tables.