Key Concepts Refresher

In the preceding chapters, we’ve discussed many performance enhancements that are built into the design of SAP HANA; in this chapter, we’ll walk through various optimization topics in SAP HANA.

Architecture and Performance

The largest and most fundamental change that comes with SAP HANA technology is the move from disks to memory. The in-memory technology that SAP HANA is built on has been one of the biggest contributors of performance improvements. The decision to break with the legacy of disk-based systems and use in-memory systems as the central focus point (and everything that comes with this) has led to significant changes in performance.

The next step was to combine memory and multicore CPUs. If we store everything in memory and have plenty of processing power, we do not have to store calculated results. We just recalculate them when we need them—which opens the door to real-time computing. Real-time computing requires that results be recalculated each time they’re needed, because the data might have changed since the last query.

This combination of memory and CPU processing power led to new innovations. Now, you can combine OLTP and OLAP together in the same system: The OLAP components don’t need the storage of precalculated cubes and can be recalculated without impacting the performance of the OLTP system.

Another innovations prompted by combining memory and CPU is data compression. This both reduces memory requirements and helps to better utilize CPUs by loading more data in the level 1 CPU cache. Loading more data into CPUs once again improved performance.

To improve compression, SAP HANA prefers columnar storage. Along with many advantages, columnar storage also has a few disadvantages: Inserting new records and updating existing records are not handled as efficiently in column tables as in row tables. To solve this, SAP HANA implemented the insert-only principle with delta buffers. We use row storage for delta buffers to improve insert speeds and then use delta merges to bring the data back to columnar storage, which leads to better read performance.

[+] Tip

The performance improvements in SAP HANA do not mean that we can be negligent in modeling. For example, you should not use SELECT * statements or choose any unnecessary fields in queries, because doing so decreases the efficiency of the column storage.

Using multicore processors, everything in SAP HANA was designed from the ground up to run in parallel. Even a single aggregation value is calculated using parallelism. This led to SAP HANA growing from a single hardware appliance to a distributed scale-out system that can work across multiple server nodes and finally to multitenant database containers. Tables can now also be partitioned across multiple servers. With more servers, the scale and performance of SAP HANA systems has vastly advanced.

Redesigned and Optimized Applications

While combining memory and processors, we also saw that moving data from memory to application servers is not the optimal way to work with an in-memory database. It makes more sense to push down some of the application logic to the SAP HANA system for fast access to the data in memory and plenty of computing resources for calculations.

SAP started by making SAP HANA available as an accelerator and sidecar solution to improve issues with long-running processes in current systems. Over time, SAP business applications were migrated to SAP HANA and re-engineered to make use of the advantages it offers.

With SAP S/4HANA, we now have a system that uses views instead of materialized aggregates in lookup tables, significantly reducing the size of the system, simplifying the code, and enriching the user experience.

Information Modeling Techniques

In previous chapters, we discussed some techniques that can help optimize information models, such as the following:

If you use these techniques, you will avoid many optimization hazards. These techniques should be part of your everyday workflow.

[+] Tip

The best optimization is no optimization. If you avoid the pitfalls in this chapter, you will not need to optimize later.

In spite of your best efforts, sometimes you need to optimize an information model.

Optimization Tools

When optimizing an SAP HANA information model, a number of tools are at your disposal. You need to know when to pick which optimization tool and how to use it.

SAP HANA Engines

SAP HANA has several internal components called engines. Each of these engines is optimized to provide specific functionality for the processing of SAP HANA information models:

These engines work together to process information models. However, transferring information from one engine to another can affect performance.

[+] Tip

Reduce the amount of data transferred between the engines.

A better idea of how this process between engine works can be seen when using a star join view with calculated columns. In this case, SAP HANA might use the OLAP engine for processing the star join view and the calculation engine for processing the calculated columns. If you plan to use a calculation view of type cube on top of the star join view later, it might make sense to move the calculated columns to the calculation view of type cube. This is in line with our general recommendation to calculate as late as possible.

[»] Note

As SAP HANA improves and matures, there will be less emphasis on the engines and the transfer between engines. With SAP HANA now providing a migration tool to help move to calculation views, this implies less emphasis on knowing about and compensating for the transfer between engines.

Now that you have a better understanding of SAP HANA’s engines, in the next section we’ll begin discussing the tools available for optimizing performance.

Explain Plan

One of the purposes of the Explain Plan tool is to show which engines are used when processing an information model and if there is a transfer between engines involved. It also shows what operators are processed—for example, a join, a column table, a column search, or a filter. This tool does not provide any information about the actual processing, such as how many records were processed or how long it took.

You can access the Explain Plan tool from the context menu of the SQL Console.

[+] Tip

There is a quick way to reach any SAP HANA view from the SQL Console window: Right-click the name of the view in the Content area and select the Generate Select SQL option from the context menu. This will open a SQL editor with an equivalent SQL query for the selected information view.

Figure 10.1 shows the Purchase Overview calculation view, opened from the SHINE content. From the context menu in the SQL Console, choose the Explain Plan option to launch the tool for this SQL query.

Figure 10.1   Open Explain Plan from Context Menu

Figure 10.1 Open Explain Plan from Context Menu

Figure 10.2 shows the Explain Plan with the SQL query on the top and the execution engines and the names of the operators used when running the SQL query. In the Operator Details column, you can see which tables are used here.

Figure 10.2   Explain Plan

Figure 10.2 Explain Plan

Visualize Plan

From the same context menu in the SQL Console, you can also launch the Visualize Plan tool. The Visualize Plan tool has two options, as shown in Figure 10.3: Prepare and Execute.

Figure 10.3   Visualize Plan Context Menu Options in SQL Console

Figure 10.3 Visualize Plan Context Menu Options in SQL Console

The Visualize Plan tool visually shows how the SQL query will be prepared and executed. Figure 10.4 illustrates the prepared execution plan. The bottom row shows the various column tables that will be read, and the lines going to the next block up (a join node) show how many records will be read from each table. The join node output again shows the number of rows.

Figure 10.4   Prepared Execution PlanExecution plan Visualized

Figure 10.4 Prepared Execution Plan Visualized

This information is helpful when you want to reduce the data transferred between layers as quickly as possible or filter as early as possible, as recommended. You can view the prepared plan and find out how many records will be generated in the result set. Then, you can apply a filter, for example, and rerun the prepared plan to see the difference.

Once you’ve viewed the prepared execution plan, you can execute the SAP HANA information model and call the Visualize Plan tool for the executed plan. You will see the Overview tab (Figure 10.5).

Figure 10.5   Executed Plan Overview Tab

Figure 10.5 Executed Plan Overview Tab

From the Overview tab, you can see how long the query took, how much memory it used, the number of tables that were used, and how many records were returned in the result set.

Next to the Overview tab is the Executed Plan tab, which shows the executed plan visually, with the execution times for each operator block. You can expand an operator block by clicking on the small triangle at the top-right of the block (see Figure 10.6). Here, you can see the execution time for every block and how many records are passed between the operations.

Figure 10.6   Executed Plan Visualization and Timeline View

Figure 10.6 Executed Plan Visualization and Timeline View

Figure 10.6 also shows the timeline view, which presents timelines for various operations. The top and bottom portions of the screen are linked, so if you select a line on the timeline view, the corresponding operations block in the top graph is selected as well.

Next to the Timeline tab, you’ll see the Operator list tab (Figure 10.7), which provides more detail on every operator used when executing the query.

Figure 10.7   Executed Plan Visualization and Operator List

Figure 10.7 Executed Plan Visualization and Operator List

As you can see, the Visualize Plan is a powerful tool that can help you optimize SAP HANA information models.

Administration Console

We briefly mentioned the Administration Console in Chapter 5. There are three tabs that are especially relevant to optimization:

Performance Analysis Mode

The last tool that we will look at in this chapter is Performance Analysis Mode, available from the top-right toolbar when you build SAP HANA information views.

You can switch this mode on or off when designing information views. The tool will analyze your view and suggest possible improvements.

Figure 10.11 shows the dropdown menu for Performance Analysis Mode. The first menu option allows you to switch this mode on or off.

Figure 10.11   Performance Analysis Mode Menu

Figure 10.11 Performance Analysis Mode Menu

Before we discuss the main features of Performance Analysis Mode, let’s briefly detour to look at the last option on the menu shown in Figure 10.11. The Visualize View in Planviz Editor option opens a simplified graphical flow of the information view (Figure 10.12).

When you select a node, you can see the attributes and measures on the left of the graphical flow and the input and output fields on the right.

Figure 10.12   Visualize View Called from Performance Analysis Mode Menu

Figure 10.12 Visualize View Called from Performance Analysis Mode Menu

Go back to the Performance Analysis Mode and you will notice that a new tab, Performance Analysis, has been added to the View Editor in which you design the information view. When you select a node—for example, the join node—in your view, and then then open the Performance Analysis tab (top middle), you will now see improvement recommendations from the SAP HANA system.

Figure 10.13 returns to the Purchase Overview calculation view from the SHINE content.

Figure 10.13   Performance Analysis Mode Optimization Suggestions

Figure 10.13 Performance Analysis Mode Optimization Suggestions

As shown in Figure 10.13, we’ve highlighted three possible improvement suggestions:

Figure 10.14 shows additional information for when you’re joining tables from remote sources using SAP HANA smart data access (SDA). (We will discuss SDA in Chapter 14.)

Figure 10.14   Performance Analysis Mode Suggestions to Optimize SAP HANA Information View Built on Virtual Tables

Figure 10.14 Performance Analysis Mode Suggestions to Optimize SAP HANA Information View Built on Virtual Tables

You can set validation rules related to performance in the SAP HANA studio preferences. From the Window menu, select Preferences · SAP HANA · Modeler · Validation Rules. Figure 10.15 shows validation rules related to performance.

Figure 10.15   Validation Rules Related to Performance in SAP HANA Studio

Figure 10.15 Validation Rules Related to Performance in SAP HANA Studio

Best Practices for Optimization

Performance Analysis Mode suggested some improvements we could make to our information view with regards to specifying the cardinality, using a referential join, and potentially partitioning our tables. In this last section, we will discuss some additional best practices for modeling in SAP HANA. Remember that we mentioned a few earlier in the chapter when we summarized concepts you’ve learned previously in the book.

We can divide these best practices into two areas—namely, those related to the graphical view design and those related to SQL and SQLScript.

Graphic Information Modeling Best Practices

The following are some additional best practices to optimize graphical information models:

[+] Tip

Remember that SAP HANA uses authorizations as a filter. If a user is only allowed to see one cost center, then SAP HANA will use that as a filter to speed up the query. In this case, security is not an overhead as in many other systems but an optimization technique.

SQL and SQLScript Guidelines

We mentioned several best practices for optimizing SQL and SQLScript in Chapter 8, including the following:

[+] Tip

For more optimization and performance tips, watch Werner Steyn’s excellent one-hour session, named DMM208, from SAP TechEd 2015 at http://events.sap.com/teched/en/session/26543.

You can also attend the HA215 training course. You can find more information about this course at https://training.sap.com/shop/course/ha215-sap-hana-monitoring-and-performance-tuning-classroom-010-g-en/.