SAP PRESS :: SAP HANA Certification Guide - 0 Architecture and Deployment Scenarios

Key Concepts Refresher

We’ll start with something that most people know very well: Moore’s Law. Even if you have never heard of it, you have seen its effects on the technology around you. Put simply, this means that the overall processing power for computers doubles about every two years. Therefore, you can buy a computer for roughly the same price as about two years ago, and it will be twice as fast as the older one.

Moore’s Law is useful for understanding system performance and response times. If you have a report that is running slow, let’s say an hour long, you can fix this by buying faster hardware. You can easily get the processing time of your slow report from 1 hour to 30 minutes to 15 minutes by just waiting a while and then buying faster computers. However, a few years ago that simple recipe stopped working. Our easy upgrade path was no longer available, and we had to approach the performance issues in new and innovate ways. Data volumes also keep growing, adding to the problem. SAP realized these problems early, and worked on the solution we now know as SAP HANA.

In this chapter, we will discuss SAP HANA’s in-memory technology and how it addresses the aforementioned issues. We’ll then look at SAP HANA’s architecture and new approaches. Finally, we’ll walk through the various deployment options available, including cloud options.

In-Memory Technology

SAP HANA uses in-memory technology. But, just what is in-memory technology?

In-memory technology is an approach to querying data residing in RAM, as opposed to physical disks. The results of doing so lead to shortened response times. This enables enhanced analytics for business intelligence/analytic applications.

In this section, we’ll walk through the process that lead us to use in-memory technology, which will illustrate the importance in new technology.

Multicore CPU

When CPU speed reached about 3 GHz, we seemed to hit a wall. It became difficult to make a single CPU faster while keeping it cool enough. Therefore, chip manufacturers moved from a single CPU per chip to multiple CPU cores per chip.

Instead of increasingly fast machines, machines now stay at roughly the same speed but have more cores. If a specific report only runs on a single core, that report will not run any faster thanks to more CPU cores. We can run multiple reports simultaneously, but all of them will run for an hour. To speed up reports, we need to rewrite software to take full advantage of new multicore CPU technology, a problem that has been faced by most software companies.

SAP addressed this problem in SAP HANA by making sure that all operations are making full use of all available CPU cores.

Slow Disk

When talking about poor system performance, you can place a lot of blame on slow hard disks. Table 3.1 shows how slow a disk can be compared to memory.

Operation	Latency (In Nanoseconds)	Relative to Memory
Read from level 1 (L1) cache	0.5 ns	200 x faster
Read from level 2 (L2) cache	7 ns	14 x faster
Read from level 3 (L3) cache	15 ns	6.7 x faster
Read from memory	100 ns	–
Send 1 KB over 1 Gbps network	10,000 ns	100 x slower
Read from solid state disk (SSD)	150,000 ns	1,500 x slower
Read from disk	10,000,000 ns	100,000 x slower
Send network packet from the US West Coast to Europe and back	150,000,000 ns	1,500,000 x slower

Table 3.1 Latency of Hardware Components and Performance of Components
Shown Relative to RAM Memory

To give you an idea of how slow a disk really is, think about this: If reading data from RAM memory takes one minute, it will take more than two months to read the same data from disk! In reality, there is less of a difference when we start reading and streaming data sequentially from disks, but this example prompts the idea that there must be better ways of writing systems.

The classic problem is that alternatives like RAM memory have been prohibitively expensive, and there has not been enough RAM memory available to replace disks.

Therefore, the way we have designed and built software systems for the last 30 or more years has been heavily influenced by the fact that disks are slow. This affects the way everyone from technical architects to developers to auditors thinks about system and application design. For example:

SAP NetWeaver architecture uses one database server and multiple application servers (Figure 3.1). Because the database stores data on disk and disks are slow, the database becomes the bottleneck in the system. In any system, you always try to minimize the impact of the bottlenecks and the load on it. Therefore, we move the data into multiple application servers to work with it there, using the database server as little as possible. In SAP systems, we use it so little that we can be database independent. SAP systems even have their own independent ABAP data dictionaries.

Figure 3.1 SAP NetWeaver ArchitectureSAP NetWeaverold architecture: One Database Server, Many Application Servers

Figure 3.1 SAP NetWeaver Architecture : One Database Server, Many Application Servers

In the application servers, we run all the code, perform all the calculations, and execute the business rules. We use memory buffers to store a working set of data. In many SAP systems, 99.5% of all database reads come from these memory buffers, thus minimizing the effects of the slow disk. SAP system architecture was heavily influenced by the fact that disk is slow.

Inside financial systems, we store data such as year-to-date data, month-to-date data, balances, and aggregates. For reporting, we have cubes, dimensions, and more aggregates. These values are all updated and stored in financial systems to improve performance.

Take the case of the year-to-date figure. We do not really need to store this information, because we can calculate it from the data. The problem is that to calculate the year-to-date figure, we need to read millions of records from the database, which the database reads from the slow disk. Therefore, we could calculate the current year-to-date figure, but doing so would be too slow. Because disk storage is cheap and plentiful, we just add this value into the database in another table. With every new transaction, we simply add the new value to the current year-to-date value and update the database. Now reports can run a lot faster on disk.

If the disk was fast, we would not need to store these values. Can you imagine a financial system without these values? Ask accountants to imagine a financial system without balances! Why store balances when you can calculate them? When you see the reaction on their faces, you will understand how even their thinking has been influenced by the effects of slow disks.

We will return to this example in the next section.

One of the many reasons we have data warehouses is that we need fast and flexible reporting. We build large cubes with precalculated, stored aggregates that we can use for reporting and to slice and dice the data. Building these cubes takes time and processing power. Reading the masses of data from disk for the reports can be slow. If we keep these cubes inside, for example, our financial system, then a single user running a large report from these cubes can pull the system’s performance down. That one user affects thousands of users.
Therefore, SAP created separate data warehouses to reduce the impact of reporting on operational and transactional systems. Those reporting users can now run their heavy, slow reports away from the operational and transactional systems. The database world now is split into two parts: The online transactional processing (OLTP) databases for the financial, transaction, and operations systems, and the online analytical processing (OLAP) data warehouses (see Figure 3.2).

Figure 3.2 OLTP vs. OLAP : Operational Reporting Seperate from Operational Systems for Performance Reasons

If disks were fast and we could create aggregates fast enough, we would be able to perform most of our operational reporting directly in the operational system, where the users expect and need it. We would not have separate OLTP and OLAP systems. Again, slow disks influenced architectural design principles.

These three examples prove that slow disks have influenced SAP system design, architecture, and thinking for many years now. If we could eliminate the slow disk bottleneck from our systems, we could then change our minds about how to design and implement systems. Hence, SAP HANA represents a paradigm shift.

Loading Data into Memory

In the first of the three examples in the previous section, you saw that we have used memory where possible to speed things up with memory caching. Nevertheless, until a few years ago, memory was still very expensive. Then, memory prices started coming down fast, and we asked, “Can we load all data into memory instead of just storing it on disk?”

Customer databases for business systems can easily be 10 TB (10,000 GB) in size. The largest available hardware when SAP HANA was developed had only 512 GB to 1 TB of memory, presenting a problem. SAP obviously needed to compress data to fit it into the available memory. We’re not talking here about ZIP or RAR compression, which needs to be decompressed again before it is usable, but dictionary compression . Dictionary compression involves mapping distinct values to consecutive numbers.

Column-Oriented Database

For years, it was common to store data in rows. Row-based databases work well with spinning disks to read single records faster. However, when we start storing data in memory, there is no longer a valid reason to use row-based storage. At that point, we have to decide to use row-based or column-based storage.

Why would we use columns instead of rows? If we start looking at compressing data, then row-oriented data does not work particularly well. This is because we mix different types of data—like cities with salaries and employee numbers—in every record. The average compression we get is about two times (that is, data is stored in about 50% of the original space).

When we store the same data in columns, we now group similar data together. By using dictionary compression, we can achieve excellent compression, and thus column-based storage starts making more sense for in-memory databases.

Another way in which we can save space is via indices. If we look inside an SAP system, we find that about half the data volume is taken up by indices. An index is a sorted column; if we store the data in columns in memory, do we really need indices? We can save that space. With such column-oriented database advantages, we end up with 5 to 20 times compression.

Back to our original problem: 10 TB databases, but machines with only 1 TB of memory. By using column-based storage, compression, and no indices, we can fit a 10 TB database into the 1 TB of available memory. Even though we focused on the column-oriented concept here, note that SAP HANA can handle both row- and column-based tables .

Removing Bottlenecks

When we manage to load the database entirely into memory, exciting things start happening. We break free from the slow disk bottleneck that had a stranglehold on system performance for many years. The architecture is designed to work around this problem.

As always, even if our systems become thousands of times faster, at some stage we will reach new constraints. These constraints will determine what the new architecture will look like. The way we provide new business solutions will depend on the characteristics of this new architecture.

Architecture and Approach

The new in-memory computing paradigm has changed the way we approach solutions. Different architectures and deployment options are now possible with SAP HANA.

Changing from a Database to a Platform

The biggest change in approach is how and where we work with the data. Disks were always slow, causing the database to become a bottleneck . Thus, we moved the processing of data to the application servers to minimize the pressure on this disk bottleneck.

Now, data storage has moved from disk to memory, which is much faster. Therefore, we needed to reexamine our assumptions about the database being the bottleneck. In this new in-memory architecture, it makes no sense to move data from the memory of the database server to the memory of the application servers via a relatively slow network. In that case, the network will become the bottleneck, and we will duplicate functionality.

New bottlenecks require new architectures. Our new bottleneck arises because we have many CPU cores that need to be fed as quickly as possible from memory. Therefore, we put the CPUs close to the memory (i.e., the data). In doing so, the first point that we realize is that SAP HANA can be sold as an appliance. As shown in Figure 3.3, it doesn’t make sense to copy data from the memory of the database server to the memory of the application server as we have always done (left arrow). We instead now move the code to execute within the SAP HANA server (right arrow).

Figure 3.3 Moving Data In-Memory

Our solution approach also needed to change. Instead of moving data away from the database to the application servers, as we have done for many years, we now want to keep it in the database. Many database administrators have done this for years via stored procedures . Instead of stored procedures, in SAP HANA we use graphical data models , as we will discuss in Chapter 4. You can also use stored procedures and table functions if you need to do so. We will look at this process in Chapter 8.

[»] Note

Instead of moving data to the code, we now move the code to the data. We can implement this process by way of SAP HANA graphical modeling techniques.

When you start running code inside the database, it is only a matter of time before it changes from a database into a platform. When people tell you that SAP HANA is “just another database,” ask them which other database incorporates more than five programming languages, a web server, an application server, an HTML5 framework, a development environment, replication and cleaning of data, integration with most other databases, predictive analytics, and multitenancy; the reactions can be interesting.

Parallelism in SAP HANA

In the Slow Disk section, we provided three examples of the wide-ranging effects of slow disks on performance. In the second example, we looked at calculating the year-to-date value. We do not really need to store a year-to-date value because we can calculate it from the data. With slow disks, the problem is that to calculate this value we need to read millions of records from the database, which the database reads from the slow disk. Because disk storage is cheap and plentiful, we just added this aggregate into the database in another table.

When we have millions of records in memory and many CPU cores available, we can calculate the year-to-date value faster than an old system can read it from the new table from a slow disk. Look at Figure 3.4: Even though we require a single year-to-date value, we use memory and all the available CPU cores in parallel to calculate it. We just divide the memory containing the millions of records into several ranges and ask each available CPU core to calculate the value for its memory range. Finally, we add all the answers from each of the CPU cores to get the final value. Even a single sum is run in parallel inside SAP HANA.

Figure 3.4 Single Year-to-Date Value Calculated in Parallel Using All Available CPU Cores

Although it’s relatively easy to understand this example, don’t overlook the important consequences of this process: If we can calculate the value quickly, we do not need to store it in another table. By applying this new approach, we start to get rid of many tables in the system. This then leads to simpler systems. We also have to write less code, because we do not need to maintain a particular table, insert or update the data in that table, and so on. Our backups become smaller and faster because we do not store unnecessary data. That is exactly what has happened with SAP S/4HANA : Functional areas that had 37 tables, for example, now have two or three tables remaining.

To get an idea of the compound effect that SAP HANA has on the system, Figure 3.5 shows how a system shrinks when you start applying this new approach.

Figure 3.5 Database Shrinks with SAP HANAIn-memory In-Memory Techniques

Figure 3.5 Database Shrinks with SAP HANA In-Memory Techniques

Say that we start with an original database of 750 GB. Because of column-based storage, compression, and the elimination of indices, the database size easily comes down to 150 GB. When we then apply the techniques we just discussed for calculating aggregates instead of storing them in separate tables, the database goes down to about 50 GB. When we finally apply data aging , the database shrinks to an impressive 10 GB. In addition, we are left with less code, faster reporting response times, and smaller backups.

Delta Buffer and Delta Merge

Before we start sounding like overeager sales people, let’s look at something that column-based databases are not good at.

Row-based databases are good at reading, inserting, and updating single records. Row-based databases are not efficient when we have to add new columns (fields).

When we look at column-based databases , we find that these two sentences are reversed. This is because data stored in columns instead of in rows is effectively “rotated” by 90 degrees. Adding new columns in a column-based database is easy and straightforward, but adding new records can be a problem. In short, column-based databases tend to be very good at reading data, whereas row-based databases are better at writing data.

Fortunately, SAP took a pragmatic rather than a purist approach and implemented both column-based and row-based storage in SAP HANA. This allows users to have the best of both worlds: Column storage for compressed data and fast reading and row storage for fast writing of data. Because reading data happens more often than writing, the column-based store is obviously preferred.

Figure 3.6 shows how this is implemented in SAP HANA table data storage. The main table area is using the column store as we just discussed. The data in here is optimized for fast reading.

Figure 3.6 Column and Row Storage Used to Achieve Fast Reading and Writing of Data

Next to the main table area, we also have an area called the delta buffer , sometimes called the delta store . When SAP HANA needs to insert new records, it inserts them into this area. This is based on the row store and is optimized for fast data writing. Records are appended at the end of the delta buffer. The data in this delta buffer is therefore unsorted.

Some people believe the read queries are slowed down when the system has to read both areas (for old and newly inserted data) to answer a query. Remember that everything in SAP HANA runs in parallel and uses all available CPU cores. For the query shown in Figure 3.6, 100 CPU cores can read the main table area and another 20 CPU cores can read the buffer area. The cores then combine all the data read to produce the result set.

We have looked at cases in which new data is inserted—but what about cases in which data needs to be updated? Column-based databases can also be relatively slow during updates. Therefore, in such a case, SAP HANA marks the old record in the main storage as “updated at date and time” and inserts the new values of the records in the delta buffer. The SAP HANA system effectively changes an update into a flag and insert. This is called the insert-only technique.

When the delta buffer becomes too full or when you trigger it, the delta merge process takes the values of the delta buffer and merges them into the main table area. SAP HANA then starts with a clean delta buffer.

Column-based databases also do not handle SELECT * SQL statements well. We will look at this in more detail in Chapter 10. For now, we encourage you to avoid such statements in column-based databases.

In the next section, we will look at how the SAP HANA in-memory database persists its data, even when we switch off the machine.

Persistence Layer

The most frequently asked question people hear about SAP HANA and its in-memory concept is as follows: “Memory is volatile. That is, it loses its contents when you switch off its power source. So, what happens to SAP HANA’s data when you lose power or switch off the machine?”

The short answer is that SAP HANA still uses disks. However, whereas other databases are disk-based and optionally load data into memory, SAP HANA is in-memory and merely uses disks to back up the data for when the power is switched off. The focus is totally different.

The disk storage and the storing of data happens in the persistence layer , and storing data on disk is called persistence .

Let’s look at how SAP HANA uses disks and the persistence layer in the following subsections.

SAP HANA as a Distributed Database

Just like many newer databases, SAP HANA can be deployed as a distributed database . One of the advantages that SAP HANA has due to its in-memory architecture is that it is an ACID-compliant database . ACID (atomicity, consistency, isolation, and durability) ensures that database transactions are reliably processed. Most other distributed databases are disk-based, with the associated performance limitations, and most of them—most notably NoSQL databases—are not ACID-compliant. NoSQL databases use eventual consistency, meaning that they cannot guarantee consistent answers to database queries. They might come back with different answers to the same query on the same data. SAP HANA will always give consistent answers to database queries. This detail is vital for financial and business systems.

Figure 3.7 shows an instance in which SAP HANA is running on three servers. A fourth server can be added to facilitate high availability . Should any one of the three active servers fail, the fourth standby server will instantly take over the functions of the failed server.

Figure 3.7 SAP HANA as a Distributed Database

SAP HANA can partition tables across active servers. This means that we can take a table and split it to run on all three servers. This way, we don’t only use multiple CPU cores of one server with SAP HANA to execute queries in parallel, but also use multiple servers. We can partition data in various ways—for example, people whose surnames start with the letters A–M can be in one partition N–Z in another.

In the case of SAP HANA as a distributed database, we use shared disk storage to facilitate high availability . This configuration of SAP HANA is called the scale-out solution.

Data Volume and Data Backups

Figure 3.8 shows the various ways that SAP HANA uses disks. At the top left, you can see that the SAP HANA database is in memory. If it is not in memory, it is not a database. The primary storage for transactions is the memory. The disk is used for secondary storage —to help when starting up after a power down or hardware failure, with backups, and with disaster recovery.

Figure 3.8 Persistence of DataDatapersistence with SAP HANA

Figure 3.8 Persistence of Data with SAP HANA

Data is stored as blocks (also sometimes called pages ) in memory. Every 10 minutes, a process called a savepoint runs that synchronizes these memory blocks to disks in the data volume . Only the blocks that contain data that have changed in the last 10 minutes are written to the data volume. This process is controlled by the page manager . The savepoint process runs asynchronously in the background.

Data backups are made from the disks in the data volume. When you request a backup, SAP HANA first runs a savepoint and ensures that the data in the data volume is consistent, then starts the backup process.

[»] Note

The backup reads the data from the disks in the data volume. This does not affect the database operations, because the SAP HANA database continues working from memory. Therefore, you do not have to wait for an after-hours time window to start a backup.

SAP HANA can unload (drop) data blocks from memory when it needs more memory for certain operations. This unloading of data from memory is not a problem, because SAP HANA knows that all the data is also safely stored in the data volume . It will unload the least-used portions of the data. It does not unload the least-used table, but only the columns or even portions of a column that were not used recently. In a scale-out solution, it can unload parts of a table partition.

Log Volume and Log Backups

In the middle section of Figure 3.8, you can see the transaction manager writing to the log volume . The page manager and transaction manager are subprocesses of the SAP HANA index server process. This is the main SAP HANA process. Should this process stop, then SAP HANA stops.

When you edit data in SAP HANA, the transaction manager manages these changes. All changes are stored in a log buffer in memory. When you COMMIT the transaction, the transaction manager does not commit the transaction to memory unless it is also committed to the disks in the log volume, so this is a synchronous operation. Because the system has to wait for a reply from a disk in this case, fast SSD disks are used in the log volume. This synchronous writing operation also is triggered if the log buffer becomes full.

The log files in the log volume are backed up regularly, using scheduled log backups .

[»] Note

SAP HANA does not lose transactions, even if a power failure should occur, because it does not commit a transaction to the in-memory database unless it is also written to the disks in the log volume.

Startup

The data in the data volume is not a “database.” Instead, think of it as a memory dump —an organized and controlled one. If it was a database on disk, it would have to convert the disk-based database format to an in-memory format each time it started up. Now, it can just reload the memory dump from disk back into memory as quickly as possible.

When you start up an SAP HANA system, it performs what is called a lazy load . It first loads the system tables into memory and then the row store. Finally, it loads any column tables that you have specified. Because the data volumes are only updated by the savepoint every 10 minutes, some transactions might not be in the data volume; SAP HANA loads those transactions from the log volume. Any transaction that was committed stays committed, and any transaction that was not committed is rolled back. This ensures a consistent database.

SAP HANA then reports that it is ready for work. This ensures that SAP HANA systems can start up quickly. Any other data that is required for queries gets loaded in the background. Again, SAP HANA does not have to load an entire table into memory; it can load only certain columns or part of a column into memory. In a scale-out system, it will load only the required partitions.

Disks are therefore useful for getting the data back into memory after system maintenance or (potential) hardware failures.

Disaster Recovery

At the bottom right of Figure 3.8, you can see that the data volume and logs also are used to facilitate disaster recovery. In disaster recovery, you make a copy of the SAP HANA system in another data center. Should your data center be flooded or damaged in an earthquake, your business can continue working from the disaster recovery copy.

The initial large copy of the system can be performed by replicating the data volume . Once the initial copy is made, only the changes are replicated as they are written to the log volume. (In the older versions of SAP HANA, the changes are continuously replicated in both the data and the log volumes.)

This replication can be performed by the SAP HANA system itself, or by the storage vendor of the disks in the data and log volumes. The replication can happen either synchronously or asynchronously. You can also choose whether to use dedicated hardware for the disaster recovery copy, or shared hardware for development and quality assurance purposes.

Disaster recovery is covered in more detail in the HA200 course, and in the SAP HANA administration guide on http://help.sap.com/hana_platform.

Data Aging

The final piece of the persistence layer is found at the top right of Figure 3.8. This is where disks are used for data aging.

SAP HANA has a feature called dynamic data tiering. You can specify data aging rules inside SAP HANA to take, for example, all data older than five year and write it to this area. All queries to the data still work perfectly, but are just slower as they are read from disk instead of memory.

The data in the in-memory portion of SAP HANA is often referred to as the hot data , while the data stored in the dynamic data tiering portion is referred to as warm data . We can later expand this to include cold data , by using SAP HANA smart data access (SDA) with Hadoop. We discuss this in Chapter 14.

Deployment Scenarios

People sometimes get confused about SAP HANA when it is only defined with a single concept, such as SAP HANA is a database. After a while, they hear seemingly conflicting information that does not fit into that simple description. In this section, we’ll look at the various ways that SAP HANA can be used and deployed: as a database, as an accelerator, as a platform, or in the cloud.

When looking at SAP HANA’s in-memory technology, and especially the fact that we should ideally move code away from the application servers to the database server, it seems that SAP systems need to be rewritten completely. Because SAP HANA does not need to store values like year-to-date, aggregates, or balances, we should ideally remove them from the database. By doing so, we can get rid of a significant percentage of the tables in, for example, a financial system. However, that implies that we should also trim the code that calculated those aggregates and inserted them into the tables we removed. In this way, we can also remove a significant percentage of our application code.

Next, the slow parts of the remaining code that manipulates data should be moved into graphical calculation views in SAP HANA, removing even more code from our application. If we follow this logic, we obviously have to completely rewrite our systems. This is not easy for SAP, which has millions of lines of code in every system and an established client base. This approach can be very disruptive. Disruptive innovation is great when you are disrupting other companies’ markets, but not so much when you disrupt your own markets. Ask an existing SAP customer if he wants this new technology, and he probably will say, “Yes, but not if it means a reimplementation that will require lots of time, money, people, and effort.” Completely rewriting the system would lead to customers replacing their systems with the new solution; that also means that customers with the older systems would not benefit at all from this innovation.

If you look at the new SAP S/4HANA systems, you will see that they follow the more disruptive approach we just described. However, there are other ways in which we can implement this disruptive new technology in a nondisruptive way—and this is how SAP approached the process when implementing SAP HANA a few years ago. The following sections walk through the deployment scenarios for implementation.

SAP HANA as a Sidecar Solution

The first SAP HANA deployment scenario is a standalone, native, or “sidecar” solution. Think of a motorcycle with a passenger on the side in their own little car seat; that seat is the sidecar. Figure 3.9 illustrates SAP HANA as a sidecar solution.

Figure 3.9 SAP HANA Deployed as Nondisruptive Sidecar Solution

In this deployment scenario, the source systems stay as they are. No changes are required. We add SAP HANA on the side to accelerate any transactions or reports that have performance issues in the source systems and duplicate the relevant data from the source system to SAP HANA. This duplication is normally performed by the SAP LT Replication Server (SLT). (We’ll discuss SLT in more detail in Chapter 14.) The copied data in SAP HANA is then used for analysis or reporting.

The advantages of this deployment option include the following:

Because we just copy data, we do not change the source system at all. Therefore, everyone can use this innovation, but without the disruption (even those who have very old SAP systems, such as SAP R/3 4.6C).

We can use this option for all types of systems, not just the latest SAP systems. In fact, a large percentage of the systems that use SAP HANA are non-SAP systems.

We don’t have to copy the data for an entire system. If a report already runs in three seconds, that’s good enough; it doesn’t have to run in three milliseconds. If a report runs for two hours, but runs only once a year and is scheduled to run overnight, that’s fine too. We only take the data into SAP HANA that we need to accelerate.

We can replicate data from a source system in real time. The reports running on SAP HANA are always fully up-to-date.

Even though we make a copy of data, remember that SAP HANA compresses this data and that we do not copy all the data. Therefore, the SAP HANA system is always much smaller than the source system(s) in this deployment scenario.

However, the SAP HANA system in this scenario is blank . You will have to load the data into SAP HANA, build your own information models, link them to your reports and programs, and look after the security. Much of what we will discuss in this book is focused on this type of work.

SAP HANA as a Database

When SAP released the sidecar solution, many people thought that it would replace their data warehouses. However, this worry was alleviated when SAP announced that SAP HANA would be used as the database under SAP Business Warehouse (SAP BW ). Figure 3.10 illustrates a typical (older) SAP landscape.

Figure 3.10 Typical Current SAP Landscape Example

SAP HANA can now be used to replace the databases under SAP BW, SAP CRM, and SAP ERP (which are part of the SAP Business Suite). There are well-established database migration procedures to replace a current database with SAP HANA. In each case, SAP HANA replaces the database only. The SAP systems still use the SAP application servers as they did previously, and end users still access the SAP systems as they always did, via the SAP application servers. All reports, security, and transports are performed as before. Only the system and database administrators have to learn SAP HANA and log into the SAP HANA systems.

Figure 3.11 looks very similar to Figure 3.10, except that the various databases were replaced by SAP HANA.

Figure 3.11 SAP HANA as DatabaseDatabasedeploymentSAP HANAas a databaseDeploymentdatabase Can Change Current Landscape

Figure 3.11 SAP HANA as Database Can Change Current Landscape

By only replacing the database, SAP again managed to provide most of the benefit of this new in-memory technology to businesses, but without the cost, time, and effort of disruptive system replacements. The applications’ code was not forked to produce different code for SAP BW on SAP HANA versus SAP BW on older databases.

SAP HANA as an Accelerator

Using SAP HANA in the sidecar deployment can be a lot of work because it is a “blank” system. SAP quickly realized that special use cases occurred at a majority of their customers where some common processes were slow; one of these processes is profitability analysis . Profitability is something that any company needs to focus on, but profitably analysis reports are slow running due to the large amounts of data they have to process. By loading profitability data into SAP HANA, the response times of these reports go from hours to mere seconds (see Figure 3.12).

Figure 3.12 SAP HANA Used as an Accelerator for a Current SAP Solution

For these accelerators, SAP provides prebuilt models in SAP HANA. SAP also slightly modified the ABAP code for SAP CO-PA to read the data from SAP HANA instead of the original database. SLT replicates just the required profitability data across from the SAP ERP system to the SAP HANA system.

The installation and configuration process is performed at a customer site over a few days, instead of weeks and months. The only thing that changes for the end users is that they have to get used to the increased processing speed, and no one has a problem with that!

SAP HANA as a Platform

The next way we can deploy SAP HANA is by using it as a full development platform. SAP HANA is more than just an in-memory database.

Which other database has several different programming languages included and has version control, code repositories, and a full application server built in?

SAP HANA has all of these. This makes SAP HANA not just a database, but a platform. You can program in SQL , SQLScript (an SAP HANA-only enhancement that we’ll look at in Chapter 8), R, and JavaScript.

However, the main feature we use inside SAP HANA when deploying SAP HANA as a platform is the application server, called the SAP HANA extended application services (SAP HANA XS) engine, which is used via web services (see Figure 3.13).

Figure 3.13 SAP HANASAP HANAas a development platformDeploymentdevelopment platform as a DevelopmentDevelopment platform deploymentSAP HANA XS Platform

Figure 3.13 SAP HANA as a Development Platform

The SAP HANA XS engine is a full web and application server built into SAP HANA and is capable of serving and consuming web pages and web services.

When we deploy SAP HANA as a platform, we only need an HTML5 browser and the SAP HANA XS engine. In this case, we still develop data models as we discussed previously with the sidecar and accelerator deployments, but we also develop a full application in SAP HANA, using JavaScript and an HTML framework called SAPUI5 . All this functionality is built-in to SAP HANA.

The development paradigm that most frameworks use is called Model-View-Controller (MVC). Model in this case refers to the information models that we will build in SAP HANA and is the focus of this book. We expose these information models in the controller by exposing them as OData or REST web services. This functionality is available in SAP HANA and is easy to use. You can enhance these web services using server-side JavaScript . This means that SAP HANA runs JavaScript as a programming language inside the server.

The last piece is the view, which is created using the SAPUI5 framework. The screens are the same as the SAP Fiori screens seen in the new SAP S/4HANA systems, which are HTML5 screens. What is popularly referred to as HTML5 actually consists of HTML5 , Cascading Style Sheets (CSS3 ), and client-side JavaScript . In this case, the JavaScript runs in the browser of the end-user.

By using JavaScript for both the view and the controller, SAP HANA XS developers only have to learn one programming language to create powerful web applications and solutions that leverage the power and performance of SAP HANA.

We can even create native mobile applications for Apple iOS, Android, and other mobile platforms. Some free applications such as Cordova take the SAP HANA XS applications and compile them into native mobile applications, which can then be distributed via mobile application stores. As a result, you can use SAP HANA’s power directly from your mobile phone.

Multitenant Database Containers

Multitenant Database Containers (MDC) is a new deployment option that became available as of SAP HANA SPS 09. We can now have multiple isolated databases inside a single SAP HANA system, as shown in Figure 3.14. Each of these isolated databases is referred to as a tenant.

Figure 3.14 Multitenancy with SAP HANA Multitenant Database Containers

Think of tenants occupying an apartment building: Each tenant has his or her own front door key and own furniture inside his or her apartment. Other tenants cannot get into another tenant’s apartment without permission. However, the maintenance of the entire apartment building is performed by a single team. If the building needs repairs to the water pipes, all tenants will be without water for a few minutes.

In the same way, tenants in MDC have their own unique business systems, data, and business users. In Figure 3.14, you can see that each of the three tenants has his own unique application installed and connected to their own tenant database inside SAP HANA.

Each tenant can make its own backups if the user wishes. Tenants can also collaborate, so one tenant can use the SAP HANA models of another tenant if they has been given the right security access.

However, all the tenants share the same technical server infrastructure and maintenance. The system administration team can make a backup for all the tenants. (Any tenant can be restored individually, though.) SAP HANA upgrades and maintenance schedules happen at the same time for all the tenants.

Cloud Deployments

We've now discussed various ways to deploy SAP HANA. Most of the time, we think of these deployments as running on-premise on a local server, but all of these scenarios can also be deployed in the cloud. In cloud deployments, virtualizations manage applications like SAP HANA . VMware vSphere virtual machine deployments of SAP HANA are fully supported, both for development and production instances.

With cloud deployments, you’ll be introduced to various new terms:

Infrastructure as a Service (IaaS)
An infrastructure includes things like network, memory, CPU, and storage. IaaS provides infrastructure in the cloud to support your deployments.

Platform as a Service (PaaS)
PaaS uses solutions such as SAP HANA Cloud Platform (HCP). In this scenario, you use not only the same infrastructure as IaaS, but also a platform—for example, a Java application service or the SAP HANA platform.

Software as a Service (SaaS)
A cloud offering that provides certain software, such as SAP SuccessFactors or SAP Hybris.

Managed Cloud as a Service (MCaaS)
In MCaaS, SAP manages the entire environment for you, including backups, disaster recovery, high availability, patches, and upgrades. An example of this is the SAP HANA Enterprise Cloud (SAP HEC).

Public, Private, and Hybrid Cloud

Your cloud scenario could be a public, private, or hybrid cloud scenario. Let’s look at what each of those terms means and when each scenario would apply.

The easiest and cheapest way for you to get an instance of SAP HANA for use with this book is probably via Amazon Web Services (AWS). You can simply go to http://aws.amazon.com and look for an SAP HANA development environment. By using your Amazon account, you can have a 28 GB or larger SAP HANA system up and running in minutes. This is available to everyone—the public—so therefore this is a public cloud. In this example, AWS provides IaaS.

The reason this is called a public cloud is that the infrastructure is shared by members of the public. You do not get your own dedicated machine at Amazon or any other public cloud provider.

If you host SAP HANA in your own data center or in a hosted environment specifically for you only, then it is a private or hybrid cloud . A private cloud offers the same features that you would find in a public cloud, but is dictated by different forms of access. With a private cloud , the servers in the cloud provider’s data center are dedicated for your company only. No other companies’ programs or data are allowed on these servers.

A hybrid cloud involves either combining a public and private cloud offering, or running a private and public cloud together. With a hybrid cloud , you connect your private cloud’s server back to your own data center via a secure network connection. In that case, the servers in the cloud act as if they are part of your data center, even though they are actually in the cloud provider’s data center. An example of a hybrid cloud would be if we used Amazon EC2 and operated another private cloud together.

SAP HANA works great as a cloud enabler because it accelerates applications, providing users with fast response times—which we all expect from such environments.

SAP HANA Enterprise Cloud

SAP HEC offers SAP applications as cloud offerings. You can choose to run SAP HANA, SAP BW, SAP ERP, SAP CRM, SAP Business Suite, and many others as a full service offering in the cloud. All the systems that you would normally run on-premise are available in SAP HEC.

SAP provides the hardware, infrastructure, and all system administration services to manage the solutions for the customer. The customer only has to provide the license keys for these systems. This is referred to as bring your own license .

SAP HEC is seen as an MCaaS solution and a private cloud , but it can be deployed in a hybrid cloud offering.

SAP HANA Cloud Platform

SAP HANA Cloud Platform (SAP HCP ) is a PaaS that allows you to build, extend, and run applications in the cloud. This includes SAPUI5, HTML5, SAP HANA XS, Java applications, and more.

The SAP HCP Cockpit is at the core of this technology where you can manage and create all of your applications.

[»] Additional Resources

For more information on cloud deployments, including details on SAP HEC and SAP HCP, check out the book Operating SAP in the Cloud: Landscapes and Infrastructures (SAP PRESS, 2016) at https://www.sap-press.com/3841.