8    Integrating External Data

Because companies might use non-SAP databases to store their data, integrating SAP and non-SAP data is critical for enterprise reports. In this chapter, we discuss SDA and SDI in SAP BW/4HANA, which make accessing external data more efficient and cost-effective.

Companies often replicate or load data from other systems into an SAP BW/4HANA system and then perform analysis and processing in SAP BW/4HANA. However, not only does data loading or replication consume a lot of time and system resources (i.e., CPU and memory), but also, it can be difficult to manage delta loading from source systems to SAP BW/4HANA.

In this chapter, we’ll look at how SAP HANA smart data access (SDA) and SAP HANA smart data integration (SDI) can help you load your data into the SAP BW/4HANA system. We’ll then discuss the options for integrating data with the SAP BW/4HANA system, including combining data under a BW schema or via Open ODS views.

8.1    SAP HANA Smart Data Access

SDA enables remote data to be accessed via SAP HANA SQL queries as if they’re local tables in SAP HANA, without copying the data into the SAP HANA database. With SDA, data can be combined from heterogeneous remote source systems—such Microsoft SQL Server, Oracle, DB2, Microsoft Excel, Teradata, Sybase IQ, and Hadoop—into EDW landscapes. SDA makes it possible to access remote data without having to replicate the data into the SAP HANA database. The following are some use cases for which to consider using SDA:

SDA not only provides operational and cost benefits, but also supports the development and deployment of the next generation of analytical applications, which require the ability to access and integrate data from multiple systems in real time regardless of where the data is located or which system is generating it.

SDA is based on local virtual tables that map to existing remote tables in the remote database. Data required from remote sources will be visible in virtual tables in SAP HANA. SDA will enable remote access (smart access) to data regardless of where the data is located.

SAP HANA developers can then create SAP HANA views on top of the virtual tables or combine data from virtual tables and internal SAP HANA tables in SAP HANA Studio. The SAP HANA query processor optimizes these queries and executes the relevant parts of the queries in the remote database, returns the results of the query to SAP HANA, and completes the query.

Not only can an SAP HANA developer select data in a virtual table and pull data from a remote data source into an SAP HANA table, but SDA even supports INSERT/UPDATE/DELETE statements executed against the virtual table. The data in a virtual table can be modified (i.e., via insert, update, and delete operations) and then written back to the remote table in a remote data source.

Before you create remote connections to external databases, it’s a good idea to examine if your databases are supported by SDA or SDI in SAP BW/4HANA, as we’ll discuss in the following section. We’ll then briefly look at the steps to configure SDA and how to create remote sources.

8.1.1    Supported Remote Sources

As of SAP HANA SPS 11, the following remote sources are supported by SDA (refer to SAP Note 1868209 for more information):

8.1.2    Configure SAP HANA Smart Data Access

The steps to configure are as follows:

  1. Download and install the Unix ODBC driver manager (2.3.0 or above) from http://www.unixodbc.org/ and use it to test the connection to external databases in the Linux box on the SAP HANA server. Because you need administrator rights, you may need to get help from your SAP Basis team to install the driver manager. For more information, see the SAP HANA Administration Guide at https://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf.
  2. Download and install the Unix ODBC drivers for external databases from each database provider’s website. Because installing Unix ODBC drivers requires administrator rights, you may need to get help from your SAP Basis team. For more information, see the SAP HANA Administration Guide.
  3. To start the process to create the odbc.ini file (which SAP HANA requires to communicate with remote databases) in $HOME directory in the SAP HANA Linux server, log in with the admin user and create odbc.ini file.
  4. Create a DSN for external databases. You will perform the following:
  5. Restart SAP HANA.
  6. Test the ODBC connection in the Linux box. Before you create remote sources for SAP HANA Studio, you should test the ODBC connection in the Linux box to ensure the connection has been set up correctly. In the Linux server, you can use the following command to test the connection:
    isql -v <DSN> <user name> <password>
  7. Provide the following parameters to connect to the database:
    • DSN: Enter the data source name defined in odbc.ini file.
    • Enter the user name and password.
  8. If the Unix ODBC connection has been set up correctly, the result should look something like Figure 8.2.
    Testing ODBC Connection

    Figure 8.2    Testing ODBC Connection

8.1.3    Create Remote Source via Smart Data Access Adapter

The following privileges are required in SAP HANA to manage the agent and adapters and to create remote sources and virtual tables in SAP HANA Studio:

Once the SAP HANA administrator has granted those roles to you, you can create remote sources in SAP HANA Studio to connect the external databases defined in odbc.ini in the Linux box, as in Section 8.1.2.

Log on to SAP HANA Studio, as shown in Figure 8.3, and go to Provisioning • Remote Sources. Right-click Remote Sources and select New Remote Source.

Create New Remote Source

Figure 8.3    Create New Remote Source

A new screen will open, which should look something like Figure 8.4 and should include the following parameters:

The other parameters will change according to the adapter selected, which will be explained in more detail in the next few sections.

Create Remote Source Connection to SQL Server via SDA Adapter

With an SDA adapter, a remote source can be created to connect to the SQL Server 2012 database. The MSSQL (GENERIC ODBC) adapter is an SDA adapter and will be used to connect to the SQL Server database. The properties shown in Figure 8.6 need to be populated as follows:

Note

If the database name isn’t specified in odbc.ini, only the default database will be available under the Provisioning folder in SAP HANA Studio.

You can follow Figure 8.6 to create remote sources using the graphic method, but you can also create remote source connections to remote databases with SQLScript. Listing 8.1 provides a template for SQLScript to create remote sources for SQL server; it’s been tested in SAP HANA SPS 11. You can copy the script to the SQL console, change the parameters accordingly, and execute the script to create the remote source connection to SQL server.

CREATE REMOTE SOURCE <input your remote source name> 
ADAPTER "odbc"
CONFIGURATION FILE 'property_mss.ini'
CONFIGURATION 'DSN=<input DSN>'
WITH CREDENTIAL TYPE 'PASSWORD'
USING 'user=<Input username>; password=<Input the password>';

Listing 8.1    Create Remote Source for MS SQL Server

Create Remote Source Connection to Hadoop

SDA can support access to Hadoop remotely. To connect to a Hadoop system, the Simba ODBC driver needs to be installed in the Linux box on the SAP HANA server; the driver can be downloaded from http://startupfocus.saphana.com/spark/spark-sap-hana-integration/.

Figure 8.7 illustrates the role that ODBC drivers play in the communication between SAP HANA and a Hadoop cluster.

Connect Hadoop via ODBC

Figure 8.7    Connect Hadoop via ODBC

In this case, choose HADOOP (ODBC) for Adapter Name, select ATA Source Name for Connection Mode, specify the Data Source Name as defined in odbc.ini, and provide the User Name and Password to connect the Hadoop system from SAP HANA Studio.

You can also choose the SPARK SQL adapter to create the remote source. See Chapter 9, Section 9.3.2 on NLS configuration for more details.

Figure 8.8 shows the properties to create a Hadoop remote source.

Properties to Create Hadoop Remote Source

Figure 8.8    Properties to Create Hadoop Remote Source

You can also create remote source connection to a remote Hadoop system with SQLScript. Listing 8.2 provides an SAP HANA SQL script template to create a remote source for Hadoop; it’s been tested in SAP HANA SPS 11. You can copy the script to the SAP HANA SQL console, change the parameters accordingly, and execute it to create the remote source connection for Hadoop.

CREATE REMOTE SOURCE <input your remote source name for hadoop> 
ADAPTER "hiveodbc"
CONFIGURATION 'DSN=<specify DSN defined in odbc.ini file>'
WITH CREDENTIAL TYPE 'PASSWORD'
USING 'user=<provide the username>; password=<provide the password>';

Listing 8.2    Create Remote Source for Hadoop