11.3 Rapid Data Migration
This section introduces the Rapid Data Migration solution, which serves to migrate data to SAP S/4HANA, on-premise. Rapid Data Migration is an SAP Best Practices package that provides migration content for the SAP Data Services tool. The data migration approach with SAP Data Services and Rapid Data Migration focuses on the data quality and data validation for SAP S/4HANA.
11.3.1 Tools
The SAP Data Services tool is a product from the SAP Enterprise Information Management (EIM) portfolio, which provides functions for integrating data (Data Integrator) and ensuring data quality (Data Quality).
SAP Data Services is a proven and tested ETL (Extract, Transform, and Load) tool, which has a graphical user interface (designer) and can be connected to numerous source systems (extract) and target systems (load) through various interfaces. You’ll map (transform) the data on the tool’s drag-and-drop user interface.
SAP Data Services also enables you to continuously improve the quality of your imported data both before and during the data migration. Unlike traditional migration scenarios, you’ll avoid migrating incorrect, redundant, and unnecessary data records.
In addition, Rapid Data Migration uses the SAP BusinessObjects BI platform for advanced, but optional, data migration monitoring. The platform provides predefined reports as SAP Best Practices to support data migration projects for analytics and troubleshooting. The reports are created as SAP BusinessObjects Web Intelligence reports, which allows you to support your migration projects by determining data quality and mapping issues at an early stage.
In addition to SAP Data Services, which is standalone software that connects source and target systems, SAP also delivers further helpful tools in the EIM portfolio that you can use for data migrations. Of particular interest is the SAP Information Steward tool. Both tools provide profiling (to identify similarities and structures in data), deduplication (to find duplicate records), and data lineage (to reconcile data between source and target systems) functions.
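The deduplication idea behind the golden record mentioned in this chapter can be pictured with a small sketch. This is a minimal Python sketch of the concept only, not the SAP Data Services or SAP Information Steward implementation; the field names, the naive duplicate key, and the merge rule (most recently changed record wins, empty fields filled from older duplicates) are assumptions for illustration.

```python
# Illustrative sketch only: merge duplicate customer records into one
# "golden record". Field names and the merge rule (prefer non-empty values,
# break ties by the most recently changed record) are assumptions.
from collections import defaultdict

records = [
    {"name": "ACME Corp",  "city": "Chicago", "phone": "",                "changed": "2021-03-01"},
    {"name": "ACME Corp.", "city": "",        "phone": "+1 312 5550199",  "changed": "2022-07-15"},
]

def duplicate_key(rec):
    # Very naive duplicate detection: normalize the name.
    return rec["name"].lower().rstrip(".").replace(" ", "")

def golden_record(dups):
    # Start from the most recently changed record, then fill empty fields
    # from the older duplicates.
    dups = sorted(dups, key=lambda r: r["changed"], reverse=True)
    merged = dict(dups[0])
    for rec in dups[1:]:
        for field, value in rec.items():
            if not merged.get(field):
                merged[field] = value
    return merged

groups = defaultdict(list)
for rec in records:
    groups[duplicate_key(rec)].append(rec)

for key, dups in groups.items():
    print(key, "->", golden_record(dups))
```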
[»] Rapid Data Migration Not Only for SAP S/4HANA
In addition to the data migration content for SAP S/4HANA, SAP provides several Rapid Data Migration packages for target systems, such as SAP ERP, SAP Customer Relationship Management (CRM), SAP Business Suite on SAP HANA, SAP SuccessFactors (for example, SAP SuccessFactors Employee Central), and SAP Hybris Cloud for Customer (C4C). You can find an overview of these packages at the following link: http://service.sap.com/public/rds-datamigration.
Because you’ll use a standardized ETL tool, you won’t have to use custom one-off programs for migrating to SAP S/4HANA but will use standard interfaces, such as IDocs (Intermediate Documents), BAPIs (Business Application Programming Interfaces), and SAP function modules. Using SAP Data Services for data migrations provides further advantages, for example:
- Direct connection to one or more source systems via database interfaces (both SAP ERP and non-SAP systems)
- Additional integration of data from CSV files, flat files, and Microsoft Excel files
- Standardization of source data from various legacy systems and files into a unified format
- Data record cleansing on the source system
- Deduplication of data and determination of the entire data record that merges and replaces duplicates (referred to as the golden record)
- Start of the mapping and validation process before you’ve completed the Customizing of SAP S/4HANA
- Simple and reusable mapping via drag and drop
- Visualization of the entire data flow from the source system to the target system
- Reusable check routines to minimize custom code
- Validation of external data against SAP S/4HANA check routines without having to load data records into SAP S/4HANA
- Test runs without updating data
- Usage of SAP S/4HANA standard interfaces
11.3.2 Architecture
The Rapid Data Migration solution combines SAP Data Services with migration content specifically developed for data migrations. Technically, the SAP data migration solution consists of three components:
- The actual software
- A database server
- A web server
The database server is used to manage repositories in several separate database instances. For example, one repository contains the entire SAP Data Services content and metadata about interfaces, which describes the interface structure as “data about data.”
The web server permits access to the software through a web browser, for example, via the Central Management Console (CMC) for reporting with SAP BusinessObjects BI platform.
In addition to the software components, the SAP S/4HANA data migration solution package contains the following components:
- Data migration templates (content) including mappings for SAP Data Services (see Section 11.3.3)
- Migration Services tool for value mapping (see Section 11.3.7)
- Reports from SAP BusinessObjects Web Intelligence for monitoring and reporting (see Section 11.3.10)
- Content for the reconciliation between SAP target system and source system(s) (see Section 11.3.10)
SAP provides these components as an SAP Best Practices package for SAP S/4HANA, Rapid Data Migration. In addition to predefined business best practices content and implementation best practices for configuration, Rapid Data Migration packages contain services for data migration projects, provided by SAP Consulting or SAP partners. However, you can use this SAP Best Practices content yourself without being supported by SAP or an SAP partner by downloading the package from the SAP Best Practices Explorer free of charge (see Section 11.3.3).
The predefined migration content for SAP Data Services contains the metadata of the SAP S/4HANA target interface as well as validations for simplifying the source mapping. More than 50 business objects are supported. Theoretically, every IDoc, every asynchronous BAPI, and every web service can be used. Even RFC-enabled function modules, which are used in the standard content for the business partner object, are possible. (See Chapter 7, Table 7.2, for a list of available migration objects.)
[eg] Sample Migration Content
Examples of predefined templates include the following:
- For business partners with customer and supplier masters
- For logistics data such as material masters, bills of materials (BOMs), and sales documents
- For SAP Financials (FI) data, such as receivables and payables
For BAPIs without a provided IDoc interface, you can use Transaction BDBG to easily create a BAPI/ALE interface in the customer namespace. You can then also enhance SAP Data Services content without any restrictions. The SAP Best Practices package includes an enhancement guide for more information.
With Rapid Data Migration content, the ETL tool, SAP Data Services, becomes a data migration platform perfectly attuned to SAP S/4HANA, on-premise. Figure 11.4 illustrates the corresponding architecture with SAP Data Services in the center.
Figure 11.4 Architecture of the Rapid Data Migration Solution
The platform itself runs on a relational database that is also used as a staging area. SAP Data Services can connect to any source systems via adapter frameworks, such as Open Database Connectivity (ODBC), file interfaces, Mainframe, XML files, and Microsoft Excel files:
- Source and target system: On the left side of Figure 11.4, the system is integrated with one or more legacy systems via various interfaces; on the right side, the system is integrated with an SAP S/4HANA system. For new implementations of SAP S/4HANA, the legacy system can be an SAP ERP system or any other non-SAP system. However, legacy systems can also be mixed landscapes (for example, an SAP Business Suite system with multiple external add-on systems that will all be consolidated into one SAP S/4HANA system).
- Extraction and profiling: The staging area between the source and target systems is provided by the database on which SAP Data Services runs, which may be, but is not always, the SAP HANA database. Depending on the size of the repository, we recommend maintaining a smaller landscape, for example, on the basis of an SAP Adaptive Server Enterprise (ASE) database, which can also be used with an SAP HANA database license. In this step, you’ll extract and analyze data from the source system. This analysis (profiling) is a critical step that provides detailed insights into the legacy system. These insights enable you to determine data patterns and check important details: For example, do all ZIP codes for the US have five digits (plus a possible four-digit extension), and are only numbers used? What notations are used to designate the United States in your legacy system (e.g., “United States of America,” “U.S.,” and/or “USA”)? (See the short pattern-profiling sketch after this list.)
- Cleansing, conversion, validation, and loading: This step includes the cleansing of data records so that they follow a specific pattern, the application of certain conversion rules, and, finally, the reconciliation with the Customizing for SAP S/4HANA. These processes can entail, for example, merging two fields into one field, dividing fields, converting values to a specific format (for example, converting telephone numbers into the international format, with “+1” for the US), and validating mandatory fields and check tables. The cleansed and verified data is then imported to the SAP S/4HANA system.
- Customizing extraction from SAP S/4HANA: Because you can configure SAP S/4HANA, you’ll need to transfer the Customizing (for example, for company codes, plants, material types, and material groups) to the intermediate layer of SAP Data Services by replicating the Customizing in SAP Data Services using predefined content. This process enables you to ensure, in SAP Data Services, that the data records you want to import are compatible with the SAP S/4HANA system. You can repeat the delta reconciliation several times if required, for example, if you need to make changes to the Customizing in the SAP S/4HANA system.
- Data reconciliation: After the data has been loaded, the data that was actually imported to the SAP S/4HANA system is reconciled with the data expected by the SAP Data Services migration.
- Dashboards and reporting: The technical and functional resources involved in the data migration can trace the entire process anytime using dashboards and reports. As a result, the status of the data transfer is always transparent.
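The pattern profiling mentioned in the extraction and profiling step can be pictured as follows: each value is reduced to a digit/letter pattern and the pattern frequencies are counted, so outliers stand out immediately. This is a minimal Python sketch of the idea using invented sample data; SAP Data Services performs this analysis in its Profiler without any coding.

```python
# Illustrative sketch: reduce each ZIP code to a pattern ("9" for digits,
# "X" for letters) and count how often each pattern occurs. An outlier such
# as "X4352" in an otherwise numeric column becomes visible immediately.
from collections import Counter

zip_codes = ["60601", "94105", "10115", "X4352", "30159-1", "75008"]

def pattern(value):
    return "".join("9" if c.isdigit() else "X" if c.isalpha() else c for c in value)

counts = Counter(pattern(z) for z in zip_codes)
for pat, count in counts.most_common():
    print(f"{pat}: {count}")
# Typical output: 99999: 4, X9999: 1, 99999-9: 1
```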
You can continue to use SAP Data Services as a fully functional connection and orchestration platform for the master data integration from several systems or for processes to ensure data quality (data governance) after the data has been transferred successfully.
11.3.3 Migration Content
The data migration content, which is available in packages, contains jobs for SAP Data Services. One job is delivered for each business object, and this job usually corresponds to one IDoc type. (IDocs can also call BAPIs; see below.)
[»] Customer Vendor Integration
SAP S/4HANA introduced innovation into the traditional customer master migration object. Customers and vendors are merged by Customer Vendor Integration (CVI) via the business partner interface, for which no IDoc or BAPI is available. A new and specially developed interface in SAP S/4HANA can be addressed using a remote-enabled function module (via Remote Function Call, RFC) and is used by the Rapid Data Migration content in SAP Data Services.
All provided and modeled jobs are used as templates for the SAP Data Services platform and are available in a proprietary file format (.atl), which is a file format specific to SAP Data Services. Moreover, these packages also contain documentation as installation guides specific to the business content. These guides also include mapping templates for all business objects (tables for mapping on paper), enhancement guides (to add custom interfaces or customer-specific fields), and business process descriptions (for each business object) to enable you to understand the IDoc structure in detail.
[»] SAP Best Practices Migration Content beyond SAP S/4HANA
SAP not only provides free migration content for SAP S/4HANA but also for the SAP solutions listed below. You can download these packages for free (you’ll only need your SAP login, such as an S user) in the SAP Best Practices Explorer at the following links:
- SAP Business Suite on SAP HANA: http://rapid.sap.com/bp/RDM_ERP_CRM
- SAP ERP (including SAP Retail and SAP HCM): http://rapid.sap.com/bp/RDM_ERP_CRM
- SAP CRM: http://rapid.sap.com/bp/RDM_ERP_CRM
- SAP Billing for Utilities: http://rapid.sap.com/bp/RDM_CRM_UTIL
- SAP SuccessFactors Employee Central: http://rapid.sap.com/bp/RDM_SOD_SFSF
- SAP Hybris Cloud for Customer (C4C): http://rapid.sap.com/bp/RDM_SOD_SFSF
For more information, access the SAP Service Marketplace (you might have to log on with your user credentials).
Note that the predefined content can be localized for various countries, but technical content is provided in the English language only.
The import function in SAP Data Services lets you easily upload all available objects from the .atl files provided. Similarly, you can also use this function to save your mapping specifications or validations for reuse in other projects. We recommend using the export function for regular backups.
Now, let’s turn to migrating data to SAP S/4HANA. The individual ETL functions of the SAP Data Services platform are beyond the scope of this book and not discussed in detail here.
[»] No Separate License Required
The product license for SAP Data Services covers both the ETL part (Data Integrator) and data cleansing part (Data Quality). If you do not have a separate software license for SAP Data Services and do not have to cleanse your data, you can request a free Data Integrator key code anytime by using your SAP HANA database license.
You can then use this key code with a valid SAP HANA REAB (Runtime Edition for Applications and SAP BW) database license or with the full SAP HANA Enterprise database license to load data to SAP S/4HANA or to the SAP HANA database without having to pay for additional licenses. Either SAP HANA database license (SAP HANA REAB or SAP HANA Enterprise) is included in the SAP S/4HANA license package.
For more information, go to https://blogs.sap.com/2016/06/21/how-to-migrate-to-sap-s4hana/.
The following link navigates you to a guide describing how to request the required key code: https://blogs.sap.com/2016/06/20/request-an-sap-data-integrator-key-code-for-rapid-data-migration-to-sap-s4hana/.
Interfaces as Part of the Migration Content
For migrating data to SAP S/4HANA, as mentioned earlier, the Rapid Data Migration solution uses IDocs, the SAP standard interface technology, to send data to SAP S/4HANA for all objects except business partners (all BAPIs are called via their IDoc interfaces). As part of the content, the structure of the IDocs as well as their fields have been replicated as metadata in SAP Data Services. As a result, you can map the source system (your legacy system) to the SAP S/4HANA target structure in the SAP Data Services Designer.
An IDoc is a hierarchically nested structure. Individual data records in IDocs are called segments. IDocs are updated via function modules. Unlike direct updates, the ALE layer is used here. Consequently, the IDocs are always addressed via the same function module in the SAP S/4HANA target system: the RFC-enabled IDOC_INBOUND_ASYNCHRONOUS module. SAP Data Services is responsible for the entire process, and you won’t have to do anything. The business partner object is an exception (see Chapter 7, Section 7.3.1, Table 7.2) because it is directly and remotely addressed by a wrapper function module (via RFC).
IDocs always have defined statuses in the SAP system. The most important status values for the data transfer in inbound IDocs are the following:
- Status 64 (waiting): You can transfer the IDoc to the application.
- Status 53 (IDoc successfully updated): The application document was updated.
- Status 51 (error in IDoc): The application document was not updated.
You can differentiate between IDoc message types, which indicate the semantics of IDocs, and IDoc basic types, which define their syntax. For example, the IDoc message type ORDERS is responsible for order data, while the different versions of the basic types, ORDERS04 or ORDERS05, specify the exact syntax of the segments and all fields contained therein. The version concept specifies that fields and segments can always be added but never deleted. Consequently, ORDERS05 covers all functions of ORDERS04 plus additional new fields. This version control ensures that new systems can handle obsolete IDocs (upward compatibility) and that new systems with a higher version can send IDocs to older systems (downward compatibility).
The relationship between a message type and a basic type, however, is not always 1:n, as may seem at first, but n:m because the IDoc basic type ORDERS05 transfers two logical message types: In addition to purchase orders in the ORDERS IDoc message type, the ORDRSP message type for purchase order confirmations is also available, resulting in different meanings for the IDoc messages.
An enhancement concept is available for IDocs, which allows for IDoc extensions, such as ZORDERS05, which combines additional customer-specific fields or customer segments in the so-called IDoc type.
Although IDocs are a broad topic, this level of detail on IDoc technology is sufficient in this context. SAP Data Services will perform the tasks for you using the Rapid Data Migration content and will ensure that the IDocs are set up with the correct IDoc control records and syntax.
Over time, IDocs have proven to be solid and consistent interfaces with a smart version concept. Moreover, updates of IDocs are secure because the IDocs are completely rolled back and available for processing again when the update is canceled. For these reasons, SAP provides Transaction BDBG as a tool to generate IDoc interfaces based on asynchronous BAPIs at the touch of a button. Asynchronous BAPIs are BAPIs that can load data independently to a system instead of having to carry out an operation and providing a response. The only information returned is a success or failure indicator, similar to traditional IDocs. Transaction BDBG allows you to extend the IDoc world considerably.
Most of the BAPIs already include the BAPI ALE interface, which you can use to generate the IDoc structure and IDoc type from the BAPI. The IDoc will serve as a wrapper around the BAPI, and data will be sent as IDoc instead of directly calling a BAPI in a remote system. The inbound IDoc is then “unwrapped,” and the BAPI is called locally in the target system. In general, a BAPI is an SAP function module with a defined interface and documentation. The advantage of using BAPIs is that the send process is decoupled from processing, as is the case for Application Link Enabling (ALE). Otherwise, the connection between the systems must be open during the entire BAPI processing time.
Example: Migrating Bank Master Data
The following sections focus on a simple example: transferring bank master data to an SAP S/4HANA system.
The object is updated in the SAP S/4HANA system using the BAPI BAPI_BANK_CREATE. However, the BAPI is called via the BAPI ALE interface and, thus, via an IDoc. As a result, in this case, SAP Data Services first sends an IDoc to the SAP S/4HANA system, while the BAPI is directly called in the SAP S/4HANA system if the SAP S/4HANA migration cockpit is used (see Section 11.4). The SAP S/4HANA migration cockpit will run as an application in SAP S/4HANA, while you’ll be able to use SAP Data Services for mapping as a standalone tool at an early stage without having to connect your source and target systems.
Our example is restricted to the required bank master data only and uses two IDoc segments of the generated IDoc type, BANK_CREATE01 (message type BANK_CREATE). The technical SAP names of the segments are E1BANK_CREATE and E1BP1011_ADDRESS.
In the SAP Data Services content, these segments are BANKHeader_E1BANK_CREATE_Required and BANKBankAddress_E1BP1011_ADDRESS_Required. These names indicate, respectively, a header record and an address record. In contrast to most objects, our brief example won’t include further IDoc segments at lower levels. Depending on the IDoc type definition, these deeper nested structures can be repeated several times but are not always mandatory.
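To make the nesting of the two segments more tangible, the following minimal Python sketch models one bank record as it maps onto the header and address segments. The field names shown are an assumed, small subset used only to illustrate the structure, not the complete segment definitions.

```python
# Illustrative sketch: one bank master record, modeled as the two IDoc
# segments used in the example. Field names are a small, assumed subset of
# the real segment structures and serve only to show the nesting.
bank_record = {
    "E1BANK_CREATE": {            # header segment (BANKHeader_..._Required)
        "BANK_CTRY": "DE",        # bank country key
        "BANK_KEY": "10070000",   # bank key
    },
    "E1BP1011_ADDRESS": {         # address segment (BANKBankAddress_..._Required)
        "BANK_NAME": "Sample Bank",
        "CITY": "Frankfurt",
    },
}

for segment, fields in bank_record.items():
    print(segment, fields)
```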
Due to the default IDoc structure used, the content from SAP Data Services has the same structure for every business object and includes mapping (*_Map), validation (*_Validate), and data enrichment (*_Enrich).
The default structure of the user interface in the SAP Data Services Designer (see Figure 11.5) comprises Project Area 1 and Local Object Library 2 on the left. The entire section on the right 3 next to the Start Page provides graphical illustrations of process flows. These illustrations map the flow of data records from the top left to the bottom right.
Figure 11.5 SAP Data Services Designer
The example in Figure 11.5 shows the Job_DM_Bank_IDoc customer job with the DF_DM_BANKHeader_Validate data flow for the bank header data. The system always displays new windows as tabs on the right-hand side.
The imported content for SAP S/4HANA has the structure shown in Figure 11.6. Jobs in SAP Data Services are organized in projects; one job can be assigned to several projects. If you modify a specific job, these modifications affect all projects. So the project name is basically only a collection of references to jobs in SAP Data Services. The same applies to all subordinate objects, such as data flows. They can be reused; modifications always affect all instances. To avoid unintentional changes, you should replicate each object and use the copy for your specific purposes.
Figure 11.6 Project Structure with Job and IDoc Segments
We’ll use the project DM_BPFDM_IDOC in our example. The project contains one job for each business object because data migration with Rapid Data Migration—as is the case for all other methods—always migrates the data of an entire business object as a logical unit and not for each SAP table.
We’ll use job Job_DM_Bank_IDoc to transfer the bank master. In general, two data flows are significant for all data migrations and will have to be processed for each IDoc segment (for example, our first segment, E1BANK_CREATE, the header segment):
- DF_DM_BANKHeader_Map (mapping data flow, see Section 11.3.6)
- DF_DM_BANKHeader_Validate (validation data flow, see Section 11.3.8)
During the mapping, the fields are mapped, while the validation data flow enables you to display the results of various data validations after they have been carried out. Finally, the DF_DM_BANKHeader_Enrich data flow also exists, which is discussed in detail later in this section. To migrate your data, this step is not that important at first because empty fields will be populated with default values regardless.
As a mapping template, we’ll first create a mapping on paper. These mapping templates, including content, are provided for each business object to facilitate assigning fields and values in the tool. In addition, mapping templates are an appropriate means to discuss complex field relationships with the persons responsible in the various user departments. You can immediately enrich your mapping templates with test or production data to simplify the loading without having to carry out field mappings. To make handling easier, template files in Microsoft Excel format are included in the Rapid Data Migration package, already filled with test data.
Figure 11.7 shows the IDoc target structure as an excerpt of the mapping template for the BANK_CREATE01 IDoc.
Figure 11.7 SAP S/4HANA Target System as an Excerpt from the Mapping Template
Table 11.2 describes the individual columns of the templates, using the terms found in all mapping templates. The following symbols are used as abbreviations in SAP Data Services as well as in all templates:
- The asterisk (*) for mandatory fields
- The dollar sign ($) to highlight existing default values (there are no default values for bank master data)
- The plus sign (+) for fields with check tables in the SAP system
Fields with a plus sign for which value mapping is required (that is, a conversion table), in addition to field mapping generally, are discussed in detail in Section 11.3.7.
Column | Description |
---|---|
System Required | A mandatory field |
Enrichment Rule | Populated with a default value if you have not assigned a source field |
Look Up Required | A check table (lookup table) exists for this field; only values from the input help (F4) are permitted |
Text Description | A unique and detailed description |
Field Name | Field name in Rapid Data Migration content |
SAP_Table | Technical name of the table in the ABAP Dictionary |
SAP_Technical_Field_name | Technical name of the field in the ABAP Dictionary |
Field Length | Field length in the SAP target system |
Additional Instructions and Comments | Default value for fields with dollar signs |
Segment Name | Name of the IDoc segment |
Lookup Table | Check table for the field that impacts valid values when the values are mapped later on |
Table 11.2 Critical Columns on the SAP S/4HANA Target Side in the Mapping Template
The data migration content for SAP Data Services uses the IDoc interface to provide the same design and functionality for all business objects in the modeled data flows and for all mapping structures. As a result, you’ll be able to use new business objects without deep application expertise.
11.3.4 Connecting to Source Systems
Now that you’re familiar with the basic structure of the content provided, let’s take a look at the actual data migration process and the integration of the source system. First, you’ll integrate the legacy system into SAP Data Services, which will make legacy structures and metadata known to the system.
You can also integrate several source systems to SAP Data Services using different interfaces. In our example, we won’t use data from external systems; instead, we’ll provide our source data using the migration templates provided in the Rapid Data Migration content (see Section 11.3.6).
As an example, let’s first discuss briefly customer data: Let’s assume we are using a table from the legacy system database called CUSTOMERADDRESS as well as a Microsoft Excel file called Customer_Header.xls, which contains customer names from the legacy system. (Of course, direct integration using an application or by loading flat files is also possible.)
To integrate the table CUSTOMERADDRESS from the legacy system, first perform the following steps: Navigate to the Datastores tab in the Local Object Library and create a database connection by right-clicking in the empty area.
In our example, you’ll integrate the DS_LEGACY database using an ODBC interface. Each connection is provided with a subitem called Tables, which you can use to select all or a set of tables of the legacy system. This connection also makes metadata, such as field names and field lengths, known in SAP Data Services. This connection also enables you to view the existing data records in the table (see Figure 11.8).
Figure 11.8 Integrating a Table via Open Database Connectivity
Next, you’ll integrate the Microsoft Excel files by right-clicking on the Formats tab and selecting the New menu item. Next, select a specific spreadsheet and an area within the table. If the table contains column names in the first row, you can copy the metadata directly from the Microsoft Excel file by selecting the corresponding function, as shown in Figure 11.9, and confirming your selection by clicking the Import Schema button.
You can adjust the default data formats, such as varchar(255), manually if required. Be sure to use the character data type for all purely numerical values that should not be used in mathematical operations.
For integrated Microsoft Excel files, you can preview data in the same way as displaying table data records if SAP Data Services can access the file. After importing your data to SAP Data Services, you won’t see any major difference, and you can use the two objects in the same way. However, one restriction applies: In SAP Data Services, database tables and flat files can be both sources and targets of the data, but Microsoft Excel files can only be used as a source; that is, you cannot write to a Microsoft Excel file directly.
Figure 11.9 illustrates the integration of the Microsoft Excel file, which was assigned a name based on Customer_Header, namely, Customer_Header.xls.
Figure 11.9 Integrating Microsoft Excel Spreadsheets
[»] Formatting in Microsoft Excel
Because Microsoft Excel is a spreadsheet program and not a word processor, cells with numeric content are automatically formatted as numbers, which might result in undesired exponential notations and the loss of leading zeros. This issue occurs frequently for US ZIP codes, which sometimes start with a zero. You should therefore format the corresponding columns as text columns.
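If you prepare source files programmatically before handing them to SAP Data Services, you can avoid the leading-zero problem by reading such columns as text. A minimal sketch using pandas (an assumption; pandas plus an Excel reader engine must be installed, and the file and column names are placeholders):

```python
# Illustrative sketch: read an Excel source file so that ZIP codes keep their
# leading zeros. Reading the column as text (dtype=str) prevents pandas from
# turning "02116" into the number 2116. File and column names are placeholders.
import pandas as pd

addresses = pd.read_excel("legacy_addresses.xlsx", dtype={"POSTAL_CODE": str})
print(addresses["POSTAL_CODE"].head())
```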
11.3.5 Data Profiling
At this point, you have integrated the metadata of two different legacy systems (table and Excel) into SAP Data Services. You can now use the Profiler, which is embedded in SAP Data Services, to find patterns in and check the quality of the data in the legacy system prior to the mapping. For data profiling, the data must be stored either in tables or in flat files.
To profile your data, select the table CUSTOMERADDRESS in the Local Object Library and right-click on the table name. In the context menu that opens, select the Submit Column Profile Request function. In the example shown in Figure 11.10, the system will submit a detailed profiling request for each column when you click Submit. You can submit a new profiling request in the View Data section anytime.
Figure 11.10 Column Profiling
The results are displayed in Figure 11.11. The column profiling shows that a ZIP code may be incorrect. Of the twelve data records from the various countries, only one ZIP code contains letters instead of just numerals. This ZIP code with the value "X4352" is from a customer from Canada and would be identified even if a larger dataset was profiled.
Thus, the question now is whether a postal code in Canada is allowed to have the format X9999, that is, a letter followed by four numbers. We can answer this question by evaluating the validations embedded in SAP Data Services (see Section 11.3.8).
Figure 11.11 Column Profiling Result
In addition to the patterns of the data records used in this example, further column profiling analyses are available, for example:
- Min: Minimum value according to lexicographical order
- Max: Maximum value according to lexicographical order
- Median: Median value
- Min string length: Shortest value
- Max string length: Longest value
- Average string length: Average length of the values
- Distincts: Number of distinct values
- Nulls: Missing values
This analysis allows you to identify incorrect values, and experienced users can use the Max string length function to determine which values are too long.
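The following minimal Python sketch shows, on invented sample data, what the column profiling measures listed above mean; SAP Data Services computes these figures in its Profiler without any coding.

```python
# Illustrative sketch of the column profiling measures, computed for one
# column of invented sample data. None represents a missing (NULL) value.
values = ["60601", "94105", None, "X4352", "10115", "10115"]

non_null = sorted(v for v in values if v is not None)

print("Min:", non_null[0])                                    # lexicographical minimum
print("Max:", non_null[-1])                                   # lexicographical maximum
print("Median:", non_null[len(non_null) // 2])                # lexicographical median
print("Min string length:", min(len(v) for v in non_null))
print("Max string length:", max(len(v) for v in non_null))
print("Average string length:", sum(len(v) for v in non_null) / len(non_null))
print("Distincts:", len(set(non_null)))                       # number of distinct values
print("Nulls:", values.count(None))                           # missing values
```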
You can also start more complex profiling requests to compare database tables with one another and find incomplete data records (data records without headers, headers without items, and so on) by analyzing their relationships.
To compare database tables, start the data profiling from the first table by right-clicking on the table name and selecting Submit Relationship Profile Request With… in the context menu. The cursor will become a crosshair, which you’ll use to select the second table (or alternatively a flat file).
In the next dialog box, you’ll click Submit to submit a relationship profile request after having confirmed or adapted the key relationship between the two tables accordingly. In our example, table CUSTOMERHEADER is integrated directly from the legacy system instead of using a Microsoft Excel file. The key is the customer number in the legacy system, IDCUST (see Figure 11.12).
Figure 11.12 Relationship Profiling
The result of the relationship profiling, as shown in Figure 11.13, indicates that 8.33% of all addresses lack a header record and 15.38% of all header data records lack an address. In our limited example of only twelve data records, one address has not been used for a long time. This address is not assigned to any customer and is stored in the legacy system as an address without any reference. Also, two customer records lack address data in the legacy system. We know that these two data records will definitively not be transferred to SAP S/4HANA because mandatory address data is missing.
SAP Data Services additionally enables you to display the problematic data records. In our example, as shown in Figure 11.13, the data record for legacy customer number 100289 is problematic. You can also have the system display the orphaned address record, which an application could not easily find in a relational database because the corresponding header data is missing.
Figure 11.13 Relationship Profiling Result
The next steps depend on your specific case. However, experience has shown that deciding, in the legacy system, whether an address will be transferred or not and, if required, whether the data will be corrected will pay off. If the data in the legacy system is no longer up-to-date or inconsistent, not transferring the data will save you unnecessary time and effort.
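Conceptually, the relationship profiling shown above is an outer-join comparison of the two sources on the legacy key. A minimal Python sketch with invented data (the key field follows the example; the values are made up):

```python
# Illustrative sketch of the relationship profiling result: find addresses
# without a header record and header records without an address by comparing
# the two sources on the legacy customer number (IDCUST). Data is invented.
headers = [
    {"IDCUST": "100287", "NAME": "ACME Corp"},
    {"IDCUST": "100288", "NAME": "Globex"},
    {"IDCUST": "100289", "NAME": "Initech"},
]
addresses = [
    {"IDCUST": "100287", "CITY": "Chicago"},
    {"IDCUST": "100290", "CITY": "Toronto"},   # orphaned address without a header
]

header_keys = {h["IDCUST"] for h in headers}
address_keys = {a["IDCUST"] for a in addresses}

orphan_addresses = [a for a in addresses if a["IDCUST"] not in header_keys]
headers_without_address = [h for h in headers if h["IDCUST"] not in address_keys]

print("Addresses without header:", orphan_addresses)
print("Headers without address:", headers_without_address)
```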
11.3.6 Field Mapping
Now let’s return to bank data. So that you can compare our example to the file upload using the SAP S/4HANA migration cockpit, which we’ll describe in Section 11.4, we assume that the source data has already been cleansed. For simplification, we’ll only use two data records in our example of migrating bank data.
In contrast to direct integrations (see Section 11.3.4), we’ll use the data migration templates specifically provided for SAP S/4HANA. These Microsoft Excel files are included in the Rapid Data Migration content; each file has a separate Excel sheet for every segment to be migrated and contains mapping notes, rules, and descriptions of mandatory fields.
If you use these migration templates, the entire field mapping is already implemented in SAP Data Services, provided with the test data used by SAP. If you want to integrate custom formats (tables or files), however, you’ll have to assign the SAP fields first.
In general, mapping fields is the central step in any data migration: The available fields of the source system are assigned to the predefined fields of the target system (SAP S/4HANA). In our case, analogous to mapping templates, the target system is already defined by the IDoc segments. The source system, instead, is defined by its source structures, in our example by an Excel migration template.
Let’s first take a look at mapping on paper for our two source structures, as shown in Figure 11.7 in Section 11.3.3. The left side of the table features a free area for the legacy system, not displayed in the figure, and the structure of the SAP S/4HANA system is on the right. The same concept applies to the source data in the Excel migration template. In our example, the template has already been populated with test data, as shown in Figure 11.14, which includes all four Excel sheets:
- Introduction with descriptions
- Field List with notes on mandatory fields and check tables
- Header filled with two table data records
- BankAddress for address data records (in our example, filled with two test records in relation to header records)
At first glance, you can see that all mandatory fields (*) have been populated and that no fields contain default values ($). Despite being mandatory fields, fields with default values don’t have to be mapped if you provide a constant as a global variable in SAP Data Services.
Figure 11.14 SAP S/4HANA Data Migration Template for Source Data
Usually, you’d now have to implement the mapping on paper in SAP Data Services Designer. Call the mapping view by clicking on the name of the DF_DM_BANKBankAddress_Map data flow in Project Area or by double-clicking on the icon in the parent data flow (shown in Figure 11.15). Because we are using a migration template, we can copy the entire mapping on a 1:1 basis.
Figure 11.15 Selecting the Mapping Step
If you now want to modify the mapping or insert specific sources, select and remove the placeholder for the source file and drag and drop the desired source (file or table) to the data flow workspace. By connecting the source with the Qry_BestPractices query, the source fields in the workspace are available for simplified mapping. Figure 11.16 shows the correct data flow.
Figure 11.16 Integrating the Legacy System
[»] Displaying Object Content
You can use the small magnifying glass next to the objects in the data flow to have the system display the content. Now, you can always keep an eye on individual or typical data records (via profiling, which is also possible in this display). Troubleshooting is thus much easier.
If you need more fields than what is available in the Baseline scope of SAP Best Practices, you can find them in the second query, Qry_AllFields. However, in our example, we’ll only use the simplified version that contains the fields set up using SAP Best Practices for a new SAP S/4HANA system.
Carrying out internal validations in SAP Data Services when working in the data flows or in the queries makes sense. You can then quickly identify mapping errors or inconsistent settings. Use the Validate Current button to validate the local object or the Validate All button to check the syntax across all objects. Alternatively, you can select Validation • Validate from the main menu.
The section below the mapping in the mapping view displays the assignment code in the SAP Data Services script language. (This script language doesn’t have a name of its own.) The code is generated here for all default functions. However, you can always modify the system-generated script or insert your own custom code.
[»] Using the Functions Provided
Use one of the numerous predefined functions and adapt the script code to meet your requirements. With predefined functions, you’ll be able to create your own conversion rules easily, without having to start from scratch every time.
If you have more than one source structure, you’ll have to define a unique key relationship between the sources. You can freely define this key relationship in the script editor or use the Propose Join function to generate the code in the WHERE condition automatically. The proposed join is based on the key relationships or on appearances of the same name in different sources, as shown in Figure 11.17.
Figure 11.17 Key Relationship in Multiple Source Structures
In our case, we’ll rely on the SAP S/4HANA migration template and simply copy the mapping. If you want to adapt the mapping, select the required fields on the left, then drag and drop them on the relevant target field to the right. Figure 11.18 shows the results. The actual mapping is now performed according to the mapping template if you don’t have to make any adaptations.
If a field has already been assigned (or has a NULL initial value), you’ll be asked during the mapping process whether you want to remap the field. To remap a field, right-click on the relevant field and select Remap Column in the context menu.
After you’ve completed the mapping and perhaps added custom code, start the validation using the Validate Current button or the Validate All button. Alternatively, you can navigate to the Validation • Validate option in the main menu to validate the mapping. If no error message appears, you can proceed with the remaining steps. Often, however, the system will output a warning for all fields that have different data types in the source system and target system. You can ignore this warning for now because the types will be automatically converted during runtime. If this conversion works for all data records (for example, converting number fields to text fields of the Character type), further phases of the process will not be affected.
Figure 11.18 Mapping via Drag and Drop
Let’s now take a closer look at the mapping and explain the individual assignments in the ETL process. Up to this point, we used a direct mapping process where fields were mapped to fields without transformations or complex rules. Compare the mapping in Figure 11.19 with the migration template from Figure 11.14; this comparison illustrates how SAP Data Services has implemented the mapping. Because the migration template uses the same names on the source and target side, the mapping is pretty straightforward.
Figure 11.19 Detailed Field Mapping
As an alternative to the field mapping, you can also assign global variables or constants in SAP Data Services. Figure 11.20 shows a constant, the ISO code for Germany ('DE'), assigned to the country field.
Figure 11.20 Mapping a Constant
[»] Resetting the Mapping
If you want to reset a field mapping that you have performed manually or via drag and drop, deleting the generated or custom code is not sufficient. When validating the data, you’ll see that empty mappings are not permitted. Instead, you’ll have to reset the code by replacing the code in the input field with NULL.
11.3.7 Value Mapping and Conversion Tables
After completing a basic mapping in the central SAP Data Services Designer step, you can convert the remaining values, that is, perform a value mapping, for country (BANK_COUNTRY_KEY) and region (REGION). The region involves a multilevel conversion because this field also depends on the country and becomes unique when both region and the country are converted.
For value conversions, you’ll use the Migration Services tool, which is provided in the Rapid Data Migration content. This tool can access the staging area where all SAP check tables have been replicated from your connected SAP S/4HANA system (see Figure 11.21). The Migration Services tool enables you to assign the correct values from the legacy system to the SAP-generated values, such as the ISO code for a country.
You can only change the column for legacy data in the Migration Services tool. The side for the SAP S/4HANA system corresponds to the Customizing in SAP S/4HANA and cannot be changed. Thus, the tool is a conversion table, similar to the table from Chapter 7, Section 7.3, in the SAP S/4HANA migration cockpit.
You should perform an initial job run before you start mapping the values, even before the actual field mapping is complete. In this initial run, the internal number ranges and required buffer tables are initialized. In addition, the value instances found in the legacy system are collected for all fields whose values will be converted (lookup fields). This process easily identifies values that still lack a mapping. The reason for the initial job run is that, despite the previous profiling process, you still don’t know all the value instances from the legacy system and thus cannot map them in advance.
Figure 11.21 Lookup Check Tables in Migration Services
To start a data migration job, right-click on the relevant node in Project Area (in our example shown in Figure 11.22, Job_DM_Bank_IDoc). Next, press Execute….
In a popup, SAP Data Services will let you specify parameters for the job run. Navigate to the Global Variable tab. For this run, be sure to set the value of the global variable $G_ProfileMapTables to 'Y' for “Yes.” With this selection, values from the legacy systems will be collected during the run. This tab also provides an overview of the global variables that will be used to prepopulate the $ fields later on (as shown in Figure 11.23). You can change these values for the job run or define and store them as a characteristic for all runs of a job by right-clicking on the job and selecting Properties… in the context menu. Besides prepopulating data, you can also determine the IDoc message type and the so-called SAP partner system (the technical name of SAP Data Services as the sending system and of SAP S/4HANA as the receiving system). Later, SAP Data Services will write these values to the control record of the IDoc before sending the IDoc to SAP S/4HANA.
Figure 11.22 Executing a Job in SAP Data Services
Figure 11.23 Global Variables for the Job Run
The result of this job run is a log that, ideally, displays no errors (as shown in Figure 11.24).
Figure 11.24 Information Messages during the Job Run in the Log
In the Monitor tab, you can monitor and stop the job runs (see Figure 11.25). The green traffic light symbol indicates that the job is still running; the red traffic light symbol indicates that the job has been completed. However, this indicator is independent of the actual status of the job, that is, the indicator does not tell you if the job has been completed successfully or terminated prematurely. If the job has been terminated, the log will display a red button, as shown in Figure 11.26. Click this button to display details about the error.
During this initial job run, SAP Data Services will “learn” which field values exist in the legacy data. You can perform this test run with a subset of the data that you want to convert or with all the data you have available.
Figure 11.25 Job Monitor in the Project Area
Figure 11.26 Cancelation and Error Messages in the Log
Let’s return to our example with the country codes: If your legacy system stores countries as plain text and does not use standard codes, you might well find different notations for the same country due to inconsistencies or typos. With value mapping, all these variants will be merged into one value in the SAP system, that is, the correct ISO code for the country, as shown in Table 11.3.
Value in the Legacy System | ISO Code in the SAP System |
---|---|
Deutschland | DE |
Deutschlnd | DE |
BRD | DE |
USA | US |
U.S.A | US |
Table 11.3 Sample Value Mapping for the Country Field
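Conceptually, the value mapping maintained in Migration Services behaves like a conversion table that is applied to every legacy value and that flags values without an assignment. A minimal Python sketch of this idea, using the sample values from Table 11.3:

```python
# Illustrative sketch of the value mapping from Table 11.3: legacy country
# notations are converted to ISO codes, and values without a mapping are
# flagged so they can be maintained in Migration Services.
country_map = {
    "Deutschland": "DE",
    "Deutschlnd": "DE",
    "BRD": "DE",
    "USA": "US",
    "U.S.A": "US",
}

legacy_values = ["Deutschland", "BRD", "U.S.A", "Kanada"]

for value in legacy_values:
    iso = country_map.get(value)
    if iso is None:
        print(f"'{value}': not mapped yet -> maintain in Migration Services")
    else:
        print(f"'{value}' -> {iso}")
```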
In our example, we’ll use the migration template from Figure 11.14 to obtain the necessary field values for the existing countries and regions. You can then maintain these field values manually in Migration Services. Because you performed an initial job run after mapping these fields, the collected values will be available in the tool. Thus, value mappings are as simple as field mappings. In Migration Services, you’ll now assign the collected values to the relevant SAP values via drag and drop (as shown in Figure 11.27) or by selecting from a dropdown list (as shown in Figure 11.28). A searchable help function is also available (see Figure 11.30).
Figure 11.27 Value Mapping Using Drag and Drop
The value mapping status is indicated by traffic light symbols in Migration Services:
- Green square: mapped
- Yellow triangle: not mapped yet
- Red circle: mapped twice/not mapped uniquely
Figure 11.28 Assigning Various Legacy Values Manually
Figure 11.27 and Figure 11.29 show some examples of value mapping statuses. With these symbols, you’ll be able to easily track progress during the value conversion. Now the data is ready for validation.
Figure 11.29 Status of Value Mapping in Migration Services
Figure 11.30 Search Help for Values in Migration Services
11.3.8 Data Validation
After successfully mapping values using Migration Services, let’s now turn to data validations in the SAP Data Services Designer.
SAP Data Services does not import data records to SAP S/4HANA that have not passed all of the three following validations (see Figure 11.31):
- Validation using check tables (Validate_Lookups): For all fields marked with a plus sign (+), the values are compared to the values in the SAP check tables. Only legacy values that were previously converted to a lookup value in Migration Services will pass this test.
- Validation of mandatory fields (Validate_Mandatory_Columns): For all fields marked with an asterisk (*), values must exist (NOT NULL).
- Validation of the format (Validate_Format): This validation is carried out for all fields subject to format checks in SAP S/4HANA if the data migration content provides for these format checks. The system will check, for example, for correct field lengths in SAP S/4HANA (see our earlier example regarding the length of the material number field in Chapter 10, Section 10.2.2) or check that ZIP codes follow the correct syntax.

Figure 11.31 Data Flow for Validations in SAP Data Services
To execute the check routines, you’ll run the Job_DM_Bank_IDoc job again. However, this run will not send any IDocs to the SAP S/4HANA system—a dry run that you can carry out as many times as required until satisfied with validation results.
Validations are not carried out successively but in parallel, which means that all fields will undergo all validations and that the validations do not terminate when an error occurs, as is the case with many other methods. Also, data records can fail several validations at once. For example, if the country of a bank (*+BANK_COUNTRY_KEY, a mandatory field) is not populated, the field will fail validation twice. First, this mandatory field is not populated (Validate_Mandatory_Columns), and second, the requirement that the value be converted (Validate_Lookups) has not been met. According to the data flow shown in Figure 11.31, the value will receive a Fail status and will end up in the Invalid area twice.
You can use the small magnifying glass at the end point (see Figure 11.32) to display failed records in SAP Data Services Designer and identify the cause. So that values that received a Pass are not displayed twice in the Valid area, you can use the SELECT DISTINCT statement for data records that have passed the validation. Only valid data can be further processed and populated with default values or imported via IDocs.
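The validation behavior described above (all checks run for every record, and a record can fail several checks at once) can be pictured with a minimal Python sketch. The field names, check-table contents, and format rule are simplified assumptions; the real checks are modeled in the Rapid Data Migration data flows.

```python
# Illustrative sketch of the three validation steps. All checks run for
# every record, and a record can fail more than one check, mirroring the
# behavior described above. Field names and rules are simplified examples.
import re

lookup_values = {"BANK_COUNTRY_KEY": {"DE", "US", "CA"}}   # replicated check table
mandatory_fields = ["BANK_COUNTRY_KEY", "BANK_KEY"]
format_rules = {"BANK_KEY": re.compile(r"^\d{1,15}$")}     # assumed format rule

record = {"BANK_COUNTRY_KEY": None, "BANK_KEY": "10070000"}

failures = []

# Validate_Mandatory_Columns: mandatory fields must not be NULL
for field in mandatory_fields:
    if not record.get(field):
        failures.append((field, "mandatory field is empty"))

# Validate_Lookups: values must exist in the replicated SAP check table
for field, allowed in lookup_values.items():
    if record.get(field) not in allowed:
        failures.append((field, "value not found in check table"))

# Validate_Format: field content must match the expected format
for field, rule in format_rules.items():
    if not rule.fullmatch(record.get(field) or ""):
        failures.append((field, "format check failed"))

print("Pass" if not failures else f"Fail: {failures}")
# The empty country field fails twice (mandatory and lookup), as described above.
```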
Figure 11.32 Incorrect Data Records after the Test Run
[»] Embedded Validation Functions
The Rapid Data Migration content not only provides data flows and mappings, it also contains validation functions, such as a function for ZIP code validations.
In Section 11.3.5, we noticed a data record with a Canadian postal code that would not pass the format validation: X4352 is not a valid postal code for Canada. Actually, Canadian postal codes have a much more complex structure. Instead of the format X9999 (one letter followed by four digits), the syntax is X9X 9X9, which also implies that the postal code has six characters instead of five. This validation function is an ideal example of a function delivered with SAP Best Practices and embedded into SAP Data Services.
The code for the ZIP code validation function is written in the SAP Data Services script language and can be extended to any country. The code is based on the Backus-Naur Form (BNF), a metalanguage for describing grammars. Figure 11.33 shows the editor and the provided code for ZIP code validations.
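As a plain illustration of the syntax rule, the following minimal Python sketch expresses the Canadian X9X 9X9 format as a regular expression; the delivered SAP Best Practices function is written in the SAP Data Services script language instead.

```python
# Illustrative sketch of the Canadian postal code syntax check: the format
# is X9X 9X9 (X = letter, 9 = digit). This checks syntax only, not whether
# the postal code actually exists.
import re

CANADA_ZIP = re.compile(r"^[A-Za-z]\d[A-Za-z] \d[A-Za-z]\d$")

for zip_code in ["X4352", "K1A 0B1"]:
    status = "valid" if CANADA_ZIP.fullmatch(zip_code) else "invalid"
    print(f"{zip_code}: {status}")
```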
Figure 11.33 Function Editor in SAP Data Services
Note, however, that this validation only checks the syntax and does not include a plausibility check for the ZIP code. While validating a concrete ZIP code against the corresponding city and street can technically be implemented easily in SAP Data Services using the Data Quality function, for such plausibility checks, you’ll have to integrate fee-based databases provided by local postal operators, which must be updated continuously.
When errors exist, how do you proceed? In general, you can discard and exclude from the migration all data that was filtered by SAP Data Services. In this case, you won’t have to do anything. However, correcting all identified inconsistencies in the legacy system makes more sense because doing so additionally improves your data quality. After you have cleansed your data records, they are loaded again for the next run and pass the validations. This procedure is iterative: You can repeat the process until only discarded data records are caught in SAP Data Services. These data records won’t be migrated to SAP S/4HANA.
11.3.9 Importing Data
After successfully importing, converting, transforming, and validating your data records, you can perform the next step: loading the IDocs to the SAP S/4HANA system. In this context, be aware that the source data is always extracted from the data sources for each job run, which means that you won’t work with data temporarily stored in SAP Data Services but instead with current values from the migration template or other source files or databases you use.
Before you can import IDocs into the SAP S/4HANA system, you’ll have to modify the connection to your SAP S/4HANA system in SAP Data Services. The SAP datastore DS_SAP (Local Object Library • Datastores) is provided with a dummy connection only. To integrate your SAP S/4HANA system, you’ll need to gather various system information.
Table 11.4 compares the terms used in the SAP S/4HANA world to the names used in SAP Data Services for configuring connections.
SAP S/4HANA | SAP Data Services | Example |
---|---|---|
Application server | Application server | myserver01.me.com |
Instance number | System number | 00 |
System ID | – | PRD |
Client | Client number | 100 |
User | User name | – |
Password | Password | – |
Table 11.4 Differences in the Names for the Configuration
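If you want to double-check connection parameters such as those in Table 11.4 outside SAP Data Services, you could do so, for example, with the open-source pyrfc library. This is purely an assumption for illustration and not part of the Rapid Data Migration package; host, system number, client, user, and password values are placeholders.

```python
# Illustrative sketch (not part of Rapid Data Migration): check the RFC
# connection parameters from Table 11.4 with the open-source pyrfc library.
# Host, system number, client, user, and password are placeholders.
from pyrfc import Connection

conn = Connection(
    ashost="myserver01.me.com",  # application server
    sysnr="00",                  # instance number / system number
    client="100",                # client
    user="MIGRATION_USER",       # user with RFC authorization
    passwd="change_me",          # password
)
print(conn.get_connection_attributes())  # includes system ID, client, and user
conn.close()
```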
Figure 11.34 shows the window for entering the SAP S/4HANA system parameters. You can also find information on the configuration in the Configuration Guide of the Rapid Data Migration package.
To actually load IDocs in the next run, you’ll have to change the default value of a critical global variable, $G_GenerateIDOC_Req. If you change this value to 'Y', the system will not only carry out the test run but will also set up the IDocs in SAP Data Services and transfer them to the SAP S/4HANA system via Remote Function Call (RFC).
Figure 11.34 Sample Configuration of the DS_SAP Datastore
Alternatively, you can also store IDocs in local files, which may be necessary, for example, if you cannot integrate the SAP S/4HANA system or if the system is not yet available. In this case, you’ll need to transfer the files to the SAP application server in a separate FTP process.
Figure 11.35 shows the popup dialog for confirming the update run. The BANK_CREATE IDocs of the BANK_CREATE01 basic type are sent to the SAP S/4HANA system with system ID RDE and client 181. The global variables maintained here will be written to the IDoc control record, which serves as an “envelope” for the IDoc.
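The role of the control record can be pictured as a small envelope of metadata around the IDoc data. The following minimal Python sketch uses descriptive placeholder keys, not the technical SAP field names, and the partner values are invented examples.

```python
# Illustrative sketch of the IDoc control record ("envelope") that SAP Data
# Services fills from the global variables before sending the IDoc. The key
# names are descriptive placeholders, not the technical SAP field names.
control_record = {
    "message_type": "BANK_CREATE",     # logical meaning of the IDoc
    "basic_type": "BANK_CREATE01",     # syntax/structure of the IDoc
    "sender_partner": "DATASERVICES",  # SAP partner system name for SAP Data Services (example)
    "receiver_partner": "RDECLNT181",  # receiving SAP S/4HANA system (example for system ID RDE, client 181)
    "direction": "inbound",
}
print(control_record)
```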
If the job has successfully completed, the IDocs were sent to the SAP S/4HANA system and updated there—provided the inbound IDoc is set correctly. The Rapid Data Migration Configuration Guide describes in detail the settings for IDoc Customizing in your system. You can download the guide from the SAP Best Practices Explorer and the SAP Note mentioned there.
Figure 11.35 Global Variables for Sending IDocs
You can now log on to the SAP S/4HANA system, use the IDoc monitor there, or have the system display our newly created banks in a corresponding application. The following section explains how you can implement this much more elegantly, without leaving SAP Data Services.
11.3.10 Monitoring
The integration of SAP Data Services with the SAP BusinessObjects Business Intelligence (BI) platform makes it easy to embed analytics reports. The data migration content provided contains predefined Web Intelligence reports for monitoring data migration projects (see Figure 11.36).
You can either use these reports as templates and adapt them or use them without any modifications.
Figure 11.36 Web Intelligence Reports to Indicate Missing Data Records
Similar to the Migration Services tool, you’ll access these reports using a web browser. The BI Launchpad (see Figure 11.37) even enables functional users who were not involved in the data migration or system setup to access and create these reports.
Figure 11.37 BI Launchpad in SAP BusinessObjects BI Platform
Usually, involving functional user departments makes sense because legacy system users will often have the required know-how to use the data records and to troubleshoot. You can analyze, update, change (without additional software), and print Web Intelligence reports. You should update the data of the report after every job run.
In addition to reports for validating business objects and IDoc segments, numerous predefined evaluations are available for mass data uploads. In our example with two banks, reporting may not play a major role, and the magnifying glass in the data flows in the SAP Data Services Designer to view data records may be sufficient. However, if you have large data volumes, you’ll appreciate these reports.
In the DM_BPFDM_Reconciliation Rapid Data Migration project, which is included in the SAP Best Practices package, you can execute the Job_DM_CheckIDocStatus job in SAP Data Services to get an IDoc monitoring tool for the migration. This monitor lets you view the status of the IDocs without having to log on to the SAP S/4HANA system. The evaluation contains information similar to the information provided by the IDoc monitors found in Transactions WE02/WE05 or BD87. You can monitor the correct update of IDocs in SAP Data Services or with an available Web Intelligence report.
If the IDoc monitor indicates that the legacy data was successfully validated against the Customizing replicated from the SAP S/4HANA system and that the IDocs were loaded without any errors, the data migration project for the bank master is nearly complete. However, after the successful load, you should examine whether the data was actually imported to the SAP S/4HANA system as expected.
You can usually only answer this question by performing extensive tests. However, using the content provided in the DM_BPFDM_Reconciliation project, you can compare the expected data to the current data existing in the SAP S/4HANA system with Job_DM_Reconcile. This assessment is quite helpful because relationships to data that had not been included previously could lead to unexpected results when the data is updated in the SAP S/4HANA system. You can also call the result of this job run using BI Launchpad.
11.3.11 Optimizing IDoc Performance
This section provides information on how to use IDoc technology efficiently.
The default setting in the ALE partner agreements in SAP S/4HANA (Transaction WE20) is Trigger Immediately, which results in the nearly synchronous processing of the IDoc after receipt. However, because this processing constitutes a separate work process for each IDoc, this procedure is not always ideal, for example, for large data volumes where resources would quickly bottleneck. An alternative is to use background processing, which is triggered by the RBDAPP01 background program.
To achieve high performance when mass uploading IDocs for data migrations, background processing is mandatory. Only in this way can you transfer several IDocs as packages to a work process for processing in parallel. Ideally, you’ll directly use multiple SAP work processes that each process a package of IDocs, which will decouple the receiving process from the processing of the IDoc. In direct processing, a single IDoc is received via an RFC and processed. However, if you collect inbound IDocs using the Trigger by Background Program setting in Transaction WE20, you can improve the update performance for IDocs. Basically, triggering background processing means that the IDocs first wait in Status 64 (see IDoc status values in Section 11.3.3) for processing. Background processing is not the only available option; you can also start the processing in a dialog. The best way to process IDocs is to schedule the RBDAPP01 report as an ABAP job in the background using Transaction SM36 or start the report immediately using Transaction SE38 (see Figure 11.38).
Figure 11.38 Parallel Processing in Report RBDAPP01
The following settings in the selection screen of program RBDAPP01 are critical:
- IDoc Selection tab
  - Package Size: The package size controls the maximum number of IDocs to be processed in a logical unit of work (LUW) in a dialog work process. A large package reduces the number of required processes to a minimum but also requires a large roll area. Either the database commit is performed for the entire package, or the database is rolled back and no package data is stored.
- Parallel Processing tab
  - Parallel Proc. Enabled: This switch activates parallel processing. If you select this checkbox, the application server uses a free dialog process for each IDoc package for inbound processing; the packages are thus processed in parallel. If you select too many packages, all dialog processes of the server will be occupied. You should therefore specify a server group that controls how work processes are assigned (for example, parallel_generators) to avoid overloading the system. If the indicator is not selected, IDocs will not be processed in parallel. Instead, each package will be transferred to the application sequentially. In total, only one work process will be occupied on the application server.
  - Server Group: The server group determines how resources are distributed across the existing work processes of the application server(s), that is, how many work processes are provided on each application server. You can make the corresponding settings in Transaction RZ12 (shown in Figure 11.39).
We can’t make general recommendations about the package size and the number of work processes in a server group for parallel processing. You should always use test data to determine the best values for your scenario. These values depend on the IDoc types and IDoc sizes (number of segments) as well as database and server performance. For large IDoc segments, 50 IDocs per package and, for smaller segments, 100 IDocs per package are good starting values.
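Conceptually, the packaging and parallel processing performed by report RBDAPP01 can be pictured with the following minimal Python sketch: IDocs waiting in status 64 are grouped into packages, and each package is handed to one worker as a unit. The package size, worker count, and processing stub are example assumptions, not the actual report logic.

```python
# Illustrative sketch of what RBDAPP01 does conceptually: IDocs waiting in
# status 64 are grouped into packages, and the packages are processed in
# parallel, each package as one unit of work. Sizes are example values only.
from concurrent.futures import ThreadPoolExecutor

idocs = [f"IDOC_{i:05d}" for i in range(1, 251)]   # 250 IDocs waiting in status 64
PACKAGE_SIZE = 50                                   # cf. recommended starting values
MAX_WORKERS = 4                                     # cf. server group resources

def process_package(package):
    # Placeholder for posting one package of IDocs in a single work process.
    return f"processed {len(package)} IDocs ({package[0]} .. {package[-1]})"

packages = [idocs[i:i + PACKAGE_SIZE] for i in range(0, len(idocs), PACKAGE_SIZE)]

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    for result in pool.map(process_package, packages):
        print(result)
```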
Figure 11.39 Server Group Maintenance in Transaction RZ12
[»] Rapid Data Migration Demo Video
You can find a demo of a data migration to SAP S/4HANA, on-premise, using Rapid Data Migration, as well as further information, on the SAP YouTube channel, SAP Digital Business Services, at the following links: