Wednesday, 2 April 2014

Webi on HANA

What is SAP HANA?

SAP HANA is a super-fast in-memory database enabling new possibilities in terms of analytical reporting and real-time data acquisition and consumption.

Is SAP HANA really real-time?

The term “Real-time” can be ambiguous. Let’s say SAP HANA can be super-fast, depending on which context.

Analytical reporting: HANA is designed for on-line analysis. It can calculate and execute millions of data at sub-second speed. Of course it will depend on other factors like creating an optimal HANA model (Analytical Views) and the type of queries you’re sending to HANA. Also bear in mind that HANA will execute queries very quickly but it does not have the control over the amount of data it is fetching (see question on Bad Design).

Data acquisition: HANA is shipped with ETL and replication tools that can load data from operational systems to HANA in a matter of seconds.

Data consumption: HANA solves the issue of having to maintain aggregate tables for performance (data is fresh and live, no need to wait for the next ETL batch).

What is SAP BOBJ Web Intelligence?

Web Intelligence (WebI) is an interactive reporting tool that can access relational as well as multidimensional datasources via the concept of Universes. The reports can be viewed online or offline thanks to the microcube, an embedded local in-memory cache engine. In case of Hana, WebI will access it through relational access.

What are Universes?

A Universe is a metadata layer that allows Designers to map to the underlying data sources into familiar business terms and optimize the queries access to the database (in this FAQ the database would be SAP HANA). This allows business users and analysts to create, access, analyze and share business content easily across the enterprise. Also referred as "Semantic Layer", the Universes allow IT to control access to the data and guarantee its integrity and validity.

Is SAP Web Intelligence really real-time ?

In order to display data coming from HANA into Web Intelligence, you will need first to execute the query on the HANA side, transfer the data through the network and load data into the local cube, execute some calculations and display these data into Web Intelligence. Depending of the data volume these actions can take from less than 1 second to much more time.

Why Web Intelligence is using a microcube / local cube ?

The micro cube is mandatory and an important part of the Webi architecture design. Having a microcube has a lot advantages when designing and consuming a Web Intelligence reports, and it has been designed for a vast majority of databases. The main benefit is that it can offer offline analytical capabilities like report viewing, drilling, quick filtering, local calculations:
  •   The WebI calculation compensates missing data source expressiveness (cross table, multi-context evaluation, advanced functions like Previous, …) . It requires to fetch raw data first before performing local data processing
  •   It enables Multi-data Provider support, which requires local data materialization for data synchronisation
  •   It enables data historization (if not available from the data source) which allows users to create data snapshots, and make features like Track Data Change possible.

Moreover it can avoid unnecessary round trips to the database and be very performing if the report is well designed and sized (remember microcube is in local memory). In the context of HANA, designing a Web Intelligence is a matter of balance between performance and functionality (for example Track Data Change will not available if you are using Query on Drill), but the idea is to have the smallest subset of data loaded in the microcube and to push down all the heavy calculations down to HANA.

What is a typical use case for Web Intelligence with SAP HANA ?

In order to best leverage Web Intelligence with SAP HANA, you need to design reports that require intensive aggregation and calculation on huge amount of data that will be processed almost instantly by HANA. Another example is to create analytical reports that can answer a user’s business questions upon live requests, for example a drill-down report, going from summary data to detailed data. In this scenario, SAP HANA will aggregate the results on-the-fly after each user drill interaction, and WebIntelligence will only retrieve the results needed in the report, thus leading to a smaller microcube. You can leverage these types of workflows using the "Query on Drill" feature in WebIntelligence.

How can I best leverage SAP HANA with Web Intelligence

In order to leverage the in-memory capabilities of SAP HANA, all the calculations and aggregations have to be done in SAP HANA for high performance analytical reporting. In this scenario, Web Intelligence should avoid pre-loading too much data into local cache (microcube) as this could cause performance issue due to too much data crossing the network, loading the data into the microcube and displaying the data into Webi. Web Intelligence only needs to retrieve the results calculated by HANA, needed for the report.

What is the best BI tool to leverage HANA

SAP HANA doesn't change the positioning of SAP BI4 tools. You still need to choose the right tool for the right job.

What is the most performing way to access SAP HANA from Web Intelligence.

Web Intelligence can only access SAP HANA through a universe. The most optimal way is to create HANA information views (Analytical View or Calculation Views) and to map them directly into BI4's Information Design Tool. See link "Create Universe on HANA best practices".
For more information, please refer to this must-read document  “Creating a Universe on SAP HANA Best practiceshttp://scn.sap.com/docs/DOC-23256

My Web Intelligence report is connected to SAP HANA and I don't see much performance difference with my previous database.

The value of SAP HANA can be highlighted in analytical workflows. Its performances are outstanding in processing calculations and aggregations. If you're trying to refresh a static Webi report with loads of data, some or insignificant performance gains could be noticed in the query execution phase, but the fetching phase will depend on other factors like the network transfer, the loading of data into the microcube or the rendering of this data into WebI. Just by replacing your old database by SAP HANA will not lead to sub-second performance. You still have to implement BI reporting best practices (e.g., query on drill, query stripping etc…) and Universes/HANA views design best practices in order to fully leverage the power of SAP HANA.

How to quickly analyze the performance of Web Intelligence on HANA?

The performance can be broke down into 3 different parts (HANA Database, Network, and Web Intelligence). The quick and dirty way to check the performance is to install HANA studio on the same box as the BI4.0 system and to compare the query performance using these two tools. Use advanced monitoring tools like Solution Manager / Wily Introscope for a deeper performance investigation.


Time spent in Hana
You have to retrieve the SQL generated into Webi and paste it into HANA studio and run it. In HANA studio you will see the time to execute the request into HANA and the fetching phase to retrieve into HANA studio. Check the execution phase while you execute the query in HANA studio.
Time spent in the Network time
You need to change the “max displayed rows in result” setting in HANA Studio (maximum is 99999 rows).
Check the fetching phase while you execute the query in HANA studio.
Time spent in Webi
Check the data manager in WebIntelligence (includes execution and fetchingphase performance but only shows seconds)

What is Query Stripping?

Query Stripping is an automated way to remove ("strip") unnecessary objects from a Web Intelligence query, for example if a report doesn't need those objects to display the correct results. This feature avoids sending too big queries to the database and reduces the size of data transferring across the network.
As of today, this feature is only available for OLAP datasources. Query Stripping is available for HANA and relational universes in BI4.1. However report designer can a similar technique by simply removing objects from the Query Panel that are not used in the report, thus simulating the Query Stripping behavior.

What is Query on Drill and how to use it?

The Query on Drill will send a new query to the database each time a user performs a drill action. As SAP HANA executes queries very fast, the user has instant access to "live" data each time he drills down or up.  This feature helps reducing the amount of data going through the network by pushing down the calculation to SAP HANA and retrieving only the results into the microcube.
querydrill.jpg
On the contrary, by using the default "Scope of Analysis" drill function, the drillable data needs to pre-fetched into the microcube. If a dimension (or characteristic) is not in the scope of analysis (aka not in the microcube), WebIntelligence sends a new query to the database. This new “out of scope” query can lead to some performance degradation in traditional databases if the data volume to be aggregated is huge.  This is why when using traditional databases, DBA's need to create aggregate tables in order to provide acceptable query performance for their end users. The main inconvenient of pre-aggregating data are the maintenance costs, the lack of freshness of data (e.g., some customers perform ETL to update their aggregates once a week or more) and the lack of flexibility (the queries need to be anticipated, and if a user requests data that are not pre-aggregated, this can lead to long waiting times)
With SAP HANA, this technique retrieves each drill step with sub-second performance, regardless of the data volume.


Bad Design - What you should avoid doing using Web Intelligence with any database including SAP HANA ?


Also please note that creating a detailed report containing fine grained data, like a 500-pages invoice or a big detailed operational report, is not mind-blowing use case for HANA. You won’t leverage much of the analytical capabilities of HANA. 
As for any other databases, wide open queries (e.g., select * from Analytic View) are also to be avoided and safety belts should be implemented at universe and HANA level to avoid runaway queries.

And remember that bad design translates to bad performance.
Always remember, when designing BI reports, with any DB fast or slow, that bad design leads to bad performance. Make sure you design your HANA model for optimal performance. Make your BI4 reports answer relevant business questions and avoid querying unnecessary data that you put too much load to the database and network. Do not use BI tools as a data extraction tool!

No comments:

Post a Comment