小编：Most Hana Adapters do support realtime push of changes. Not only for databases as sources but everything, e.g. Twitter or adapters you wrote. While the documentation contains all the information needed from an Adapter developer and user p
Most Hana Adapters do support realtime push of changes. Not only for databases as sources but everything, e.g. Twitter or adapters you wrote. While the documentation contains all the information needed from an Adapter developer and user perspective, I’d like to show the internals that might be helpful.
As shown in the blog entry about adapters and their architecture (see Hana Smart Data Integration – Adapters) and the Adapter SDK manual (SAP HANA Data Provisioning Adapter SDK – SAP Library), Adapters provide an interface to interact with the Hana database that revolve about remote sources (=connection to the remote source system) and virtual tables (=the structure of remote information).
For realtime the remote subscription is the central Hana object.
The syntax for this command is quite self explanatory:
create remote subscription using (select * from where …) target [table | task | procedure] ;
Hana will send the passed SQL select of that command to the Smart Data Access layer and depending on the capabilities of the adapter, as much as possible is passed to the Adapter. The Adapter can do whatever it takes to get changes, send them to Hana and there the change rows are put either into a target table, a target task (=transformation) or a target stored procedure.
The remote subscription object contains all of this information, as a query on the catalog object shows:
select * from remote_subscriptions;
As seen from the Adapter, above create remote subscription command does nothing. All it does is basic logical validations like checking if such virtual table exists, if the SQL is simple enough for being pushed to the adapter, if the target object exists and the selected columns match the target structure. All checks performed on metadata Hana has already.
Replicating a table consists of two parts, the initial load to get all current data into the target tables and then applying the changes onward. But in what order?
The usual answer is to set the source system to read only, then perform the initial load, then activate the change data processing and allow users to modify data after. As the initial load can take hours, such down time is not very appreciated. Therefore we designed the remote subscription activation to support two phases.
First phase is initiated with the command
alter remote subscription queue;
With this command the Adapter will be notified to start capturing changes in the source. The adapter gets all the information required for that, a Hana connection to use, the SQL select so it knows the table, columns and potential filters. The only thing the adapter is required to do is to add a BeginMarker row into the stream of rows being sent. This is a single line of code in the Adapter and it will tell the Hana receiver that at this point the Adapter started to produce changes for this table.
The adapter does replicate CUSTOMER already, now above remote-subscription-queue command was issued for a subscription using the remote table REGION. Such stream of changes might look like
The Hana server will take all incoming change rows and process them normally, rows for subscriptions that are in queue mode only, that is a BeginMarker was found in the stream but no EndMarker yet, will be queued on the Hana server. In above example, the CUSTOMER rows end up normally in its target table, the target table for REGION rows will remain empty for now.
Therefore the initial load can be started and it does not have to worry about changes that happened. From the looks of the initial load, the target table is empty and not a single change will be loaded.
Once the initial load is finished, the command
alter remote subscription distribute;
should be executed. This will tell the Adapter to add a EndMarker into the stream of data.
When the Hana server finds such EndMarker row it starts to empty the queue and apply the changes to the target table. All rows between the Begin- EndMarker for the given table are loaded carefully, as it is unknown if those had been covered by the initial load already or haven’t. Technically that is quite simple, the insert/update rows are loaded with an upsert command, hence either inserted if the initial load did not find them or updated if already present. Rows of the ChangeType Delete are deleted of course.
All rows after the EndMarker are processed normally.
During operation various errors can happen. The Adapter has a problem with the source and does raise an exception. The Adapter or the Agent itself dies. The network connection between Hana and Agent is interrupted. Hana was shutdown….
In all these cases the issue is logged as exception in a Hana catalog table.
select * from remote_subscription_exceptions;
In above instance the connection between the Agent and Hana was interrupted. Therefore the adapter got a close() call and should have cleaned up everything. Brought in a state where nothing is active anymore. On the Hana side an remote subscription exception is shown with an EXCEPTION_OID. Using this unique row number the exception can be cleared and the connection re-established, using the command
process remote subscription exception 42 retry;