Interaction between IBC data ML Workbench: PQL from Python Code

Establishing a Connection

First import the relevant libraries and connect to Celonis

[2]:
from pycelonis import get_celonis
from pycelonis.pql import PQL, PQLColumn, PQLFilter

celonis = get_celonis()
2020-04-25 09:19:23 - pycelonis: Login successful! Hello Application Key, this key currently has access to 0 analyses.

Check the datamodels available to us

[3]:
celonis.datamodels
[3]:
[<Datamodel, id e6d3cf0e-e669-4c60-ba94-bb620866e1e2, name SAP Accounts Payable>,]

Select the datamodel using the name or id as listed above (we will use the name)

[4]:
datamodel = celonis.datamodels.find('SAP Accounts Payable')

Interaction with the IBC object

List all the tables in the datamodel

[5]:
datamodel.tables
[5]:
[<DatamodelTable, id 010ae6aa-cdfe-4f7c-8fc5-6ebd7f541c91, name LFA1>,
<DatamodelTable, id 288fa4fb-3d27-4a54-bbdf-0e1a4d2af718, name BKPF>,
<DatamodelTable, id c8385174-ff3e-424b-b388-a06bf15566f2, name BSEG>,
<DatamodelTable, id cf14ce18-3a2f-4cc7-ba3c-ba2525f9c611, name CEL_AP_ACTIVITIES>,]

**List all the columns in the _CEL_O2C_ACTIVITIES table**

[5]:
datamodel.tables.find('CEL_AP_ACTIVITIES').columns
[5]:
[{'name': '_CASE_KEY', 'length': 320, 'type': 'STRING'},
 {'name': 'ACTIVITY_EN', 'length': 320, 'type': 'STRING'},
 {'name': 'EVENTTIME', 'length': 26, 'type': 'DATE'},
 {'name': '_SORTING', 'length': 19, 'type': 'INTEGER'},
 {'name': 'USER_TYPE', 'length': 320, 'type': 'STRING'},
 {'name': 'ACTIVITY_DE', 'length': 320, 'type': 'STRING'},
 {'name': '_CELONIS_CHANGE_DATE', 'length': 26, 'type': 'DATE'}]

List the data in the ACTIVITY_EN Column, we can do that using a PQL query

[6]:
query = PQLColumn(query = "CEL_AP_ACTIVITIES.ACTIVITY_EN", name = "ACTIVITY_EN")
activity_column = datamodel._get_data_frame(query)
activity_column.head()
[6]:
ACTIVITY_EN
0 Vendor Creates Invoice
1 Scan Invoice
2 Create Invoice Document
3 Book Invoice
4 Clear Invoice

Now we want to filter the VBEP table columns GJAHR and WRBTR (year and value), only based on the cases where activity Book Invoice is followed by Cancel Invoice Receipt This is a very simple query in PQL but hard to filter this other wise.First we add the columns in the query and then we apply the filter for activities

[7]:
query = PQL()
query += PQLColumn("BKPF.GJAHR", "GJAHR")
query += PQLColumn("BSEG.WRBTR", "WRBTR")
query += PQLFilter("FILTER DOMAIN MATCH_PROCESS(NODE['Book Invoice'] \
                    as src,NODE['Cancel Invoice Receipt']  \
                    as trg CONNECTED BY EVENTUALLY [ src, trg ]) = 1")
filtered_results = datamodel._get_data_frame(query)
filtered_results.head()
[7]:
GJAHR WRBTR
0 2009_ 20.899041
1 2009_ 5463.420244
2 2009_ 4463.603969
3 2009_ 4153.407804
4 2009_ 5166.159140

Now we can apply all the processing in python we want on this filtered data frame and then easily push the results back to the data model in IBC

……

Pushing the new/processed data back

[ ]:
datamodel.push_table(filtered_results,"processed_data",reload_datamodel=False, if_exists = 'replace')