PQL and Python (Advanced)¶

In [1]:

            
                Copied!
                
#To get a copy of this notebook in your current working dirtectory, run:
from pycelonis.notebooks import api_tutorial
#To get a copy of this notebook in your current working dirtectory, run:
from pycelonis.notebooks import api_tutorial

1. Connect to Celonis¶

In [2]:

            
                Copied!
                
from pycelonis import get_celonis
celonis = get_celonis()
from pycelonis import get_celonis
celonis = get_celonis()

2. Check available datamodels¶

In [3]:

            
                Copied!
                
celonis.datamodels
celonis.datamodels

Out[3]:

[<Datamodel, id e6d3cf0e-e669-4c60-ba94-bb620866e1e2, name SAP Accounts Payable>,]

3. Select the datamodel using the name or id as listed above¶

In [4]:

            
                Copied!
                
datamodel = celonis.datamodels.find('SAP Accounts Payable')
datamodel = celonis.datamodels.find('SAP Accounts Payable')

4. Interaction with the Celonis resource object¶

List all the tables in the datamodel

In [5]:

            
                Copied!
                
datamodel.tables
datamodel.tables

Out[5]:

[<DatamodelTable, id 010ae6aa-cdfe-4f7c-8fc5-6ebd7f541c91, name LFA1>,
<DatamodelTable, id 288fa4fb-3d27-4a54-bbdf-0e1a4d2af718, name BKPF>,
<DatamodelTable, id c8385174-ff3e-424b-b388-a06bf15566f2, name BSEG>,
<DatamodelTable, id cf14ce18-3a2f-4cc7-ba3c-ba2525f9c611, name CEL_AP_ACTIVITIES>,]

4.1. List all the columns of a Data Model Table¶

In [6]:

            
                Copied!
                
datamodel.tables.find('CEL_AP_ACTIVITIES').columns
datamodel.tables.find('CEL_AP_ACTIVITIES').columns

Out[6]:

[{'name': '_CASE_KEY', 'length': 320, 'type': 'STRING'},
 {'name': 'ACTIVITY_EN', 'length': 320, 'type': 'STRING'},
 {'name': 'EVENTTIME', 'length': 26, 'type': 'DATE'},
 {'name': '_SORTING', 'length': 19, 'type': 'INTEGER'},
 {'name': 'USER_TYPE', 'length': 320, 'type': 'STRING'},
 {'name': 'ACTIVITY_DE', 'length': 320, 'type': 'STRING'},
 {'name': '_CELONIS_CHANGE_DATE', 'length': 26, 'type': 'DATE'}]

4.2. List the data in a column¶

List the data in the ACTIVITY_EN Column, we can do that using a PQL query

In [7]:

            
                Copied!
                
from pycelonis.celonis_api.pql.pql import PQLColumn

query = PQLColumn(query = "CEL_AP_ACTIVITIES.ACTIVITY_EN", name = "ACTIVITY_EN")
activity_column = datamodel._get_data_frame(query)
activity_column.head()
from pycelonis.celonis_api.pql.pql import PQLColumn

query = PQLColumn(query = "CEL_AP_ACTIVITIES.ACTIVITY_EN", name = "ACTIVITY_EN")
activity_column = datamodel._get_data_frame(query)
activity_column.head()

Out[7]:

	ACTIVITY_EN
0	Vendor Creates Invoice
1	Scan Invoice
2	Create Invoice Document
3	Book Invoice
4	Clear Invoice

4.3. Custom PQL Query¶

Now we want to filter the VBEP table columns GJAHR and WRBTR (year and value), only based on the cases where activity Book Invoice is followed by Cancel Invoice Receipt This is a very simple query in PQL but hard to filter this other wise.First we add the columns in the query and then we apply the filter for activities

In [8]:

            
                Copied!
                
                    
                    
                
                

        
from pycelonis.celonis_api.pql.pql import PQL, PQLFilter

query = PQL()
query += PQLColumn("BKPF.GJAHR", "GJAHR")
query += PQLColumn("BSEG.WRBTR", "WRBTR")
query += PQLFilter("FILTER DOMAIN MATCH_PROCESS(NODE['Book Invoice'] \
                    as src,NODE['Cancel Invoice Receipt']  \
                    as trg CONNECTED BY EVENTUALLY [ src, trg ]) = 1")
filtered_results = datamodel._get_data_frame(query)
filtered_results.head()
from pycelonis.celonis_api.pql.pql import PQL, PQLFilter

query = PQL()
query += PQLColumn("BKPF.GJAHR", "GJAHR")
query += PQLColumn("BSEG.WRBTR", "WRBTR")
query += PQLFilter("FILTER DOMAIN MATCH_PROCESS(NODE['Book Invoice'] \
                    as src,NODE['Cancel Invoice Receipt']  \
                    as trg CONNECTED BY EVENTUALLY [ src, trg ]) = 1")
filtered_results = datamodel._get_data_frame(query)
filtered_results.head()

Out[8]:

	GJAHR	WRBTR
0	2009_	20.899041
1	2009_	5463.420244
2	2009_	4463.603969
3	2009_	4153.407804
4	2009_	5166.159140

Now we can apply all the processing in python we want on this filtered data frame and then easily push the results back to the data model in IBC