Introduction¶
In this tutorial, you will learn how to use PyCelonis to perform basic interactions with Celonis Studio. More specifically, you will learn:
- What Celonis Studio is and how its general workflow is structured
- How to create spaces, packages, and variables in Studio
- How to add Studio assets, such as analyses or knowledge models, into the package
- How to publish packages to make them available in Celonis Apps
- How to update packages and how to specify a new version number
Prerequisites¶
To follow this tutorial, you should have PyCelonis installed and should know how to perform basic interactions with PyCelonis objects. If you don't know how to do this, please complete the Celonis Basics tutorial first. It also helps to already have a data model inside your EMS. Please refer to the Data Push tutorial for an overview of how to push data into the EMS and create a data model from it.
Tutorial¶
Celonis Studio is a development platform, in which analysts and app creators can leverage data from the EMS to create various assets, such as analyses, knowledge models, or action flows. These assets are organized into packages, which, in turn, are organized into spaces. A space can contain multiple packages and a package can contain multiple assets. An asset can use data from different data models.
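The containment hierarchy described above can be pictured with a small, purely illustrative model. The classes `ToyAsset`, `ToyPackage`, and `ToySpace` are hypothetical stand-ins invented for this sketch; they are not PyCelonis classes and do not talk to the EMS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyAsset:
    """Hypothetical stand-in for a Studio asset such as an analysis."""
    name: str
    kind: str  # e.g. "analysis" or "knowledge_model"


@dataclass
class ToyPackage:
    """Hypothetical stand-in for a package that bundles assets."""
    name: str
    assets: List[ToyAsset] = field(default_factory=list)


@dataclass
class ToySpace:
    """Hypothetical stand-in for a space that groups packages."""
    name: str
    packages: List[ToyPackage] = field(default_factory=list)


def count_assets(space: ToySpace) -> int:
    """Count all assets across every package in the space."""
    return sum(len(package.assets) for package in space.packages)


toy_space = ToySpace(
    name="Demo Space",
    packages=[
        ToyPackage(
            "Demo Package",
            [ToyAsset("Demo Analysis", "analysis"),
             ToyAsset("Demo KM", "knowledge_model")],
        ),
        ToyPackage("Empty Package"),
    ],
)
print(count_assets(toy_space))
```

The real objects returned by PyCelonis carry IDs and keys in addition to names, as the following steps show.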
Once all desired assets are created inside a package in Celonis Studio, the package can be published with a certain version number. The assets inside the package are then available in Celonis Apps for end-users to interact with them.
Note:
Assets in Celonis Apps are read-only because the purpose of Celonis Apps is to provide assets for end-users to interact with. If we want to create or modify assets, we have to use Celonis Studio.
1. Import PyCelonis and connect to Celonis API¶
from pycelonis import get_celonis
celonis = get_celonis()
[2023-03-24 15:54:34,938] INFO: No `base_url` given. Using environment variable 'CELONIS_URL'
[2023-03-24 15:54:34,939] INFO: No `api_token` given. Using environment variable 'CELONIS_API_TOKEN'
[2023-03-24 15:54:35,031] WARNING: KeyType is not set. Defaulted to 'APP_KEY'.
[2023-03-24 15:54:35,034] INFO: Initial connect successful! PyCelonis Version: 2.0.3
[2023-03-24 15:54:35,039] INFO: `package-manager` permissions: ['$ACCESS_CHILD', 'EDIT_ALL_SPACES', 'MANAGE_PERMISSIONS', 'CREATE_SPACE', 'DELETE_ALL_SPACES']
[2023-03-24 15:54:35,040] INFO: `workflows` permissions: []
[2023-03-24 15:54:35,041] INFO: `task-mining` permissions: []
[2023-03-24 15:54:35,043] INFO: `action-engine` permissions: []
[2023-03-24 15:54:35,043] INFO: `team` permissions: []
[2023-03-24 15:54:35,044] INFO: `process-repository` permissions: []
[2023-03-24 15:54:35,044] INFO: `process-analytics` permissions: ['CREATE_WORKSPACE', 'MOVE_TO', 'DELETE_ALL_WORKSPACES', 'DELETE_ALL_ANALYSES', 'EDIT_ALL_ANALYSES', 'EDIT_ALL_WORKSPACES', 'USE_ALL_ANALYSES', 'CREATE_ANALYSES', 'MANAGE_PERMISSIONS', 'EXPORT_CONTENT']
[2023-03-24 15:54:35,045] INFO: `transformation-center` permissions: []
[2023-03-24 15:54:35,045] INFO: `storage-manager` permissions: ['DELETE', 'CREATE', 'GET', 'ADMIN', 'LIST']
[2023-03-24 15:54:35,046] INFO: `event-collection` permissions: ['USE_ALL_DATA_MODELS', '$ACCESS_CHILD', 'EDIT_ALL_DATA_POOLS', 'CREATE_DATA_POOL']
[2023-03-24 15:54:35,047] INFO: `user-provisioning` permissions: []
[2023-03-24 15:54:35,047] INFO: `ml-workbench` permissions: ['DELETE_SCHEDULERS', 'EDIT_SCHEDULERS', 'USE_ALL_SCHEDULERS', '$ACCESS_CHILD', 'USE_ALL_APPS', 'CREATE_SCHEDULERS', 'MANAGE_ALL_APPS', 'CREATE_WORKSPACES', 'MANAGE_SCHEDULERS_PERMISSIONS', 'VIEW_CONFIGURATION', 'CREATE_APPS', 'MANAGE_ALL_MLFLOWS', 'CREATE_MLFLOWS', 'USE_ALL_MLFLOWS', 'MANAGE_ALL_WORKSPACES']
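The log output shows that, when no arguments are passed, the connection falls back to the environment variables `CELONIS_URL` and `CELONIS_API_TOKEN`. As a minimal sketch, the following check reports which of them are missing before a connection is attempted. The helper name `missing_connection_settings` is an assumption made for this example and is not part of PyCelonis.

```python
import os
from typing import List, Mapping

# Variable names taken from the log output above.
REQUIRED_VARIABLES = ("CELONIS_URL", "CELONIS_API_TOKEN")


def missing_connection_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_connection_settings(os.environ)
if missing:
    print("Set these environment variables before connecting:", ", ".join(missing))
else:
    print("Connection settings found.")
```

Running such a check first can make a misconfigured environment easier to diagnose than a failed connection attempt.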
2. Create a space¶
When we work on a new PyCelonis project and don't have any Celonis Studio assets inside our EMS, the first step will be to create a new space. Spaces are the main structural elements of the Studio service and are used to organize Studio assets. It is required to have a space in place before other Studio assets can be created.
space = celonis.studio.create_space("PyCelonis Tutorial Space")
space
[2023-03-24 15:54:35,062] INFO: Successfully created space with id '9cec5fe5-0280-4d87-872c-2b2cfeb7456c'
Space(id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c', name='PyCelonis Tutorial Space')
3. Add package to the space¶
Packages are used to bundle various Celonis assets, such as analyses, knowledge models, or action flows, into apps which can then be published to end-users via Celonis Apps. It is required to have a package in place before other Celonis assets can be created. A new package can be created by calling the create_package() method on the parent space object. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
name | str | Displayable name of the package | Required |
key | str | Unique key which can be used to identify the package (Is optional and defaults to name when left out) | None |
package = space.create_package(name="PyCelonis Tutorial Package", key="pycelonis_tutorial_package")
package
[2023-03-24 15:54:35,078] INFO: Successfully created package with id '51e2355c-2592-47df-ad40-b0a5c9ab33bf'
Package(id='51e2355c-2592-47df-ad40-b0a5c9ab33bf', key='pycelonis_tutorial_package', name='PyCelonis Tutorial Package', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c')
We can verify that the package exists inside our space by calling the get_packages() method:
space.get_packages()
[ Package(id='51e2355c-2592-47df-ad40-b0a5c9ab33bf', key='pycelonis_tutorial_package', name='PyCelonis Tutorial Package', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
4. Add variables to the package¶
Another important step when working with Celonis Studio is defining package variables. These variables store information at the package level and can be referenced by various assets inside the package. It is best practice to use package variables instead of hardcoding the information into the assets, since this gives us a single source of truth and lets us change information across multiple assets in one place.
A new package variable can be created by calling the create_variable() method on a package. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
key | str | Unique key to identify the variable | Required |
value | str | Value that should be written into the variable (Content depends on the type of created variable) | Required |
type_ | str | Type of created variable (Supported values: DATA_MODEL, PLAIN_TEXT) | Required |
4.1 Data Model Variables¶
Data model variables are used to reference a specific data model from various Celonis assets. When creating a new data model variable, type_ must be set to DATA_MODEL and value must be set to the data model ID.
In order to get the data model ID, we first navigate to the data model we want to use:
data_pool = celonis.data_integration.get_data_pools().find("PyCelonis Tutorial Data Pool")
data_model = data_pool.get_data_models().find("PyCelonis Tutorial Data Model")
data_model
DataModel(id='0caea823-104c-4555-9b58-678a727c62b2', name='PyCelonis Tutorial Data Model', pool_id='6c178afe-21e2-4f77-b862-e37653ae0b2e')
Now, we can create the data model variable and link the data model to it:
data_model_variable = package.create_variable(key="pycelonis_tutorial_data_model",
value=data_model.id,
type_="DATA_MODEL")
data_model_variable
[2023-03-24 15:54:35,154] INFO: Successfully created variable with key 'pycelonis_tutorial_data_model'
Variable(key='pycelonis_tutorial_data_model', type_='DATA_MODEL', value='0caea823-104c-4555-9b58-678a727c62b2')
4.2 Plain Text Variables¶
Plain text variables store simple text values, which can be used by various Celonis assets. When creating a new plain text variable, type_ must be set to PLAIN_TEXT and value can be an arbitrary string:
text_variable = package.create_variable(key="pycelonis_tutorial_text",
value="PyCelonis Tutorial Text",
type_="PLAIN_TEXT")
text_variable
[2023-03-24 15:54:35,167] INFO: Successfully created variable with key 'pycelonis_tutorial_text'
Variable(key='pycelonis_tutorial_text', type_='PLAIN_TEXT', value='PyCelonis Tutorial Text')
4.3 Accessing package variables¶
To show all available variables inside a package, we can call the get_variables() method:
package.get_variables()
[ Variable(key='pycelonis_tutorial_data_model', type_='DATA_MODEL', value='0caea823-104c-4555-9b58-678a727c62b2'), Variable(key='pycelonis_tutorial_text', type_='PLAIN_TEXT', value='PyCelonis Tutorial Text') ]
To access a specific variable, we can call the get_variable() method and pass the variable key as an input argument:
variable = package.get_variable(key="pycelonis_tutorial_text")
variable
Variable(key='pycelonis_tutorial_text', type_='PLAIN_TEXT', value='PyCelonis Tutorial Text')
Note:
Unlike other assets in Celonis Studio, variables do not have an ID. They only have a key, by which they can be accessed.
5. Add folder to the package¶
A new folder can be created with the create_folder() method inside a package. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
name | str | Displayable name of the folder | Required |
key | str | Unique key to identify the folder (Is optional and defaults to name when left out) | None |
folder = package.create_folder(name="PyCelonis Tutorial Folder", key="pycelonis_tutorial_folder")
folder
[2023-03-24 15:54:35,210] INFO: Successfully created folder with id '84a3edfb-a768-439f-8509-0823745bc9b8'
Folder(id='84a3edfb-a768-439f-8509-0823745bc9b8', key='pycelonis_tutorial_folder', name='PyCelonis Tutorial Folder', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c')
We can display all folders inside our package by calling the get_folders() method:
package.get_folders()
[ Folder(id='84a3edfb-a768-439f-8509-0823745bc9b8', key='pycelonis_tutorial_folder', name='PyCelonis Tutorial Folder', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
6. Add Analysis to the package¶
A new analysis can be created by calling the create_analysis() method inside a package. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
name | str | Displayable name of the analysis | Required |
key | str | Unique key to identify the analysis (Is optional and defaults to name when left out) | None |
data_model_id | str | ID of the data model to which the analysis is linked. A best practice is to use a data model variable instead of hard-coding the ID into the analysis. A data model variable can be referenced in the format: ${{key}} | None |
analysis = package.create_analysis(name="PyCelonis Tutorial Analysis",
key="pycelonis_tutorial_analysis",
data_model_id="${{pycelonis_tutorial_data_model}}")
analysis
[2023-03-24 15:54:35,246] INFO: Successfully created analysis with id 'd5c1ecdf-196f-4d7f-9014-a92c33e27dee'
Analysis(id='d5c1ecdf-196f-4d7f-9014-a92c33e27dee', key='pycelonis_tutorial_analysis', name='PyCelonis Tutorial Analysis', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c')
We can display all analyses inside our package by calling the get_analyses() method:
package.get_analyses()
[ Analysis(id='d5c1ecdf-196f-4d7f-9014-a92c33e27dee', key='pycelonis_tutorial_analysis', name='PyCelonis Tutorial Analysis', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
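To make the `${{key}}` reference format more concrete, here is a small sketch of how such placeholders could be substituted with variable values. This only illustrates the idea; the resolution itself happens inside Celonis, and the function `resolve_placeholders` is a hypothetical helper written for this example.

```python
import re
from typing import Dict

# Matches references written as ${{key}}, the format described above.
PLACEHOLDER = re.compile(r"\$\{\{(\w+)\}\}")


def resolve_placeholders(text: str, variables: Dict[str, str]) -> str:
    """Replace every ${{key}} in text with the matching variable value.

    Unknown keys raise KeyError so that typos are caught early.
    """
    return PLACEHOLDER.sub(lambda match: variables[match.group(1)], text)


# Key and value copied from the data model variable created earlier.
example_variables = {
    "pycelonis_tutorial_data_model": "0caea823-104c-4555-9b58-678a727c62b2",
}
print(resolve_placeholders("${{pycelonis_tutorial_data_model}}", example_variables))
```

Because assets store only the reference, changing the variable's value repoints every asset that uses it.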
7. Add knowledge model to the package¶
A new knowledge model can be created by calling the create_knowledge_model() method inside a package and passing the content of the knowledge model as an input argument. The content of the knowledge model is defined as a dictionary. PyCelonis then takes this dictionary and creates a YAML file out of it, which is used by the EMS to create the knowledge model.
Here, we simply create an empty knowledge model with some metadata that references our data model variable. However, it is also possible to define additional content that should be part of the knowledge model, such as records, KPIs, and filters.
content = {
"kind" : "BASE",
"metadata" : {"key":"pycelonis_tutorial_km", "displayName":"PyCelonis Tutorial Knowledge Model"},
"dataModelId" : "${{pycelonis_tutorial_data_model}}"
}
knowledge_model = package.create_knowledge_model(content)
knowledge_model
[2023-03-24 15:54:35,286] INFO: Successfully created knowledge model with id 'f303217b-d45d-49d7-a9cc-02aab08bea0b'
KnowledgeModel(id='f303217b-d45d-49d7-a9cc-02aab08bea0b', key='pycelonis_tutorial_km', name='PyCelonis Tutorial Knowledge Model', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c')
We can display all knowledge models inside our package by calling the get_knowledge_models() method:
package.get_knowledge_models()
[ KnowledgeModel(id='f303217b-d45d-49d7-a9cc-02aab08bea0b', key='pycelonis_tutorial_km', name='PyCelonis Tutorial Knowledge Model', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
We can access the current YAML configuration of a knowledge model via its serialized_content property:
print(knowledge_model.serialized_content)
kind: BASE
metadata:
  key: pycelonis_tutorial_km
  displayName: PyCelonis Tutorial Knowledge Model
dataModelId: ${{pycelonis_tutorial_data_model}}
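To illustrate how the content dictionary relates to the YAML shown above, the following sketch renders a nested dictionary as YAML-style lines. It is a deliberately simplified, hand-written renderer for this one example and is not how PyCelonis serializes knowledge models. The function name `to_simple_yaml` is an assumption.

```python
from typing import Any, Dict, List


def to_simple_yaml(data: Dict[str, Any], indent: int = 0) -> List[str]:
    """Render a nested dict as YAML-style lines.

    Handles only nested dicts and scalar values, which is enough for the
    metadata-only knowledge model above. Values are emitted unquoted.
    """
    lines: List[str] = []
    prefix = " " * indent
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{prefix}{key}:")
            lines.extend(to_simple_yaml(value, indent + 2))
        else:
            lines.append(f"{prefix}{key}: {value}")
    return lines


# Same content as the dictionary passed to create_knowledge_model() above.
km_content = {
    "kind": "BASE",
    "metadata": {
        "key": "pycelonis_tutorial_km",
        "displayName": "PyCelonis Tutorial Knowledge Model",
    },
    "dataModelId": "${{pycelonis_tutorial_data_model}}",
}
print("\n".join(to_simple_yaml(km_content)))
```

For anything beyond a quick illustration, a full YAML library is the safer choice, since this sketch ignores lists, quoting, and escaping.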
8. Export data from knowledge model PQL¶
We can query data using any KPI or variables from the knowledge model with the resolve_query() method. To do so, we first need to create a PQL query.
The first option is to create a custom query.
from pycelonis.pql import PQL, PQLColumn
custom_query = PQL() + PQLColumn(name="KPI", query='KPI("COUNT_TABLE__ACTIVITIES")')
The second option is to get the query from a knowledge model component. Right now, this functionality is supported for knowledge model filters, record attributes and identifiers.
To get the query of a knowledge model filter, we first have to find the filter in the knowledge model.
km_filter = knowledge_model.get_content().filters.find_by_id("test_filter")
Then, we use the get_filter() method, which returns the PQLFilter of the knowledge model filter.
filter_query = km_filter.get_filter()
filter_query
PQLFilter(query='FILTER "ACTIVITIES"."ACTIVITY_EN" IS NOT NULL')
Similarly, we get the query of a knowledge model record attribute. First, we locate the record.
km_record = knowledge_model.get_content().records.find_by_id('ACTIVITIES')
km_attribute = km_record.attributes.find_by_id('ACTIVITY_EN')
The get_column() method will then return the PQLColumn of the attribute.
attribute_query = km_attribute.get_column()
attribute_query
PQLColumn(name='ACTIVITY_EN', query='"ACTIVITIES"."ACTIVITY_EN"')
We can then combine our custom query with the filter and attribute from the knowledge model.
full_query = custom_query + attribute_query + filter_query
full_query
PQL(columns=[ PQLColumn(name='KPI', query='KPI("COUNT_TABLE__ACTIVITIES")'), PQLColumn(name='ACTIVITY_EN', query='"ACTIVITIES"."ACTIVITY_EN"') ], filters=[ PQLFilter(query='FILTER "ACTIVITIES"."ACTIVITY_EN" IS NOT NULL') ], order_by_columns=[], distinct=False, limit=None, offset=None)
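The output above shows that adding columns and filters with `+` accumulates them in separate lists of the query. The toy model below mimics that behavior to make the mechanics visible. `ToyColumn`, `ToyFilter`, and `ToyQuery` are hypothetical classes written for this sketch; they are not the PyCelonis `PQL` classes and make no claim about how those are implemented.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class ToyColumn:
    """Hypothetical stand-in for a named query column."""
    name: str
    query: str


@dataclass(frozen=True)
class ToyFilter:
    """Hypothetical stand-in for a query filter."""
    query: str


@dataclass
class ToyQuery:
    """Hypothetical stand-in for a query holding columns and filters."""
    columns: List[ToyColumn] = field(default_factory=list)
    filters: List[ToyFilter] = field(default_factory=list)

    def __add__(self, other: Union[ToyColumn, ToyFilter]) -> "ToyQuery":
        """Return a new query with the column or filter appended."""
        if isinstance(other, ToyColumn):
            return ToyQuery(self.columns + [other], list(self.filters))
        if isinstance(other, ToyFilter):
            return ToyQuery(list(self.columns), self.filters + [other])
        return NotImplemented


toy_query = (
    ToyQuery()
    + ToyColumn(name="KPI", query='KPI("COUNT_TABLE__ACTIVITIES")')
    + ToyColumn(name="ACTIVITY_EN", query='"ACTIVITIES"."ACTIVITY_EN"')
    + ToyFilter(query='FILTER "ACTIVITIES"."ACTIVITY_EN" IS NOT NULL')
)
print([column.name for column in toy_query.columns], len(toy_query.filters))
```

Returning a new object from `__add__` keeps the original query unchanged, which makes it safe to reuse a base query in several combinations.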
Lastly, we resolve all knowledge model variables and KPIs of this query.
data_query, query_environment = knowledge_model.resolve_query(full_query)
The returned DataQuery and QueryEnvironment can then be used to query data via the data model function export_data_frame(), which returns a dataframe with the results.
df = data_model.export_data_frame(data_query, query_environment)
df
[2023-03-24 15:54:35,552] INFO: Successfully created data export with id '2f776797-e1b5-48f9-a9a0-1c12421983e5'
[2023-03-24 15:54:35,553] INFO: Wait for execution of data export with id '2f776797-e1b5-48f9-a9a0-1c12421983e5'
[2023-03-24 15:54:35,578] INFO: Export result chunks for data export with id '2f776797-e1b5-48f9-a9a0-1c12421983e5'
| | KPI | ACTIVITY_EN |
|---|---|---|
| 0 | 10 | Book Invoice |
| 1 | 10 | Create Purchase Order Item |
| 2 | 10 | Create Purchase Requisition Item |
| 3 | 10 | Print and Send Purchase Order |
| 4 | 10 | Receive Goods |
| 5 | 10 | Scan Invoice |
9. Publish a package¶
Once we are finished with building assets, we can publish the package to make them available to end-users via Celonis Apps. A package can be published by calling the publish() method inside a package. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
version | str | Version of the published app (Format: Major.Minor.Patch, Example: 1.0.4) | None |
node_ids_to_exclude | List[str] | List of asset IDs, which should not be published in Celonis Apps | None |
Let's publish our newly created package as version 1.0.0 and exclude the folder, since it is currently empty:
package.publish(version="1.0.0", node_ids_to_exclude=[folder.id])
[2023-03-24 15:54:35,634] INFO: Successfully published package with id '51e2355c-2592-47df-ad40-b0a5c9ab33bf' with version '1.0.0'
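Since the version is passed as a plain string, it can be useful to check its shape before publishing. The sketch below assumes that a valid version consists of exactly three dot-separated non-negative integers, as the Major.Minor.Patch format suggests. Whether Celonis accepts other forms is not covered here, and `is_valid_version` is a hypothetical helper, not a PyCelonis function.

```python
import re

# Assumption: three dot-separated non-negative integers, e.g. 1.0.4.
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_version(version: str) -> bool:
    """Return True if the whole string has the form MAJOR.MINOR.PATCH."""
    return VERSION_PATTERN.fullmatch(version) is not None


for candidate in ("1.0.0", "1.0", "v1.0.0"):
    print(candidate, is_valid_version(candidate))
```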
10. Access assets in Celonis Apps¶
Once assets in a package have been published, they can be accessed via Celonis Apps by end-users. PyCelonis also supports accessing published assets in Celonis Apps. However, since Celonis Apps is read-only, PyCelonis only supports get_<asset>() methods but no create_<asset>() methods.
Let's start by accessing the space in which our assets were created. We do this by calling get_space() on celonis.apps and passing the space ID as an input parameter.
Note:
All IDs for assets inside Celonis Apps are exactly the same as in Celonis Studio.
app_space = celonis.apps.get_space(space.id)
app_space
PublishedSpace(id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c', name='PyCelonis Tutorial Space')
Assets inside Celonis Apps can be recognized by their class names, which follow the pattern Published<asset>. We can now access the package in which our assets are located by calling the get_package() method on our Celonis Apps space and passing the package ID as an argument:
app_package = app_space.get_package(package.id)
app_package
PublishedPackage(id='51e2355c-2592-47df-ad40-b0a5c9ab33bf', key='pycelonis_tutorial_package', name='PyCelonis Tutorial Package', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c')
From this package, we can access published Celonis assets via the get_<asset>s() method for all assets of a certain type or the get_<asset>("id") method for a specific asset:
app_package.get_analyses()
[2023-03-24 15:54:35,672] INFO: `get_analyses` returns analyses without content. To fetch the content for a specific analysis call`analysis.sync()` or use `package.get_analysis(analysis_id)`
[ PublishedAnalysis(id='d5c1ecdf-196f-4d7f-9014-a92c33e27dee', key='pycelonis_tutorial_analysis', name='PyCelonis Tutorial Analysis', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
published_analysis = app_package.get_analysis(analysis.id)
10.1 Export data from analysis component PQL¶
Similar to knowledge models, we can query data using any KPI or variables from the analysis with the resolve_query() method. First, we need to create the PQL query.
custom_query = PQL() + PQLColumn(name="_CASE_KEY", query='"ACTIVITIES"."_CASE_KEY"')
We can also get the query of an analysis component. To do so, we first have to select the analysis sheet that contains the component.
published_sheet = published_analysis.get_content().draft.document.sheets[0]
Then, we select the analysis component itself. In this example, we look up an OLAP table on the analysis sheet.
olap_table = published_sheet.components.find("#{OLAP Table}", search_attribute="title")
The get_query() method returns the query of the OLAP table.
olap_query = olap_table.get_query()
olap_query
PQL(columns=[ PQLColumn(name='ACTIVITY_EN', query='"ACTIVITIES"."ACTIVITY_EN"'), PQLColumn(name='Count Table', query='COUNT_TABLE("ACTIVITIES")') ], filters=[ PQLFilter(query='FILTER "ACTIVITIES"."ACTIVITY_EN" = \'ACTIVITY1\';\nFILTER "ACTIVITIES"."ACTIVITY_EN" != \'ACTIVITY2\';') ], order_by_columns=[ OrderByColumn(query='"ACTIVITIES"."ACTIVITY_EN"', ascending=True) ], distinct=False, limit=None, offset=None)
Optionally, we can combine the custom query with any element of the component query. Here, we combine the custom query with the query of the OLAP table column.
full_query = custom_query + olap_query.columns[0]
full_query
PQL(columns=[ PQLColumn(name='_CASE_KEY', query='"ACTIVITIES"."_CASE_KEY"'), PQLColumn(name='ACTIVITY_EN', query='"ACTIVITIES"."ACTIVITY_EN"') ], filters=[], order_by_columns=[], distinct=False, limit=None, offset=None)
We resolve all analysis variables and KPIs of this query.
data_query, query_environment = published_analysis.resolve_query(full_query)
The returned DataQuery and QueryEnvironment can then be used to query data via the data model function export_data_frame(), which will return a dataframe with the results.
df = data_model.export_data_frame(data_query, query_environment)
df.head(n=3)
[2023-03-24 15:54:35,823] INFO: Successfully created data export with id 'd63dc6ce-2499-4562-957f-affbbab9dd68'
[2023-03-24 15:54:35,824] INFO: Wait for execution of data export with id 'd63dc6ce-2499-4562-957f-affbbab9dd68'
[2023-03-24 15:54:35,844] INFO: Export result chunks for data export with id 'd63dc6ce-2499-4562-957f-affbbab9dd68'
| | _CASE_KEY | ACTIVITY_EN |
|---|---|---|
| 0 | 800000000006800001 | Create Purchase Requisition Item |
| 1 | 800000000006800001 | Create Purchase Order Item |
| 2 | 800000000006800001 | Print and Send Purchase Order |
11. Updating packages and versioning¶
When making changes to assets inside Celonis Studio, we have to publish the package again. Otherwise, the changes will not be available in Celonis Apps. To do this, we call the publish() method each time we want to publish an update and pass a new version number in the format MAJOR.MINOR.PATCH as an input argument.
Note:
The version number can also be left out, in which case the version will simply increment by one PATCH number (e.g. 1.0.4 -> 1.0.5).
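The PATCH increment described in the note can be reproduced locally, for example to predict the next version before publishing. This is a minimal sketch under the assumption that the version is always three dot-separated integers. `parse_version` and `bump_patch` are hypothetical helpers and not part of PyCelonis.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a MAJOR.MINOR.PATCH string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_patch(version: str) -> str:
    """Return the version with its PATCH component increased by one."""
    major, minor, patch = parse_version(version)
    return f"{major}.{minor}.{patch + 1}"


print(bump_patch("1.0.4"))
```

Parsing the parts as integers matters: comparing or incrementing the raw strings would treat 1.0.10 as smaller than 1.0.9.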
Let's update our app by deleting the analysis from our package:
analysis.delete()
[2023-03-24 15:54:35,891] INFO: Successfully deleted content node with id 'd5c1ecdf-196f-4d7f-9014-a92c33e27dee'
By calling get_analyses() from our package, we can see that the analysis has been successfully removed:
package.get_analyses()
[]
However, when calling the same method from the app-package, the analysis still exists. This is because we have not published our changes yet:
app_package.get_analyses()
[2023-03-24 15:54:35,921] INFO: `get_analyses` returns analyses without content. To fetch the content for a specific analysis call`analysis.sync()` or use `package.get_analysis(analysis_id)`
[ PublishedAnalysis(id='d5c1ecdf-196f-4d7f-9014-a92c33e27dee', key='pycelonis_tutorial_analysis', name='PyCelonis Tutorial Analysis', root_node_key='pycelonis_tutorial_package', space_id='9cec5fe5-0280-4d87-872c-2b2cfeb7456c') ]
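The difference between the Studio state and the published state can also be computed explicitly by comparing asset IDs, which are identical in Studio and Apps. The sketch below works on plain lists of IDs so that it runs without a connection. With live objects, the lists could be built from the `id` attribute of the objects returned by `get_analyses()`. The function `unpublished_changes` is a hypothetical helper.

```python
from typing import Iterable, Set, Tuple


def unpublished_changes(
    studio_ids: Iterable[str], app_ids: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (IDs only in Studio, IDs only in Apps)."""
    studio, apps = set(studio_ids), set(app_ids)
    return studio - apps, apps - studio


# State after deleting the analysis in Studio but before publishing again.
studio_analysis_ids = []
app_analysis_ids = ["d5c1ecdf-196f-4d7f-9014-a92c33e27dee"]

only_in_studio, only_in_apps = unpublished_changes(studio_analysis_ids, app_analysis_ids)
print("Not yet published:", only_in_studio)
print("Still published but deleted in Studio:", only_in_apps)
```

Two empty sets indicate that Studio and Apps contain the same analyses, although content changes inside an asset would not be detected this way.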
To do this, we simply call the publish() method another time and specify a new version number for our package:
package.publish(version="2.0.0")
[2023-03-24 15:54:35,953] INFO: Successfully published package with id '51e2355c-2592-47df-ad40-b0a5c9ab33bf' with version '2.0.0'
By calling get_analyses() again from the app-package, we can see that the changes have been published in Celonis Apps:
app_package.get_analyses()
[2023-03-24 15:54:35,965] INFO: `get_analyses` returns analyses without content. To fetch the content for a specific analysis call`analysis.sync()` or use `package.get_analysis(analysis_id)`
[]
Conclusion¶
Congratulations! You have learned how to perform basic interactions with Celonis Studio, such as creating new assets, publishing a package, and pushing updates into Celonis Apps. This is the end of the PyCelonis tutorials. You should now be able to write Python scripts that leverage various assets of the EMS, which can be used for performing analyses and running machine learning algorithms.