Pulling Data from a Knowledge Model¶
In this tutorial, you will learn how to extract data using a knowledge model. More specifically, you will learn:
- How to connect to the EMS
- How to get and interact with a knowledge model and its objects
- How to create a PQL query using knowledge model attributes
- How to extract data from the data model
Prerequisites¶
To follow this tutorial, you should have PyCelonis installed and should know how to perform basic interactions with PyCelonis objects. If you don't know how to do this, please complete the Celonis Basics tutorial first. Further, it would be helpful to already have the previously mentioned assets inside your EMS. Please refer to the Studio - Introduction and Data Integration - Data Export tutorials for an overview of working with each asset type.
1. Import PyCelonis and connect to Celonis API¶
from pycelonis.pql import PQL, PQLColumn, PQLFilter
from pycelonis import get_celonis
celonis = get_celonis()
[2023-02-07 22:14:16,478] INFO: Initial connect successful! PyCelonis Version: 2.0.1
2. Get the knowledge model, record, and filter objects¶
space = celonis.studio.get_spaces().find('Space_Name_Here')
package = space.get_packages().find('Package_Name_Here')
knowledge_model = package.get_knowledge_model('Knowledge_Model_Name_Here')
content = knowledge_model.get_content()
data_model_id = content.data_model_id
record_name = 'Record_Name_Here'
records = content.records
record = records.find(search_term= record_name, search_attribute='display_name')
filter_name = 'Filter_Name_Here'
filters = content.filters
filter = filters.find(search_term=filter_name, search_attribute='display_name')
3. Create a PQL query using knowledge model attributes and filter¶
query = PQL()
attribute_names = ['Attribute Name 1 Here', 'Attribute Name 2 Here', 'Attribute Name 3', 'Attribute Name 4']
attributes = record.attributes
for name in attribute_names:
attribute = attributes.find(name,search_attribute='display_name')
query += PQLColumn(name=attribute.display_name, query=attribute.pql)
query += PQLFilter(name=filter.display_name, query=filter.pql)
query
PQL(columns=[ PQLColumn(name='Client ID', query='"BSEG"."MANDT"'), PQLColumn(name='Doc. Number', query='"BSEG"."BELNR"'), PQLColumn(name='Company Code', query='"BSEG"."BUKRS"'), PQLColumn(name='Vendor Number', query='"LFA1"."LIFNR"') ], filters=[ PQLFilter(query='FILTER LEN("BKPF"."XBLNR") > 4;') ], order_by_columns=[], distinct=False, limit=None, offset=None)
4. Get the data pool and data model objects¶
Option #1: Use the data pool name¶
data_pool = celonis.data_integration.get_data_pools().find('Data_Pool_Name_Here')
data_model = data_pool.get_data_model(data_model_id)
Option #2: Use the data pool id (for Optimization)¶
# Use the data pool id (found in the data pool URL) to optimize your script
data_pool = celonis.data_integration.get_data_pool('Data_Pool_ID_Here')
data_model = data_pool.get_data_model(data_model_id)
5. Query the data model¶
import pycelonis.pql as pql
from pycelonis.pql.saola_connector import KnowledgeModelSaolaConnector
df = pql.DataFrame.from_pql(query, saola_connector=KnowledgeModelSaolaConnector(data_model, knowledge_model))
df.head()
[2023-02-07 22:14:52,256] INFO: Successfully created data export with id 'bc4d526c-b0f0-4eac-89bd-3b26db5157d1' [2023-02-07 22:14:52,258] INFO: Wait for execution of data export with id 'bc4d526c-b0f0-4eac-89bd-3b26db5157d1'
0it [00:00, ?it/s]
[2023-02-07 22:14:52,323] INFO: Export result chunks for data export with id 'bc4d526c-b0f0-4eac-89bd-3b26db5157d1'
Client ID | Doc. Number | Company Code | Vendor Number | |
---|---|---|---|---|
0 | 800_ | 0000109290_ | 3000_ | 0000001005_ |
1 | 800_ | 0000111453_ | 3000_ | COMUNASA_ |
2 | 800_ | 0000111529_ | 3000_ | 0000100058_ |
3 | 800_ | 0000111588_ | 3000_ | 0000200026_ |
4 | 800_ | 0000111600_ | 3000_ | 0000200026_ |
... | ... | ... | ... | ... |
3767 | 800_ | 0000075924_ | 3000_ | 0000001075_ |
3768 | 800_ | 0000075930_ | 3000_ | 0000003511_ |
3769 | 800_ | 0000076079_ | 3000_ | 0000001200_ |
3770 | 800_ | 0000076297_ | 3000_ | 0000001000_ |
3771 | 800_ | 0000000797_ | 3000_ | 0000001000_ |
3772 rows × 4 columns
Conclusion¶
Congratulations! You have learned how to connect to the EMS, get and interact with a knowledge model and its objects, create a PQL query using knowledge model attributes, and extract the data.