Data Upload¶
In this tutorial, you will learn how to upload existing data from your local Python project into the EMS. More specifically, you will learn:
- How to create a new table in the EMS with your data
- How to append and upsert your data to existing tables in the EMS
- How to add a table from your data pool to your data model
- How to reload a data model
Prerequisites¶
To follow this tutorial, you should have already created a data pool and a data model inside your pool. If you haven't done this yet, please complete the Data Integration - Introduction tutorial first.
Tutorial¶
1. Import PyCelonis and connect to Celonis API¶
from pycelonis import get_celonis
celonis = get_celonis(permissions=False)
[2024-08-09 08:57:41,707] INFO: No `base_url` given. Using environment variable 'CELONIS_URL'
[2024-08-09 08:57:41,708] INFO: No `api_token` given. Using environment variable 'CELONIS_API_TOKEN'
[2024-08-09 08:57:41,745] WARNING: KeyType is not set. Defaulted to 'APP_KEY'.
[2024-08-09 08:57:41,747] INFO: Initial connect successful! PyCelonis Version: 2.10.1
No access key provided. Please set the CELONIS_MLOPS_ACCESS_KEY environment variable.
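If you don't want to rely on environment variables, the connection details can also be passed explicitly. This is a minimal sketch with placeholder values (the URL and token are not real); key_type corresponds to the KeyType warning above:
celonis = get_celonis(
    base_url="https://<your-team>.celonis.cloud",  # placeholder team URL
    api_token="<your-api-token>",  # placeholder API token
    key_type="APP_KEY",  # or "USER_KEY", depending on how the token was created
    permissions=False,
)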
2. Select data pool to upload data into¶
Before we can upload data into the EMS, we first have to select the data pool into which the data should be uploaded. Here, we use the data pool that we created in the Data Integration - Introduction tutorial:
data_pool = celonis.data_integration.get_data_pools().find("PyCelonis Tutorial Data Pool")
data_pool
DataPool(id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', name='PyCelonis Tutorial Data Pool')
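As a side note, if you already know the pool's ID, you could look it up directly instead of searching by name. This is a sketch and assumes the get_data_pool() lookup offered alongside get_data_pools(); the ID is just the one printed above:
data_pool = celonis.data_integration.get_data_pool("1bcc67dc-935a-4b7f-940a-9c0b81a7135a")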
With the get_tables() method, we can verify that we currently don't have any tables in the data pool:
data_pool.get_tables()
[]
3. Upload data into the EMS¶
Data can be uploaded into Celonis in two formats: as Pandas dataframes or as Parquet files. In this tutorial, we will focus on pushing data as Pandas dataframes. If you want to push data as Parquet files, please refer to the Data Push & Export Advanced tutorial.
In this tutorial, we will use a sample dataset for the SAP Purchase-to-Pay (P2P) process, which depicts the process of procuring materials from vendors. Below is an overview of the most important tables for this process:
Table | Description |
---|---|
_CEL_P2P_ACTIVITIES_EN | Activity Table |
EKPO | Purchasing Document Item (i.e. Case Table) |
EKKO | Purchasing Document Header |
LFA1 | Vendor Master Data |
Let's start by importing the tables of this dataset as Pandas dataframes:
import pandas as pd
activity_df = pd.read_parquet("../../../assets/_CEL_P2P_ACTIVITIES_EN.parquet", engine="pyarrow")
print(activity_df.shape)
activity_df.head()
(60, 13)
 | _CASE_KEY | ACTIVITY_EN | ACTIVITY_DE | EVENTTIME | _SORTING | USER_TYPE | CHANGED_TABLE | CHANGED_FIELD | CHANGED_FROM | CHANGED_TO | CHANGED_FROM_FLOAT | CHANGED_TO_FLOAT | CHANGE_NUMBER |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 800000000006800001 | Create Purchase Requisition Item | Lege BANF Position an | 2008-12-31 07:44:05 | 0.0 | B | None | None | None | None | NaN | NaN | None |
1 | 800000000006800001 | Create Purchase Order Item | Lege Bestellposition an | 2009-01-02 07:44:05 | 10.0 | B | None | None | None | None | NaN | NaN | None |
2 | 800000000006800001 | Print and Send Purchase Order | Sende Bestellung | 2009-01-05 07:44:05 | NaN | B | None | None | None | None | NaN | NaN | None |
3 | 800000000006800001 | Receive Goods | Wareneingang | 2009-01-12 07:44:05 | 30.0 | A | None | None | None | None | NaN | NaN | None |
4 | 800000000006800001 | Scan Invoice | Scanne Rechnung | 2009-01-20 07:44:05 | NaN | A | None | None | None | None | NaN | NaN | None |
item_df = pd.read_parquet("../../../assets/EKPO.parquet", engine="pyarrow")
print(item_df.shape)
item_df.head()
(10, 34)
 | _CASE_KEY | MANDT | LOEKZ | STATU | AEDAT | MATNR | BUKRS | WERKS | LGORT | MATKL | ... | AUDAT | Material Text (MAKT_MAKTX) | Company Code Text (EKPO_BUKRS) | Plant Text (EKPO_WERKS) | Stor Location Text (EKPO_LGORT) | EBELN | EBELP | Item Category Text(EKPO_PSTYP) | Material Group Text (MATKL_TEXT) | Net Value(NETWR_EUR) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 800000000006800001 | 800 | None | None | 2009-01-02 | WL-1000 | 3000 | 3200 | 0001 | 001 | ... | NaT | Shafting assembly | IDES US INC | Atlanta | Warehouse 0001 | 0000000068 | 00001 | Standard | Metal processing | 3.48000 |
1 | 800000000006800002 | 800 | None | None | 2009-01-02 | None | 3000 | 3200 | 0001 | 00202 | ... | NaT | | IDES US INC | Atlanta | Warehouse 0001 | 0000000068 | 00002 | Standard | Motherboards | 3.46260 |
2 | 800000000006800003 | 800 | None | None | 2009-01-02 | DG-1000 | 3000 | 3200 | 0001 | 001 | ... | NaT | Rubber Seal | IDES US INC | Atlanta | Warehouse 0001 | 0000000068 | 00003 | Standard | Metal processing | 0.29000 |
3 | 800000000006800004 | 800 | None | None | 2009-01-02 | I-1100 | 3000 | 3200 | 0001 | 007 | ... | NaT | Pump Installation | IDES US INC | Atlanta | Warehouse 0001 | 0000000068 | 00004 | Standard | Services | 2.90000 |
4 | 800000000006800005 | 800 | None | None | 2009-01-02 | None | 3000 | 3200 | 0001 | 001 | ... | NaT | | IDES US INC | Atlanta | Warehouse 0001 | 0000000068 | 00005 | Standard | Metal processing | 0.27608 |
5 rows × 34 columns
header_df = pd.read_parquet("../../../assets/EKKO.parquet", engine="pyarrow")
print(header_df.shape)
header_df.head()
(2, 33)
 | MANDT | BUKRS | BSTYP | BSART | LOEKZ | STATU | AEDAT | ERNAM | LIFNR | ZTERM | ... | FRGZU | Document Category Text (EKKO_BSTYP) | RFQ status Text(EKKO_STATU) | Document Type Text (EKKO_BSART) | Purchasing Organization Text (EKKO_EKORG) | Company Name (EKKO_BUTXT) | Country Key (EKKO_LAND1) | Currency Key (EKKO_WAERS) | EBELN | Company Code Text (EKKO_BUKRS) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
14312 | 800 | 3000 | F | EC | None | I | 2009-01-02 | PURCHMANAGER | 0000003701 | NT30 | ... | None | Purchase order | None | Electronic commerce | IDES Deutschland | IDES US INC | US | USD | 0000000068 | IDES US INC |
14338 | 800 | 3000 | F | EC | None | I | 2009-02-03 | MILLERJ | 0000003701 | NT30 | ... | None | Purchase order | None | Electronic commerce | IDES Deutschland | IDES US INC | US | USD | 0000000069 | IDES US INC |
2 rows × 33 columns
master_df = pd.read_parquet("../../../assets/LFA1.parquet", engine="pyarrow")
print(master_df.shape)
master_df.head()
(1, 22)
 | MANDT | LIFNR | LAND1 | NAME1 | ORT01 | PFACH | PSTL2 | PSTLZ | REGIO | SORTL | ... | BEGRU | ERDAT | ERNAM | SPRAS | TELBX | TELF1 | TELF2 | TELFX | TELTX | TELX1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
263 | 800 | 0000003701 | US | eSupplier, Inc | WILMINGTON | None | None | 19801 | DE | EBP | ... | None | 2001-12-07 | STANKOVICH | E | None | 302-656-0196 | None | 302-656-0001 | None | None |
1 rows × 22 columns
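Before pushing, it can be worth checking the dtypes pandas inferred, since these drive the column types created in the data pool (roughly: datetime columns become date/time columns, numeric columns stay numeric, and object columns become strings). This is a plain pandas check, nothing PyCelonis-specific:
activity_df.dtypes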
With the dataframes in place, we can upload them into the EMS either by creating new tables in the data pool or by appending/upserting the data into existing tables of the data pool.
3.1 Create new table in the EMS¶
New tables can be created in the EMS with the create_table() method. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | A pandas dataframe containing the data | Required |
table_name | str | Name that the table in the data pool should have | Required |
drop_if_exists | bool | Specifies how to handle situations when a table with the same name already exists in the data pool (True = replace existing table, False = raise error and keep existing table) | False |
Further, we can pass a column_config as an input argument, which specifies the names and column types of our table. Specifying a custom column_config is especially important for tables with longer text values, as by default, strings are cut off after 80 characters during the data upload. For a detailed guide on how to specify a custom column_config, refer to the Data Push & Export Advanced tutorial; a rough sketch of the idea is shown below.
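For illustration only, such a column_config could look roughly like the following sketch; it assumes the ColumnTransport and ColumnType helpers from pycelonis.ems and is not executed in this tutorial:
from pycelonis.ems import ColumnTransport, ColumnType  # assumed import path

column_config = [
    # field_length widens the default VARCHAR(80) limit for long text columns
    ColumnTransport(column_name="ACTIVITY_EN", column_type=ColumnType.STRING, field_length=200),
    ColumnTransport(column_name="ACTIVITY_DE", column_type=ColumnType.STRING, field_length=200),
]
# It would then be passed to create_table, e.g.:
# data_pool.create_table(df=activity_df, table_name="ACTIVITIES", column_config=column_config)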
data_pool.create_table(df=activity_df, table_name="ACTIVITIES", drop_if_exists=False)
data_pool.create_table(df=item_df, table_name="EKPO", drop_if_exists=False)
data_pool.create_table(df=header_df, table_name="EKKO", drop_if_exists=False)
data_pool.create_table(df=master_df, table_name="LFA1", drop_if_exists=False)
[2024-08-09 08:57:41,897] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:41,901] INFO: Successfully created data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,902] INFO: Add data frame as file chunks to data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,912] INFO: Successfully upserted file chunk to data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,917] INFO: Successfully triggered execution for data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,918] INFO: Wait for execution of data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,942] INFO: Successfully created table 'ACTIVITIES' in data pool
[2024-08-09 08:57:41,945] INFO: Successfully deleted data push job with id '1bb8e47e-35aa-4ad4-866d-d115f1d092ed'
[2024-08-09 08:57:41,953] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:41,957] INFO: Successfully created data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:41,958] INFO: Add data frame as file chunks to data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:41,973] INFO: Successfully upserted file chunk to data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:41,981] INFO: Successfully triggered execution for data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:41,981] INFO: Wait for execution of data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:42,016] INFO: Successfully created table 'EKPO' in data pool
[2024-08-09 08:57:42,021] INFO: Successfully deleted data push job with id '7d8f774e-4270-4275-8df8-761cc0a78b62'
[2024-08-09 08:57:42,032] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:42,037] INFO: Successfully created data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,038] INFO: Add data frame as file chunks to data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,055] INFO: Successfully upserted file chunk to data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,066] INFO: Successfully triggered execution for data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,067] INFO: Wait for execution of data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,107] INFO: Successfully created table 'EKKO' in data pool
[2024-08-09 08:57:42,113] INFO: Successfully deleted data push job with id 'a6a69c38-63e5-4e74-8f3e-04a7afc895fe'
[2024-08-09 08:57:42,126] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:42,134] INFO: Successfully created data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
[2024-08-09 08:57:42,135] INFO: Add data frame as file chunks to data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
[2024-08-09 08:57:42,150] INFO: Successfully upserted file chunk to data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
[2024-08-09 08:57:42,164] INFO: Successfully triggered execution for data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
[2024-08-09 08:57:42,164] INFO: Wait for execution of data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
[2024-08-09 08:57:42,213] INFO: Successfully created table 'LFA1' in data pool
[2024-08-09 08:57:42,220] INFO: Successfully deleted data push job with id '3a521e18-2acf-44d4-b3d9-782f69213688'
DataPoolTable(name='LFA1', data_source_id=None, columns=[], schema_name='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a')
We can verify that the newly created tables exist in our data pool by calling the get_tables() method:
data_pool.get_tables()
[ DataPoolTable(name='ACTIVITIES', data_source_id=None, columns=[], schema_name='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataPoolTable(name='EKKO', data_source_id=None, columns=[], schema_name='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataPoolTable(name='EKPO', data_source_id=None, columns=[], schema_name='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataPoolTable(name='LFA1', data_source_id=None, columns=[], schema_name='1bcc67dc-935a-4b7f-940a-9c0b81a7135a', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a') ]
3.2 Append data to existing table in the EMS¶
We can also choose to append our data to an already existing table in our data pool with the append() method. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | A pandas dataframe containing the data | Required |
Important:
The column types and names of our dataframe must match those of the target table in the data pool; otherwise, the append operation will fail. One way to align a dataframe with the original upload is sketched below.
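One pandas-only way to guard against such mismatches is to reorder and cast a new batch against the dataframe that was used for the original upload. A sketch, where new_activity_df stands in for a hypothetical second batch of activities:
new_activity_df = activity_df.copy()  # hypothetical new batch with the same schema
new_activity_df = new_activity_df[activity_df.columns].astype(activity_df.dtypes.to_dict())  # same column order and dtypes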
Let's create a new activity table ACTIVITIES_APPEND in our data pool, to which we want to append a new dataframe:
data_pool_table = data_pool.create_table(df=activity_df, table_name="ACTIVITIES_APPEND", drop_if_exists=False)
[2024-08-09 08:57:42,258] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:42,268] INFO: Successfully created data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
[2024-08-09 08:57:42,268] INFO: Add data frame as file chunks to data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
[2024-08-09 08:57:42,281] INFO: Successfully upserted file chunk to data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
[2024-08-09 08:57:42,298] INFO: Successfully triggered execution for data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
[2024-08-09 08:57:42,298] INFO: Wait for execution of data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
[2024-08-09 08:57:42,373] INFO: Successfully created table 'ACTIVITIES_APPEND' in data pool
[2024-08-09 08:57:42,389] INFO: Successfully deleted data push job with id '57bef72d-2dc3-48b6-9c14-9bc53a7f521e'
We can now append another dataframe to the already existing table by calling the append() method:
data_pool_table.append(activity_df)
[2024-08-09 08:57:42,416] WARNING: No column configuration set. String columns are cropped to 80 characters if not configured
[2024-08-09 08:57:42,432] INFO: Successfully created data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,435] INFO: Add data frame as file chunks to data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,457] INFO: Successfully upserted file chunk to data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,493] INFO: Successfully triggered execution for data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,494] INFO: Wait for execution of data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,579] INFO: Successfully deleted data push job with id 'afafd476-9202-4122-b57f-79e9fcdddf95'
[2024-08-09 08:57:42,580] INFO: Successfully appended rows to table 'ACTIVITIES_APPEND' in data pool
3.3 Upsert data to existing table in the EMS¶
Lastly, we can choose to upsert our data into an already existing data pool table with the upsert() method. Upsert works similarly to the append operation (i.e. it adds rows from a dataframe to a table) but replaces rows if they already exist. For this, we have to specify in keys a list of column names according to which equality is checked. If two rows have the same values in all columns specified in keys, they are marked as duplicates and the existing row is replaced.
The upsert() method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | A pandas dataframe containing the data | Required |
keys | List[str] | List of column names according to which to check for equality | Required |
Let's create a new activity table ACTIVITIES_UPSERT in our data pool, to which we want to upsert a new dataframe:
data_pool_table = data_pool.create_table(df=activity_df, table_name="ACTIVITIES_UPSERT", drop_if_exists=False)
[2024-08-09 08:57:42,601] WARNING: STRING columns are by default stored as VARCHAR(80) and therefore cut after 80 characters. You can specify a custom field length for each column using the `column_config` parameter.
[2024-08-09 08:57:42,614] INFO: Successfully created data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
[2024-08-09 08:57:42,615] INFO: Add data frame as file chunks to data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
[2024-08-09 08:57:42,632] INFO: Successfully upserted file chunk to data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
[2024-08-09 08:57:42,654] INFO: Successfully triggered execution for data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
[2024-08-09 08:57:42,655] INFO: Wait for execution of data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
[2024-08-09 08:57:42,732] INFO: Successfully created table 'ACTIVITIES_UPSERT' in data pool
[2024-08-09 08:57:42,745] INFO: Successfully deleted data push job with id '947f4122-4d8e-46ba-a034-d4a9fc47f1f9'
We can now upsert another dataframe by calling the upsert() method. Here, we specify _CASE_KEY and ACTIVITY_EN as the columns according to which to check for equality:
data_pool_table.upsert(activity_df, keys=["_CASE_KEY", "ACTIVITY_EN"])
[2024-08-09 08:57:42,763] WARNING: No column configuration set. String columns are cropped to 80 characters if not configured
[2024-08-09 08:57:42,776] INFO: Successfully created data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,777] INFO: Add data frame as file chunks to data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,795] INFO: Successfully upserted file chunk to data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,820] INFO: Successfully triggered execution for data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,820] INFO: Wait for execution of data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,919] INFO: Successfully deleted data push job with id 'a0aee63e-1997-4116-b666-4780cf6d20e8'
[2024-08-09 08:57:42,920] INFO: Successfully upserted rows to table 'ACTIVITIES_UPSERT' in data pool
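To see the difference to a plain append, you could change a value for the existing rows and upsert the same cases again: rows that match on _CASE_KEY and ACTIVITY_EN are updated in place instead of being added a second time. A small sketch reusing the table from above (USER_TYPE is just an arbitrary column to modify):
updated_df = activity_df.copy()
updated_df["USER_TYPE"] = "C"  # change a value for all existing case/activity pairs
data_pool_table.upsert(updated_df, keys=["_CASE_KEY", "ACTIVITY_EN"])  # updates matching rows instead of duplicating them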
4. Add table to a data model¶
After uploading our data into the data pool (either as a new table or by appending/upserting into an existing one), we can add the table to a data model.
For this, we navigate to the data model that we created in the Data Integration - Introduction tutorial:
data_model = data_pool.get_data_models().find("PyCelonis Tutorial Data Model")
data_model
DataModel(id='a13de415-ab63-4faf-a392-daab5ec74795', name='PyCelonis Tutorial Data Model', pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a')
To add a table from the data pool to the data model, we have to call the add_table() method. The method takes the following input arguments:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the table inside our data pool | Required |
alias | str | Alias for the table, i.e. how the name should be displayed inside our data model | None |
data_model.add_table(name="ACTIVITIES", alias="ACTIVITIES")
data_model.add_table(name="EKPO", alias="EKPO")
data_model.add_table(name="EKKO", alias="EKKO")
data_model.add_table(name="LFA1", alias="LFA1")
[2024-08-09 08:57:42,963] INFO: Successfully added data model table with id '4ee86b33-ba79-4e87-95ee-3e41334048e8' to data model
[2024-08-09 08:57:42,978] INFO: Successfully added data model table with id 'f863f7b8-ae79-4c0f-b04f-185d8d292ef8' to data model
[2024-08-09 08:57:42,993] INFO: Successfully added data model table with id '3ee2ba17-1186-4578-8aef-e4bc1e6c4e8f' to data model
[2024-08-09 08:57:43,007] INFO: Successfully added data model table with id 'd337a128-439d-4b15-871b-90f1698edfa1' to data model
DataModelTable(id='d337a128-439d-4b15-871b-90f1698edfa1', data_model_id='a13de415-ab63-4faf-a392-daab5ec74795', name='LFA1', alias='LFA1', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a')
The method will take the table from our data pool and create a reference to it inside our data model, including all properties, such as column names and data types. To verify that the tables exist in our data model, we can call the get_tables() method:
data_model.get_tables()
[ DataModelTable(id='4ee86b33-ba79-4e87-95ee-3e41334048e8', data_model_id='a13de415-ab63-4faf-a392-daab5ec74795', name='ACTIVITIES', alias='ACTIVITIES', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataModelTable(id='f863f7b8-ae79-4c0f-b04f-185d8d292ef8', data_model_id='a13de415-ab63-4faf-a392-daab5ec74795', name='EKPO', alias='EKPO', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataModelTable(id='3ee2ba17-1186-4578-8aef-e4bc1e6c4e8f', data_model_id='a13de415-ab63-4faf-a392-daab5ec74795', name='EKKO', alias='EKKO', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a'), DataModelTable(id='d337a128-439d-4b15-871b-90f1698edfa1', data_model_id='a13de415-ab63-4faf-a392-daab5ec74795', name='LFA1', alias='LFA1', data_pool_id='1bcc67dc-935a-4b7f-940a-9c0b81a7135a') ]
5. Reload data model¶
Note that the add_table() method only creates a reference to the data pool table inside our data model but does not reload the data model. For that, we have to call the reload() method:
data_model.reload()
[2024-08-09 08:57:43,068] INFO: Successfully triggered data model reload for data model with id 'a13de415-ab63-4faf-a392-daab5ec74795'
[2024-08-09 08:57:43,069] INFO: Wait for execution of data model reload for data model with id 'a13de415-ab63-4faf-a392-daab5ec74795'
This method will load the data for all tables inside our data model. However, if we only want to load the data for selected tables, we can also perform a partial_reload() and specify in data_model_table_ids the IDs of the tables for which we want to load the data:
tables = data_model.get_tables()
ekko = tables.find("EKKO")
ekpo = tables.find("EKPO")
data_model.partial_reload(data_model_table_ids=[ekko.id, ekpo.id])
[2024-08-09 08:57:43,247] INFO: Successfully triggered data model reload for data model with id 'a13de415-ab63-4faf-a392-daab5ec74795'
[2024-08-09 08:57:43,247] INFO: Wait for execution of data model reload for data model with id 'a13de415-ab63-4faf-a392-daab5ec74795'
Conclusion¶
Congratulations! You have learned how to upload data from your local Python project as tables inside your data pool, how to add those tables into a data model, and how to reload a data model in order to populate the model tables with data from the pool tables. In the next tutorial Data Export, you will learn how to export data from the Celonis EMS into your local Python project.