data_pool
Module to interact with data pools.
This module contains the class used to interact with a data pool in EMS Data Integration.
Typical usage example:
```python
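# Assumes `celonis` is an authenticated client and `data_pool_id` is the id of an existing data pool.
data_pool = celonis.data_integration.get_data_pool(data_pool_id)
data_pool.name = "NEW_NAME"
data_pool.update()  # push the local rename to EMS
data_pool.delete()  # remove the data pool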
```
DataPool ¶
Bases: DataPoolTransport
Data pool object to interact with data pool specific data integration endpoints.
configuration_status class-attribute instance-attribute ¶
content_version class-attribute instance-attribute ¶
monitoring_target class-attribute instance-attribute ¶
custom_monitoring_target class-attribute instance-attribute ¶
custom_monitoring_target_active class-attribute instance-attribute ¶
monitoring_message_columns_migrated class-attribute instance-attribute ¶
creator_user_id class-attribute instance-attribute ¶
from_transport classmethod ¶
Creates high-level data pool object from given DataPoolTransport.
Parameters:
- client (Client) – Client to use to make API calls for given data pool.
- data_pool_transport (DataPoolTransport) – DataPoolTransport object containing properties of data pool.
Returns:
- DataPool – A DataPool object with properties from transport and given client.
update ¶
Pushes local changes of data pool to EMS and updates properties with response from EMS.
copy_to ¶
Copies data pool to the specified domain in the same realm.
Parameters:
- destination_team_domain (str) – The team domain of the destination team URL, i.e. <team_domain> in https://<team_domain>.<realm>.celonis.cloud/
- selected_data_models (Optional[List], default: None) – A list of data model ids to include in the copy operation. By default, all data models are copied.
- **kwargs (Any, default: {}) – Additional parameters set for MoveDataPoolRequest.
Returns:
- DataPoolTransport – A read-only data pool transport object of the copied asset.
Examples:
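The destination_team_domain argument is only one part of the team URL. The sketch below assumes team URLs have the shape https://<team_domain>.<realm>.celonis.cloud/ and uses a hypothetical helper, team_domain_from_url, which is not part of PyCelonis, to extract that part with the standard library. The copy_to call itself is left as a comment because it needs a live client.

```python
from urllib.parse import urlparse


def team_domain_from_url(team_url: str) -> str:
    """Return the first host label of a team URL (hypothetical helper).

    Assumes the URL looks like https://<team_domain>.<realm>.celonis.cloud/.
    """
    host = urlparse(team_url).hostname
    if not host or not host.endswith(".celonis.cloud"):
        raise ValueError(f"Not a celonis.cloud team URL: {team_url!r}")
    return host.split(".")[0]


# "my-team" and "eu-1" are made-up values used only for illustration.
destination = team_domain_from_url("https://my-team.eu-1.celonis.cloud/")

# With a live client, the copy would then be requested as:
# copied_pool = data_pool.copy_to(destination)
```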
create_data_model ¶
Creates new data model with name in given data pool.
Parameters:
- name (str) – Name of new data model.
- **kwargs (Any, default: {}) – Additional parameters set for DataModelTransport object.
Returns:
- DataModel – A DataModel object for newly created data model.
Examples:
Create a data model and add tables:
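A minimal offline sketch of this example. DataPoolStub is a hypothetical stand-in that mirrors only the documented create_data_model signature, so the snippet runs without a Celonis team. The add_table call with name and alias arguments is an assumption about the DataModel API, which this section does not document.

```python
from unittest.mock import create_autospec


class DataPoolStub:
    """Hypothetical stand-in mirroring the documented create_data_model signature."""

    def create_data_model(self, name, **kwargs):
        ...


# With a live client, use celonis.data_integration.get_data_pool(data_pool_id) instead.
data_pool = create_autospec(DataPoolStub, instance=True)

data_model = data_pool.create_data_model("TEST_DATA_MODEL")

# Assumed DataModel method: adds an existing pool table to the new data model.
data_model.add_table(name="ACTIVITIES", alias="ACTIVITIES")
```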
get_data_model ¶
Gets data model with given id.
Parameters:
- id_ (str) – Id of data model.
Returns:
- DataModel – A DataModel object for data model with given id.
get_data_models ¶
Gets all data models of given data pool.
Returns:
- CelonisCollection[DataModel] – A list containing all data models.
create_data_push_job_from staticmethod ¶
create_data_push_job_from(
    client,
    data_pool_id,
    target_name,
    type_=None,
    column_config=None,
    keys=None,
    **kwargs
)
Creates new data push job in given data pool.
Parameters:
- client (Client) – Client to use to make API calls for data export.
- data_pool_id (str) – Id of data pool where data push job will be created.
- target_name (str) – Table name to which job will push data.
- type_ (Optional[JobType], default: None) – Type of data push job.
- column_config (Optional[List[ColumnTransport]], default: None) – Can be used to specify column types and string field length in number of characters.
- keys (Optional[List[str]], default: None) – Primary keys to use in case of upsert data push job.
- **kwargs (Any, default: {}) – Additional parameters set for DataPushJob object.
Returns:
- DataPushJob – The newly created DataPushJob.
Examples:
Create data push job to replace table:
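A minimal sketch of the call shape for the static form, which takes the client and the data pool id explicitly. DataPoolStub and its echoing body are hypothetical stand-ins. The JobType enum here is a local placeholder, and the member name REPLACE is an assumption.

```python
from enum import Enum


class JobType(Enum):
    """Local placeholder for the library's JobType; the member name is assumed."""

    REPLACE = "REPLACE"


class DataPoolStub:
    """Hypothetical stand-in with the documented static signature."""

    @staticmethod
    def create_data_push_job_from(client, data_pool_id, target_name, type_=None,
                                  column_config=None, keys=None, **kwargs):
        # Echo the request instead of contacting EMS.
        return {"data_pool_id": data_pool_id, "target_name": target_name,
                "type": type_, "keys": keys}


client = object()  # placeholder for an authenticated Client
job = DataPoolStub.create_data_push_job_from(
    client, "<data-pool-id>", "ACTIVITIES", type_=JobType.REPLACE
)
```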
create_data_push_job ¶
create_data_push_job(
    target_name,
    type_=None,
    column_config=None,
    keys=None,
    connection_id=None,
    **kwargs
)
Creates new data push job in given data pool.
Parameters:
- target_name (str) – Table name to which job will push data.
- type_ (Optional[JobType], default: None) – Type of data push job.
- column_config (Optional[List[ColumnTransport]], default: None) – Can be used to specify column types and string field length in number of characters.
- keys (Optional[List[str]], default: None) – Primary keys to use in case of upsert data push job.
- connection_id (Optional[str], default: None) – Connection id of connection for data push job (equivalent to data_source_id for pool tables).
- **kwargs (Any, default: {}) – Additional parameters set for DataPushJob object.
Returns:
- DataPushJob – A DataPushJob object for newly created data push job.
Examples:
Create data push job to replace table:
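A minimal offline sketch of this example. DataPoolStub is a hypothetical stand-in that mirrors the documented signature, so wrong or missing arguments are rejected just as a real call would reject them. The string "REPLACE" is a placeholder for the corresponding JobType member, and the commented follow-up methods on the returned job are assumptions.

```python
from unittest.mock import create_autospec


class DataPoolStub:
    """Hypothetical stand-in mirroring the documented create_data_push_job signature."""

    def create_data_push_job(self, target_name, type_=None, column_config=None,
                             keys=None, connection_id=None, **kwargs):
        ...


data_pool = create_autospec(DataPoolStub, instance=True)

# "REPLACE" is a placeholder string; the real call expects a JobType member.
data_push_job = data_pool.create_data_push_job("ACTIVITIES", type_="REPLACE")

# Assumed follow-up steps on the returned DataPushJob (not documented in this section):
# data_push_job.add_file_chunk(parquet_file)
# data_push_job.execute()
```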
get_data_push_job ¶
Gets data push job with given id.
Parameters:
- id_ (str) – Id of data push job.
Returns:
- DataPushJob – A DataPushJob object for data push job with given id.
get_data_push_jobs ¶
Gets all data push jobs of given data pool.
Returns:
- CelonisCollection[DataPushJob] – A list containing all data push jobs.
create_table ¶
create_table(
    df,
    table_name,
    drop_if_exists=False,
    column_config=None,
    chunk_size=100000,
    force=False,
    data_source_id=None,
    index=False,
    **kwargs
)
Creates new table in given data pool.
Parameters:
- df (DataFrame) – DataFrame to push to new table.
- table_name (str) – Name of new table.
- drop_if_exists (bool, default: False) – If true, drops the existing table if it exists. If false, raises PyCelonisTableAlreadyExistsError if the table already exists.
- column_config (Optional[List[ColumnTransport]], default: None) – Can be used to specify column types and string field length in number of characters.
- chunk_size (int, default: 100000) – Number of rows to push in one chunk.
- force (bool, default: False) – If true, the table can be replaced without a column config. Otherwise, an error is raised if the table would be replaced without one.
- data_source_id (Optional[str], default: None) – Id of data connection where table will be created (equivalent to connection_id for data push jobs).
- index (Optional[bool], default: False) – Whether the index is included in the parquet file that is pushed. See pandas documentation.
- **kwargs (Any, default: {}) – Additional parameters set for DataPushJob object.
Returns:
- DataPoolTable – The new table object.
Raises:
- PyCelonisTableAlreadyExistsError – Raised if drop_if_exists=False and table already exists.
- PyCelonisDataPushExecutionFailedError – Raised when table creation fails.
- PyCelonisValueError – Raised when table already exists and no column config is given.
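How drop_if_exists, column_config and force interact can be read as a small decision table. The sketch below is one reading of the parameter and exception descriptions, not the library's implementation: the exception classes are local stand-ins named after the documented ones, check_create_table is a hypothetical helper, and the order of the checks is an assumption.

```python
class PyCelonisTableAlreadyExistsError(Exception):
    """Local stand-in named after the documented exception."""


class PyCelonisValueError(Exception):
    """Local stand-in named after the documented exception."""


def check_create_table(table_exists, drop_if_exists=False, column_config=None, force=False):
    """Return "create" or "replace", or raise as the parameter docs describe.

    Hypothetical helper; the order of the checks is an assumption.
    """
    if not table_exists:
        return "create"
    if not drop_if_exists:
        raise PyCelonisTableAlreadyExistsError("table exists and drop_if_exists=False")
    if column_config is None and not force:
        raise PyCelonisValueError("replacing a table needs column_config or force=True")
    return "replace"


# An existing table is replaced without a column config only when force=True.
outcome = check_create_table(table_exists=True, drop_if_exists=True, force=True)
```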
Examples:
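Create new table: pass the DataFrame and the new table name.
Replace table: additionally pass drop_if_exists=True, plus either a column_config or force=True.
get_table ¶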
Gets table located in data pool with given name and data source id.
Parameters:
- name (str) – Name of table.
- data_source_id (Optional[str], default: None) – Id of data connection where table is located (equivalent to connection_id for data push jobs).
Returns:
- DataPoolTable – The table object by name and data source id.
Raises:
- PyCelonisNotFoundError – Raised if no table with the given name and data source id exists in the data pool.
get_tables ¶
Gets all data pool tables of given data pool.
Returns:
- CelonisCollection[PoolTable] – A list containing all data pool tables.
get_data_connection ¶
Gets data connection with given id.
Parameters:
- id_ (str) – Id of data connection.
Returns:
- DataConnection – A DataConnection object for data connection with given id.
get_data_connections ¶
Gets all data connections of given data pool.
Returns:
- CelonisCollection[DataConnection] – A list containing all data connections.
create_job ¶
Creates new job with name in given data pool.
Parameters:
- name (str) – Name of new job.
- data_source_id (Optional[str], default: None) – Data connection id to use for job scope (equivalent to connection_id for data push jobs).
Returns:
- Job – A Job object for newly created job.
Examples:
Create data job with transformation statement and execute it:
data_job = data_pool.create_job("PyCelonis Tutorial Job")
task = data_job.create_transformation(
    name="PyCelonis Tutorial Task",
    description="This is an example task"
)
task.update_statement("""
    DROP TABLE IF EXISTS ACTIVITIES;
    CREATE TABLE ACTIVITIES (
        _CASE_KEY VARCHAR(100),
        ACTIVITY_EN VARCHAR(300)
    );
""")
data_job.execute()
get_job ¶
Gets job with given id.
Parameters:
- id_ (str) – Id of job.
Returns:
- Job – A Job object for job with given id.
get_jobs ¶
Gets all jobs of given data pool.
Returns:
- CelonisCollection[Job] – A list containing all jobs.
get_pool_variables ¶
Gets data pool variables.
Returns:
- CelonisCollection[PoolVariable] – A list containing all pool variables.
get_pool_variable ¶
Gets pool variable with given id.
Parameters:
- id_ (str) – Id of pool variable.
Returns:
- PoolVariable – A PoolVariable object for pool variable with given id.
create_pool_variable ¶
create_pool_variable(
    name,
    placeholder,
    description=None,
    var_type=VariableType.PUBLIC_CONSTANT,
    data_type=FilterParserDataType.STRING,
    values=None,
    **kwargs
)
Creates a new pool variable and returns it.
Parameters:
- name (str) – Name of the variable.
- placeholder (str) – Placeholder of the variable.
- description (Optional[str], default: None) – Description of the variable.
- var_type (VariableType, default: PUBLIC_CONSTANT) – Type of the variable.
- data_type (FilterParserDataType, default: STRING) – Data type of the variable's value.
- values (Optional[List[VariableValueTransport]], default: None) – List of values for the variable.
- **kwargs (Any, default: {}) – Additional parameters.
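A minimal offline sketch of the call shape. DataPoolStub is a hypothetical stand-in that mirrors the documented signature, with the enum defaults replaced by placeholder strings. The variable name, placeholder and description are made-up values.

```python
from unittest.mock import create_autospec


class DataPoolStub:
    """Hypothetical stand-in; enum defaults are replaced by placeholder strings."""

    def create_pool_variable(self, name, placeholder, description=None,
                             var_type="PUBLIC_CONSTANT", data_type="STRING",
                             values=None, **kwargs):
        ...


data_pool = create_autospec(DataPoolStub, instance=True)

# Only name and placeholder are required; the other parameters keep their defaults.
pool_variable = data_pool.create_pool_variable(
    name="Target table",
    placeholder="TARGET_TABLE",
    description="Table that the transformation writes to",
)
```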