data_pool.py

Pool (CelonisApiObject)

    Pool object to interact with the Celonis Event Collection API.
url: str (property, readonly)

API
- /integration/api/pools/{pool_id}
url_data_push (property, readonly)

API
- /integration/api/v1/data-push/{pool_id}/jobs/
url_connection_creation (property, readonly)

API
- /integration/api/datasource/
datamodels: CelonisCollection (property, readonly)

    Get all Datamodels of the Pool.
API
- GET: /integration/api/pools/{pool_id}/data-models
Returns:
| Type | Description | 
|---|---|
| CelonisCollection | Collection of Pool Datamodels. | 
tables: List[Dict] (property, readonly)

    Get all Pool Tables.
API
- GET: /integration/api/pools/{pool_id}/tables
Returns:
| Type | Description | 
|---|---|
| List[Dict] | A List of dictionaries containing Pool tables. | 
data_connections: CelonisCollection[DataConnection] (property, readonly)

    Get all Pool Data Connections.
API
- GET: /integration/api/pools/{pool_id}/data-sources/
Returns:
| Type | Description | 
|---|---|
| CelonisCollection[DataConnection] | A Collection of Pool Data Connections. | 
data_jobs: CelonisCollection[DataJob] (property, readonly)

    Get all Pool Data Jobs.
API
- GET: /integration/api/pools/{pool_id}/jobs
Returns:
| Type | Description | 
|---|---|
| CelonisCollection[DataJob] | A Collection of Pool Data Jobs. | 
variables: CelonisCollection[PoolParameter] (property, readonly)

    Get all Pool Variables.
API
- GET: /integration/api/pools/{pool_id}/variables
Returns:
| Type | Description | 
|---|---|
| CelonisCollection[PoolParameter] | A Collection of Pool Variables. | 
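A minimal sketch of reading these properties, assuming a PyCelonis 1.x setup where the Pool is retrieved via `get_celonis` and `celonis.pools.find`; the URL, API token, and pool ID are placeholders:

```python
from pycelonis import get_celonis

# Connect to the Celonis team (URL and API token are placeholders).
celonis = get_celonis(url="https://my-team.celonis.cloud", api_token="<API_TOKEN>")

# Retrieve an existing Data Pool by its ID.
pool = celonis.pools.find("<POOL_ID>")

# Read-only properties issue GET requests against the endpoints listed above.
print(pool.tables)  # List[Dict] of Pool tables
for datamodel in pool.datamodels:
    print(datamodel.name)
for connection in pool.data_connections:
    print(connection.name)
```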
find_table(self, table_name, data_source_id=None)
    Find a Table in the Pool.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| table_name | str | Name of the Pool Table. | required | 
| data_source_id | str | ID of the Data Source. | None | 
Returns:
| Type | Description | 
|---|---|
| Optional[Dict] | The Pool Table, if found. | 
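For example, a short sketch (assuming the `pool` object from the sketch above; the table name is illustrative):

```python
# Returns the table dictionary if found, otherwise None.
table = pool.find_table("ACTIVITIES")
if table is not None:
    print(table)
```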
create_table(self, df_or_path, table_name, if_exists='error', column_config=None, connection=None, wait_for_finish=True, chunksize=100000)
    Creates a new Table in the Pool from a pandas.DataFrame or pyarrow.parquet.ParquetFile.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df_or_path | Union[pandas.core.frame.DataFrame, pathlib.Path, str] | A pandas.DataFrame or the path to a parquet file or folder of parquet files. | required | 
| table_name | str | Name of Table. | required | 
| if_exists | str | Behavior if the Table already exists; the default 'error' raises a PyCelonisValueError. | 'error' | 
| column_config | List[Dict[str, Any]] | Can be used to specify column types and string field lengths, with columnType one of [INTEGER, DATE, TIME, DATETIME, FLOAT, BOOLEAN, STRING]. | None | 
| connection | Union[DataConnection, str] | The Data Connection to upload to, else uploads to Global. | None | 
| wait_for_finish | bool | Waits for the upload to finish processing, set to False to trigger only. | True | 
| chunksize | int | If DataFrame is passed, the value is used to chunk the dataframe into multiple parquet files that are uploaded. If set to a value <1, no chunking is applied. | 100000 | 
Returns:
| Type | Description | 
|---|---|
| Dict | The Data Job Status. | 
Exceptions:
| Type | Description | 
|---|---|
| PyCelonisValueError | If Table already exists and if_exists='error'. | 
| PyCelonisTypeError | When connection is not DataConnection object or ID of Data Connection. | 
| PyCelonisTypeError | If Path is not a valid file or folder. | 
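A usage sketch, assuming the `pool` object from above; the DataFrame, table name, and column configuration are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "CASE_ID": ["c1", "c2"],
    "ACTIVITY": ["Create Order", "Approve Order"],
    "EVENTTIME": pd.to_datetime(["2021-01-01", "2021-01-02"]),
})

# Optional column configuration; fieldLength applies to STRING columns.
column_config = [
    {"columnName": "CASE_ID", "columnType": "STRING", "fieldLength": 40},
    {"columnName": "ACTIVITY", "columnType": "STRING", "fieldLength": 80},
    {"columnName": "EVENTTIME", "columnType": "DATETIME"},
]

# Raises PyCelonisValueError if ACTIVITIES already exists (if_exists='error').
status = pool.create_table(df, "ACTIVITIES", column_config=column_config)
print(status)
```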
append_table(self, df_or_path, table_name, column_config=None, connection=None, wait_for_finish=True, chunksize=100000)
    Appends a pandas.DataFrame or pyarrow.parquet.ParquetFile to an existing Table in the Pool.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df_or_path | Union[pandas.core.frame.DataFrame, pathlib.Path, str] | A pandas.DataFrame or the path to a parquet file or folder of parquet files. | required | 
| table_name | str | Name of Table. | required | 
| column_config | List[Dict[str, Any]] | Can be used to specify column types and string field lengths, with columnType one of [INTEGER, DATE, TIME, DATETIME, FLOAT, BOOLEAN, STRING]. | None | 
| connection | Union[DataConnection, str] | The Data Connection to upload to, else uploads to Global. | None | 
| wait_for_finish | bool | Waits for the upload to finish processing, set to False to trigger only. | True | 
| chunksize | int | If DataFrame is passed, the value is used to chunk the dataframe into multiple parquet files that are uploaded. If set to a value <1, no chunking is applied. | 100000 | 
Returns:
| Type | Description | 
|---|---|
| Dict | The Data Job Status. | 
Exceptions:
| Type | Description | 
|---|---|
| PyCelonisValueError | If Table already exists and if_exists='error'. | 
| PyCelonisTypeError | When connection is not DataConnection object or ID of Data Connection. | 
| PyCelonisTypeError | If Path is not a valid file or folder. | 
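Continuing the sketch above, a short example of appending rows to the (assumed) existing table:

```python
new_rows = pd.DataFrame({
    "CASE_ID": ["c3"],
    "ACTIVITY": ["Ship Order"],
    "EVENTTIME": pd.to_datetime(["2021-01-03"]),
})

# The DataFrame schema must be compatible with the existing table.
status = pool.append_table(new_rows, "ACTIVITIES")
print(status)
```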
upsert_table(self, df_or_path, table_name, primary_keys, column_config=None, connection=None, wait_for_finish=True, chunksize=100000)
    Upserts a pandas.DataFrame or pyarrow.parquet.ParquetFile into an existing Table in the Pool.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df_or_path | Union[pandas.core.frame.DataFrame, pathlib.Path, str] | A pandas.DataFrame or the path to a parquet file or folder of parquet files. | required | 
| table_name | str | Name of Table. | required | 
| primary_keys | List[str] | List of Table primary keys. | required | 
| column_config | List[Dict[str, Any]] | Can be used to specify column types and string field lengths, with columnType one of [INTEGER, DATE, TIME, DATETIME, FLOAT, BOOLEAN, STRING]. | None | 
| connection | Union[DataConnection, str] | The Data Connection to upload to, else uploads to Global. | None | 
| wait_for_finish | bool | Waits for the upload to finish processing, set to False to trigger only. | True | 
| chunksize | int | If DataFrame is passed, the value is used to chunk the dataframe into multiple parquet files that are uploaded. If set to a value <1, no chunking is applied. | 100000 | 
Returns:
| Type | Description | 
|---|---|
| Dict | The Data Job Status. | 
Exceptions:
| Type | Description | 
|---|---|
| PyCelonisValueError | If Table already exists and if_exists='error'. | 
| PyCelonisTypeError | When connection is not DataConnection object or ID of Data Connection. | 
| PyCelonisTypeError | If Path is not a valid file or folder. | 
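A sketch continuing the example above; rows matching on the primary keys are updated, all others are inserted:

```python
corrections = pd.DataFrame({
    "CASE_ID": ["c2"],
    "ACTIVITY": ["Approve Order (corrected)"],
    "EVENTTIME": pd.to_datetime(["2021-01-02"]),
})

status = pool.upsert_table(
    corrections,
    "ACTIVITIES",
    primary_keys=["CASE_ID", "EVENTTIME"],
)
print(status)
```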
push_table(self, df_or_path, table_name, if_exists='error', primary_keys=None, column_config=None, connection=None, wait_for_finish=True, chunksize=100000)
    Pushes a pandas.DataFrame or pyarrow.parquet.ParquetFile to the specified Table in the Pool.
Warning
Deprecation: The method 'push_table' is deprecated and will be removed in the next release. Use one of: create_table, append_table, upsert_table.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df_or_path | Union[pandas.core.frame.DataFrame, pathlib.Path, str] | A pandas.DataFrame or the path to a parquet file or folder of parquet files. | required | 
| table_name | str | Name of Table. | required | 
| if_exists | str | Behavior if the Table already exists; the default 'error' raises an error. | 'error' | 
| primary_keys | Optional[List[str]] | List of Table primary keys. | None | 
| column_config | List[Dict[str, Any]] | Can be used to specify column types and string field lengths, with columnType one of [INTEGER, DATE, TIME, DATETIME, FLOAT, BOOLEAN, STRING]. | None | 
| connection | Union[DataConnection, str] | The Data Connection to upload to, else uploads to Global. | None | 
| wait_for_finish | bool | Waits for the upload to finish processing, set to False to trigger only. | True | 
| chunksize | int | If DataFrame is passed, the value is used to chunk the dataframe into multiple parquet files that are uploaded. If set to a value <1, no chunking is applied. | 100000 | 
Returns:
| Type | Description | 
|---|---|
| Dict | The Data Job Status. | 
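Since push_table is deprecated, a migration sketch mapping old calls onto the replacement methods (continuing the example above; names are illustrative):

```python
# Before (deprecated):
# pool.push_table(df, "ACTIVITIES", if_exists="error")

# After, depending on the intended behavior:
pool.create_table(df, "ACTIVITIES")                            # create a new table
pool.append_table(df, "ACTIVITIES")                            # append rows
pool.upsert_table(df, "ACTIVITIES", primary_keys=["CASE_ID"])  # update or insert by key
```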
get_column_config(self, table, raise_error=False)
    Get a Column Configuration of a Pool Table.
Column Config List:

```python
[
    {'columnName': 'colA', 'columnType': 'DATETIME'},
    {'columnName': 'colB', 'columnType': 'FLOAT'},
    {'columnName': 'colC', 'columnType': 'STRING', 'fieldLength': 80}
]
```
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| table | Union[str, Dict] | Name of the Pool Table or a Table dictionary (as returned by the tables property). | required | 
| raise_error | bool | Raises a celonis_api.errors.PyCelonisValueError if Table data types are not supported. | False | 
Returns:
| Type | Description | 
|---|---|
| Optional[List[Dict[str, Any]]] | The Column Configuration of the Pool Table (Always ignoring '_CELONIS_CHANGE_DATE'). | 
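A sketch of reusing a derived column configuration when re-creating a table (continuing the example above; names are illustrative):

```python
# Derive the configuration from an existing Pool Table.
config = pool.get_column_config("ACTIVITIES")

# Reuse it so the copy keeps the same column types and field lengths.
if config is not None:
    pool.create_table(df, "ACTIVITIES_COPY", column_config=config)
```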
check_push_status(self, job_id='')
    Checks the Status of a Data Push Job.
API
- GET: /integration/api/v1/data-push/{pool_id}/jobs/{job_id}
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| job_id | str | The ID of the job to check. If empty, returns the status of all jobs. | '' | 
Returns:
| Type | Description | 
|---|---|
| Dict | Status of Data Push Job(s). | 
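A sketch of a non-blocking upload followed by polling, assuming the returned status dictionary exposes the push job's ID under 'id' (an assumption about the response shape):

```python
# Trigger the upload without waiting for processing to finish.
status = pool.create_table(df, "ACTIVITIES_ASYNC", wait_for_finish=False)

# Poll the single push job (the 'id' key is an assumption).
print(pool.check_push_status(status["id"]))

# Or list the status of all push jobs in the Pool.
print(pool.check_push_status())
```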
check_data_job_execution_status(self)
    Checks the Status of Data Job Executions.
API
- GET: /integration/api/pools/{pool_id}/logs/status
Returns:
| Type | Description | 
|---|---|
| List | Status of all Data Job Executions. | 
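For example:

```python
# Inspect the state of all Data Job Executions in the Pool.
for execution in pool.check_data_job_execution_status():
    print(execution)
```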
create_datamodel(self, name)
    Creates a new Datamodel in the Pool.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| name | str | Name of the Datamodel. | required | 
Returns:
| Type | Description | 
|---|---|
| Datamodel | The newly created Datamodel object. | 
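For example (the Datamodel name is illustrative):

```python
datamodel = pool.create_datamodel("My Datamodel")
print(datamodel.name)
```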
create_data_connection(self, client, host, password, system_number, user, name, connector_type, uplink_id=None, use_uplink=True, compression_type='GZIP', **kwargs)
    Creates a new Data Connection (currently, only SAP connections are supported).
Warning
This method is deprecated and will be removed in the next release. Use the online wizard to set up Data Connections.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| client | str | Client. | required | 
| host | str | Host. | required | 
| user | str | Username. | required | 
| password | str | Password. | required | 
| system_number | str | System Number. | required | 
| name | str | Name of the Data Connection. | required | 
| connector_type | str | Type of the Data Connection. One of ['SAP']. | required | 
| uplink_id | str | ID of an Uplink Connection. | None | 
| use_uplink | bool | Whether to use an Uplink Connection or not. | True | 
| compression_type | str | Compression Type. | 'GZIP' | 
| **kwargs | | | {} | 
Returns:
| Type | Description | 
|---|---|
| DataConnection | The newly created Data Connection. | 
move(self, to)
    Moves the Pool to another team.
API
- POST: /integration/api/pools/move
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| to | str | Name of the destination host domain. | required | 
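A sketch (the destination domain is a placeholder):

```python
# Move the Pool to the team running under the given host domain.
pool.move("my-other-team")
```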
create_pool_parameter(self, pool_variable=None, name=None, placeholder=None, description=None, data_type='STRING', var_type='PUBLIC_CONSTANT', values=None)
    Creates a new Variable with the specified properties in the Pool.
API
- POST: /integration/api/pools/{pool_id}/variables/
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| pool_variable | Union[Dict, PoolParameter] | Pool Parameter object or dictionary (see API); if given, all other arguments are ignored. | None | 
| name | str | Name of the Variable. | None | 
| placeholder | str | Placeholder of the Variable. | None | 
| description | str | Description of the Variable. | None | 
| data_type | str | Data type of the Variable (see options in the API documentation). | 'STRING' | 
| var_type | str | Type of the Variable (see options in the API documentation). | 'PUBLIC_CONSTANT' | 
| values | List | List of Variable values. | None | 
Returns:
| Type | Description | 
|---|---|
| PoolParameter | The newly created Pool Parameter object. | 
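A sketch creating a public constant, using only parameters from the signature above (values are illustrative):

```python
param = pool.create_pool_parameter(
    name="currency",
    placeholder="currency",
    description="Default currency used in transformations",
    data_type="STRING",
    var_type="PUBLIC_CONSTANT",
    values=["EUR"],
)
print(param.name)
```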
create_data_job(self, name, data_source_id=None)
    Creates a new Data Job with the specified name in the Pool.
API
- POST: /integration/api/pools/{pool_id}/jobs/
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| name | str | Name of the Data Job. | required | 
| data_source_id | str | ID of the Data Source that the new Data Job will be connected to. If not specified, the Data Job is connected to the default global Data Source. | None | 
Returns:
| Type | Description | 
|---|---|
| DataJob | The newly created Data Job object. |
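For example (the Data Job name is illustrative):

```python
# Connects to the default global Data Source when no data_source_id is given.
data_job = pool.create_data_job("Daily Transformations")
print(data_job.name)
```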