data_pool_table
Module to interact with data pool tables.
This module contains the class used to interact with data pool tables in EMS data integration.
Typical usage example:
```python
tables = data_pool.get_tables()

data_pool_table = data_pool.create_table(df, "TEST_TABLE")

data_pool_table.append(df)

data_pool_table.upsert(df, keys=["PRIMARY_KEY_COLUMN"])
```
DataPoolTable ¶
Bases: PoolTable
Data pool table object to interact with data pool table specific data integration endpoints.
loader_source class-attribute instance-attribute ¶
data_source_name class-attribute instance-attribute ¶
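As a brief illustration, both attributes can be read directly from a table object obtained as in the usage example above (a sketch, not output from a specific environment):
```python
print(data_pool_table.loader_source)
print(data_pool_table.data_source_name)
```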
from_transport classmethod ¶
Creates a high-level data pool table object from a given PoolTable.
Parameters:
- client (Client) – Client to use to make API calls for the given data pool table.
- data_pool_id (str) – ID of the data pool where the table is located.
- pool_table_transport (PoolTable) – PoolTable object containing the properties of the data pool table.
Returns:
- DataPoolTable – A DataPoolTable object with properties from the transport and the given client.
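A minimal sketch of this call, assuming a `client`, a `data_pool_id`, and a `pool_table_transport` are already available from a lower-level API call:
```python
# Sketch only: `client`, `data_pool_id`, and `pool_table_transport` are assumed
# to come from elsewhere (e.g. a low-level listing of pool tables).
data_pool_table = DataPoolTable.from_transport(
    client=client,
    data_pool_id=data_pool_id,
    pool_table_transport=pool_table_transport,
)
```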
get_columns ¶
Gets all columns of the given table.
Returns:
- CelonisCollection[Optional[PoolColumn]] – A collection containing all columns of the table.
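For illustration, a short sketch of listing the columns; the `name` attribute on each PoolColumn is an assumption about the transport model:
```python
columns = data_pool_table.get_columns()
for column in columns:
    # Entries may be None according to the return type, so guard before access.
    if column is not None:
        print(column.name)  # `name` is assumed to exist on PoolColumn
```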
upsert ¶
Upserts a data frame into an existing table in the data pool.
Parameters:
- df (DataFrame) – DataFrame to push to the existing table.
- keys (List[str]) – Primary keys of the table.
- chunk_size (int, default: 100000) – Number of rows to push in one chunk.
- index (Optional[bool], default: False) – Whether the index is included in the parquet file that is pushed. See the pandas documentation.
- column_config (Optional[dict], default: None) – Configuration for the columns.
- **kwargs (Any, default: {}) – Additional parameters set for the DataPushJob object.
Returns:
- None – The table is updated in place; nothing is returned.
Raises:
- PyCelonisTableDoesNotExistError – Raised if the table does not exist in the data pool.
- PyCelonisDataPushExecutionFailedError – Raised if the data push execution fails.
Examples:
Upsert new data to table:
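A minimal sketch mirroring the module-level usage example, assuming `df` is a pandas DataFrame in which "PRIMARY_KEY_COLUMN" uniquely identifies each row:
```python
data_pool_table.upsert(
    df,
    keys=["PRIMARY_KEY_COLUMN"],
)
```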
append ¶
Appends a data frame to an existing table in the data pool.
Parameters:
- df (DataFrame) – DataFrame to push to the existing table.
- chunk_size (int, default: 100000) – Number of rows to push in one chunk.
- index (Optional[bool], default: False) – Whether the index is included in the parquet file that is pushed. See the pandas documentation.
- column_config (Optional[dict], default: None) – Configuration for the columns.
- **kwargs (Any, default: {}) – Additional parameters set for the NewTaskInstanceTransport object.
Returns:
- None – The table is updated in place; nothing is returned.
Raises:
- PyCelonisTableDoesNotExistError – Raised if the table does not exist in the data pool.
- PyCelonisDataPushExecutionFailedError – Raised if the data push execution fails.
Examples:
Append new data to table:
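A minimal sketch mirroring the module-level usage example, assuming `df` is a pandas DataFrame whose columns match the existing table:
```python
data_pool_table.append(df)
```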