data_pool_table
Module to interact with data pool tables.
This module contains the class used to interact with data pool tables in EMS data integration.
Typical usage example:
```python
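# Assumes `data_pool` is an existing data pool object and `df` is a pandas DataFrame.
tables = data_pool.get_tables()
# Create a new table from the DataFrame, then push more rows into it.
data_pool_table = data_pool.create_table(df, "TEST_TABLE")
data_pool_table.append(df)
data_pool_table.upsert(df, keys=["PRIMARY_KEY_COLUMN"])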
```
DataPoolTable
Bases: PoolTable
Data pool table object to interact with data integration endpoints specific to data pool tables.
loader_source (class-attribute, instance-attribute)
data_source_name (class-attribute, instance-attribute)
from_transport (classmethod)
Creates high-level data pool table object from given PoolTable.
Parameters:
- client (Client) – Client to use to make API calls for given data pool table.
- data_pool_id (str) – Id of data pool where table is located.
- pool_table_transport (PoolTable) – PoolTable object containing properties of data pool table.
Returns:
- DataPoolTable – A DataPoolTable object with properties from transport and given client.
get_columns
Gets all table columns of given table.
Returns:
- CelonisCollection[Optional[PoolColumn]] – A list containing all columns of table.
upsert
Upserts data frame to existing table in data pool.
Parameters:
- df (DataFrame) – DataFrame to push to existing table.
- keys (List[str]) – Primary keys of table.
- chunk_size (int, default: 100000) – Number of rows to push in one chunk.
- index (Optional[bool], default: False) – Whether index is included in parquet file that is pushed. Default False. See pandas documentation.
- column_config (Optional[dict], default: None) – Configuration for the columns.
- **kwargs (Any, default: {}) – Additional parameters set for DataPushJob object.
Returns:
- None – Nothing is returned; the data is written to the existing table in the data pool.
Raises:
- PyCelonisTableDoesNotExistError – Raised if table does not exist in data pool.
- PyCelonisDataPushExecutionFailedError – Raised when the data push execution fails.
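To make the role of keys concrete, here is a minimal pure-Python sketch of the usual upsert semantics: a row whose key values match an existing row replaces it, and any other row is added. It does not call PyCelonis and says nothing about how the data push is executed on the server; the function `upsert_rows` and the sample columns are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def upsert_rows(existing: List[Row], incoming: List[Row], keys: List[str]) -> List[Row]:
    """Return `existing` with `incoming` merged in by key (hypothetical helper)."""
    merged: Dict[Tuple[Any, ...], Row] = {}
    for row in existing + incoming:
        # Later rows with the same key values overwrite earlier ones.
        merged[tuple(row[k] for k in keys)] = dict(row)
    return list(merged.values())


# Hypothetical sample data: "bb" already exists, "cc" is new.
table = [{"ID": "aa", "VALUE": 1}, {"ID": "bb", "VALUE": 2}]
new_rows = [{"ID": "bb", "VALUE": 20}, {"ID": "cc", "VALUE": 3}]
result = upsert_rows(table, new_rows, keys=["ID"])
```

Under this sketch the `bb` row is overwritten and the `cc` row is added, whereas a plain append of the same input would leave two `bb` rows in the table.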
Examples:
Upsert new data to table:
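A minimal sketch of such a call, assuming `data_pool_table` is an existing `DataPoolTable` and `new_df` is a pandas DataFrame that contains the key columns. The wrapper `upsert_new_data` and its pre-check are hypothetical; only `upsert(df, keys=...)` is taken from the parameters listed above.

```python
from typing import Any, Sequence


def upsert_new_data(data_pool_table: Any, new_df: Any, key_columns: Sequence[str]) -> None:
    """Upsert `new_df` into an existing table (hypothetical wrapper)."""
    # Fail early if a key column is absent from the frame.
    missing = [key for key in key_columns if key not in new_df.columns]
    if missing:
        raise ValueError(f"Key columns missing from DataFrame: {missing}")
    # `upsert` and its `keys` parameter are the documented call.
    data_pool_table.upsert(new_df, keys=list(key_columns))
```

It could be called as `upsert_new_data(data_pool_table, new_df, ["ID"])`, where `ID` stands in for whatever column identifies a row in the table.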
append
Appends data frame to existing table in data pool.
Parameters:
- df (DataFrame) – DataFrame to push to existing table.
- chunk_size (int, default: 100000) – Number of rows to push in one chunk.
- index (Optional[bool], default: False) – Whether index is included in parquet file that is pushed. Default False. See pandas documentation.
- column_config (Optional[dict], default: None) – Configuration for the columns.
- **kwargs (Any, default: {}) – Additional parameters set for NewTaskInstanceTransport object.
Returns:
- None – Nothing is returned; the data is appended to the existing table in the data pool.
Raises:
- PyCelonisTableDoesNotExistError – Raised if table does not exist in data pool.
- PyCelonisDataPushExecutionFailedError – Raised when the data push execution fails.
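The `chunk_size` parameter caps how many rows go into one pushed chunk. As a rough illustration of that idea only, the pure-Python sketch below splits a row count into chunk boundaries. How PyCelonis actually slices the DataFrame is not described here, and `chunk_bounds` is a hypothetical name.

```python
from typing import Iterator, Tuple


def chunk_bounds(n_rows: int, chunk_size: int = 100_000) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges of at most `chunk_size` rows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        # The last range is shortened so it never runs past the final row.
        yield start, min(start + chunk_size, n_rows)


# Row ranges for a 250000-row frame with the default chunk size.
bounds = list(chunk_bounds(250_000))
```

With the default of 100000, a 250000-row frame would be split into three ranges under this scheme, the last one holding 50000 rows.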
Examples:
Append new data to table:
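A minimal sketch of such a call, assuming `data_pool_table` is an existing `DataPoolTable` and each item in `frames` is a pandas DataFrame whose columns match the table. The helper `append_all` is hypothetical, while `append(df, chunk_size=...)` follows the parameters listed above.

```python
from typing import Any, Iterable


def append_all(data_pool_table: Any, frames: Iterable[Any], chunk_size: int = 100_000) -> int:
    """Append each frame in turn and return how many were pushed (hypothetical helper)."""
    pushed = 0
    for df in frames:
        # One documented `append` call per frame; rows are added, not merged by key.
        data_pool_table.append(df, chunk_size=chunk_size)
        pushed += 1
    return pushed
```

For a single frame, calling `data_pool_table.append(df)` directly is enough; the loop only helps when new data arrives in several batches.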