data_push_job

Module to interact with data push jobs.

This module contains a class to interact with data push jobs in EMS data integration.

Typical usage example:
data_push_job = data_pool.create_data_push_job("TABLE_NAME", JobType.REPLACE)
data_push_job.add_file_chunk(file)
data_push_job.execute()

DataPushJob

Bases: DataPushJobBase

Data push job object to interact with data push job specific data integration endpoints.

client instance-attribute class-attribute

client: Client = Field(Ellipsis, exclude=True)

id instance-attribute

id: str

Id of data push job.

data_pool_id instance-attribute

data_pool_id: str

Id of data pool where data push job is located.

target_name instance-attribute

target_name: typing.Optional[str]

Name of table where data is pushed to.

connection_id instance-attribute

connection_id: typing.Optional[str]

Id of data connection where data is pushed to.

from_transport classmethod

from_transport(client, data_push_job_transport)

Creates high-level data push job object from given DataPushJobTransport.

Parameters:

  • client (Client) –

    Client to use to make API calls for given data push job.

  • data_push_job_transport (DataPushJobBase) –

    DataPushJobTransport object containing properties of data push job.

Returns:

  • DataPushJob

    A DataPushJob object with properties from transport and given client.
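Examples:

Construct a DataPushJob from a transport object (a minimal sketch; client and data_push_job_transport are assumed to be available from lower-level API calls):

data_push_job = DataPushJob.from_transport(client, data_push_job_transport)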

sync

sync()

Syncs data push job properties with EMS.
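Examples:

Refresh local job properties after triggering execution (a minimal sketch, assuming a data_push_job created as in the examples below):

data_push_job.execute(wait=False)

data_push_job.sync()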

delete

delete()

Deletes data push job.
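Examples:

Remove a data push job that is no longer needed (a minimal sketch, assuming the job was created as in the examples below):

data_push_job = data_pool.create_data_push_job("TABLE_NAME", JobType.REPLACE)
data_push_job.delete()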

add_file_chunk

add_file_chunk(file)

Adds file chunk to data push job.

Parameters:

  • file (typing.Union[io.BytesIO, io.BufferedReader]) –

    File stream to be upserted within data push job.

Examples:

Create data push job to replace table:

data_push_job = data_pool.create_data_push_job(
    target_name="ACTIVITIES",
    type_=JobType.REPLACE
)

with open("ACTIVITIES.parquet", "rb") as file:
    data_push_job.add_file_chunk(file)

data_push_job.execute()

delete_file_chunk

delete_file_chunk(file)

Deletes file chunk from data push job.

Parameters:

  • file (typing.Union[io.BytesIO, io.BufferedReader]) –

    File stream to be deleted within data push job.

Examples:

Create data push job to delete table:

data_push_job = data_pool.create_data_push_job(
    target_name="ACTIVITIES",
    type_=JobType.DELTA,
    keys=["_CASE_KEY"]
)

with open("ACTIVITIES.parquet", "rb") as file:
    data_push_job.delete_file_chunk(file)

data_push_job.execute()

add_data_frame

add_data_frame(df, chunk_size=100000, index=False)

Splits data frame into chunks of size chunk_size and adds each chunk to data push job.

Parameters:

  • df (pd.DataFrame) –

    Data frame to push with given data push job.

  • chunk_size (int) –

    Number of rows for each chunk.

  • index (typing.Optional[bool]) –

    Whether index is included in parquet file. See pandas documentation.

Examples:

Create data push job to replace table:

data_push_job = data_pool.create_data_push_job(
    target_name="ACTIVITIES",
    type_=JobType.REPLACE
)

df = pd.DataFrame({"ID": ["aa", "bb", "cc"], "TEST_COLUMN": [1, 2, 3]})
data_push_job.add_data_frame(df)

data_push_job.execute()

delete_data_frame

delete_data_frame(df, chunk_size=100000, index=False)

Splits data frame into chunks of size chunk_size and adds each chunk to data push job for deletion.

Parameters:

  • df (pd.DataFrame) –

Data frame containing rows to be deleted with given data push job.

  • chunk_size (int) –

    Number of rows for each chunk.

  • index (typing.Optional[bool]) –

    Whether index is included in parquet file. See pandas documentation.

Examples:

Create data push job to delete table:

data_push_job = data_pool.create_data_push_job(
    target_name="ACTIVITIES",
    type_=JobType.DELTA,
    keys=["_CASE_KEY"]
)

df = pd.DataFrame({"_CASE_KEY": ["aa", "bb", "cc"], "TEST_COLUMN": [1, 2, 3]})
data_push_job.delete_data_frame(df)

data_push_job.execute()

get_chunks

get_chunks()

Gets all chunks of given data push job.

Returns:

  • Chunks of the given data push job.
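Examples:

Inspect the chunks added so far (a minimal sketch, assuming file chunks were added as in the add_file_chunk example):

chunks = data_push_job.get_chunks()
print(len(chunks))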
execute

execute(wait=True)

Executes given data push job.

Parameters:

  • wait (bool) –

If true, the function only returns once the data push job has been executed and raises an error if the execution fails. If false, the function returns immediately after triggering the execution and does not raise an error if it fails.

Raises:

Examples:

Create data push job to replace table:

data_push_job = data_pool.create_data_push_job(
    target_name="ACTIVITIES",
    type_=JobType.REPLACE
)

df = pd.DataFrame({"ID": ["aa", "bb", "cc"], "TEST_COLUMN": [1, 2, 3]})
data_push_job.add_data_frame(df)

data_push_job.execute()