How to implement get_metadata
The get_metadata method defines the tables and columns that your Python Extractor extracts. It returns a list of tables (a list of lists), where each “sub-list” holds one table’s name together with the name and data type (in Java format) of each of its columns. The snippet below demonstrates the basic structure of get_metadata and its output.
Here, we define two tables, “users” and “departments”, and then the columns each of them contains. The “users” table contains three columns; the first is called “name” and its data type is string. Note that the Table and Column objects are imported directly from the Celoxtractor module (celoxtractor.types).
from celoxtractor.extractor import CelonisExtractor
from celoxtractor.types import Table, Column


class MyExtractor(CelonisExtractor):
    def get_metadata(self):
        # Define the name of the first table
        users_table = Table("users")
        # Define the names and data types of columns included in the first table
        users_table.columns = [
            Column("name", "STRING"),
            Column("user_id", "INTEGER"),
            Column("birthDate", "DATETIME"),
        ]

        # Define the name of the second table
        departments_table = Table("departments")
        # Define the names and data types of columns included in the second table
        departments_table.columns = [
            Column("department_name", "STRING"),
            Column("department_id", "INTEGER"),
        ]

        # Return a list of tables (list of lists)
        return [users_table, departments_table]
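When an extractor covers more than a handful of tables, it can be easier to describe the schema as data and build the metadata list from it. The following is a minimal sketch of that idea, not part of Celoxtractor itself. The Table and Column classes here are hypothetical stand-ins defined only so that the example runs without Celoxtractor installed, and the names SCHEMA and build_metadata are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Hypothetical stand-ins for celoxtractor.types.Column and Table,
# defined only so this sketch runs on its own.
@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


# Schema described as data: table name -> list of (column name, data type).
SCHEMA: Dict[str, List[Tuple[str, str]]] = {
    "users": [
        ("name", "STRING"),
        ("user_id", "INTEGER"),
        ("birthDate", "DATETIME"),
    ],
    "departments": [
        ("department_name", "STRING"),
        ("department_id", "INTEGER"),
    ],
}


def build_metadata(schema: Dict[str, List[Tuple[str, str]]]) -> List[Table]:
    """Turn a schema dictionary into a list of Table objects."""
    tables = []
    for table_name, column_specs in schema.items():
        table = Table(table_name)
        table.columns = [Column(name, data_type) for name, data_type in column_specs]
        tables.append(table)
    return tables


metadata = build_metadata(SCHEMA)
for table in metadata:
    print(table.name, [(c.name, c.data_type) for c in table.columns])
```

In a real extractor, the body of get_metadata could simply return the result of such a helper, using the Table and Column classes imported from celoxtractor.types instead of the stand-ins.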
After you have set up get_metadata for the data you want to extract, Celoxtractor does all the heavy lifting for you: viewing and configuring tables and columns for your extraction becomes a native IBC experience.
Advanced: How to use dynamic metadata
Your metadata does not have to be static; Celoxtractor also supports dynamic metadata. For example, the snippet below dynamically defines the “sys_id” column for every table that can be retrieved via the API (here, the ServiceNow REST API).
from celoxtractor.extractor import CelonisExtractor
from celoxtractor.types import Table, Column
import requests
from requests.auth import HTTPBasicAuth


class ServiceNowSysIdExtractor(CelonisExtractor):
    def get_metadata(self):
        # Initiate the list of lists that will contain the metadata
        tables = []

        # Define your API request
        # Note: "parameters" is assumed to hold the credentials configured for
        # the extractor; where it comes from is not shown in this snippet
        response = requests.get(
            'https://your-instance.service-now.com/api/now/table/sys_db_object',
            auth=HTTPBasicAuth(parameters["username"], parameters["password"]),
        ).json()

        # Add each table which you can retrieve via the API request, as well as
        # its "sys_id" column, to the metadata list of lists.
        # The first value of the JSON response holds the list of records.
        for record in list(response.values())[0]:
            table = Table(record["name"])
            table.columns = [Column("sys_id", "STRING")]
            tables.append(table)

        # Return the list of lists containing metadata for all tables
        return tables
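Because dynamic metadata depends on a live API response, it can help to separate the parsing step from the request so that it can be checked without network access. The sketch below illustrates this under a few assumptions: the records are wrapped under a "result" key (as in ServiceNow Table API responses), the Table and Column classes are hypothetical stand-ins for the ones in celoxtractor.types, and the helper name tables_from_response and the sample payload are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Hypothetical stand-ins for celoxtractor.types.Column and Table,
# defined only so this sketch runs on its own.
@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


def tables_from_response(payload: Dict[str, Any]) -> List[Table]:
    """Build one Table with a "sys_id" column per named record in the payload."""
    tables = []
    seen = set()
    # Assumption: records are wrapped under the "result" key.
    for record in payload.get("result", []):
        name = record.get("name")
        # Skip records without a usable name and names already added.
        if not name or name in seen:
            continue
        seen.add(name)
        table = Table(name)
        table.columns = [Column("sys_id", "STRING")]
        tables.append(table)
    return tables


# Made-up sample payload standing in for a parsed JSON response.
sample_payload = {
    "result": [
        {"name": "incident"},
        {"name": "problem"},
        {"name": ""},
        {"name": "incident"},
    ]
}

tables = tables_from_response(sample_payload)
for table in tables:
    print(table.name, [c.name for c in table.columns])
```

Inside get_metadata, the parsed JSON from requests.get(...).json() could be passed to such a helper. Looking up the record list by key rather than by position also means that an unexpected response shape yields an empty table list instead of an obscure error.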