Job

laktory.models.resources.databricks.Job ¤

Bases: BaseModel, PulumiResource, TerraformResource

Databricks Job

ATTRIBUTE DESCRIPTION
access_controls

Access Controls specifications

TYPE: list[AccessControl]

clusters

A list of job databricks.Cluster specifications that can be shared and reused by tasks of this job. Libraries cannot be declared in a shared job cluster. You must declare dependent libraries in task settings.

TYPE: list[JobCluster]

continuous

Continuous specifications

TYPE: JobContinuous

control_run_state

If True, the Databricks provider will stop and start the job as needed to ensure that the active run for the job reflects the deployed configuration. For continuous jobs, the provider respects the pause_status by stopping the current active run. This flag cannot be set for non-continuous jobs.

TYPE: bool

description

An optional description for the job. The maximum length is 1024 characters in UTF-8 encoding.

TYPE: str

email_notifications

An optional set of email addresses notified when runs of this job begin, complete or fail. The default behavior is to not send any emails. This field is a block and is documented below.

TYPE: JobEmailNotifications

format

TYPE: str

health

Health specifications

TYPE: JobHealth

lookup_existing

Specifications for looking up existing resource. Other attributes will be ignored.

TYPE: JobLookup

max_concurrent_runs

An optional maximum allowed number of concurrent runs of the job. Defaults to 1.

TYPE: int

max_retries

An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it completes with a FAILED or INTERNAL_ERROR lifecycle state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. A run can have the following lifecycle state: PENDING, RUNNING, TERMINATING, TERMINATED, SKIPPED or INTERNAL_ERROR.

TYPE: int
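
The retry rules described for max_retries can be sketched in a few lines of plain Python. This is only an illustration of the documented semantics (-1 retries indefinitely, 0 never retries, a run is unsuccessful on FAILED or INTERNAL_ERROR); the helper name `should_retry` and its arguments are hypothetical and are not part of laktory or the Databricks API.

```python
def should_retry(state: str, retries_done: int, max_retries: int = 0) -> bool:
    """Hypothetical helper mirroring the documented max_retries semantics."""
    # A run is unsuccessful if it completes as FAILED or INTERNAL_ERROR.
    if state not in {"FAILED", "INTERNAL_ERROR"}:
        return False
    # -1 means retry indefinitely.
    if max_retries == -1:
        return True
    # 0 (the default) means never retry; otherwise retry up to max_retries times.
    return retries_done < max_retries


print(should_retry("FAILED", retries_done=0))                  # default: never retry
print(should_retry("FAILED", retries_done=2, max_retries=3))   # one retry left
print(should_retry("INTERNAL_ERROR", retries_done=50, max_retries=-1))
```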

min_retry_interval_millis

An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.

TYPE: int

name

Name of the job

TYPE: str

name_prefix

Prefix added to the job name

TYPE: str

name_suffix

Suffix added to the job name

TYPE: str

notification_settings

Notifications specifications

TYPE: JobNotificationSettings

parameters

Parameters specifications

TYPE: list[JobParameter]

retry_on_timeout

An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout.

TYPE: bool

run_as

Run as specifications

TYPE: JobRunAs

schedule

Schedule specifications

TYPE: JobSchedule

tags

Tags as key, value pairs

TYPE: dict[str, Any]

tasks

Tasks specifications

TYPE: list[JobTask]

timeout_seconds

An optional timeout applied to each run of this job. The default behavior is to have no timeout.

TYPE: int

trigger

Trigger specifications

TYPE: JobTrigger

webhook_notifications

Webhook notifications specifications

TYPE: JobWebhookNotifications

Examples:

import io
from laktory import models

# Define job
job_yaml = '''
name: job-stock-prices
clusters:
  - name: main
    spark_version: 14.0.x-scala2.12
    node_type_id: Standard_DS3_v2

tasks:
  - task_key: ingest
    job_cluster_key: main
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
    libraries:
      - pypi:
          package: yfinance

  - task_key: pipeline
    depends_ons:
      - task_key: ingest
    pipeline_task:
      pipeline_id: 74900655-3641-49f1-8323-b8507f0e3e3b

access_controls:
  - group_name: account users
    permission_level: CAN_VIEW
  - group_name: role-engineers
    permission_level: CAN_MANAGE_RUN
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))

# Define job with for each task
job_yaml = '''
name: job-hello
tasks:
  - task_key: hello-loop
    for_each_task:
      inputs:
        - id: 1
          name: olivier
        - id: 2
          name: kubic
      task:
        task_key: hello-task
        notebook_task:
          notebook_path: /Workspace/Users/olivier.soucy@okube.ai/hello-world
          base_parameters:
            input: "{{input}}"
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
References
ATTRIBUTE DESCRIPTION
additional_core_resources
  • permissions

TYPE: list[PulumiResource]

Attributes¤

additional_core_resources property ¤

additional_core_resources
  • permissions

laktory.models.resources.databricks.job.JobCluster ¤

Bases: Cluster

Job Cluster. Same attributes as laktory.models.Cluster, except for

  • access_controls
  • is_pinned
  • libraries
  • no_wait

that are not allowed.


laktory.models.resources.databricks.job.JobContinuous ¤

Bases: BaseModel

Job Continuous specifications

ATTRIBUTE DESCRIPTION
pause_status

Indicate whether this continuous job is paused or not. When the pause_status field is omitted in the block, the server will default to using UNPAUSED as a value for pause_status.

TYPE: Union[Literal['PAUSED', 'UNPAUSED'], str]


laktory.models.resources.databricks.job.JobEmailNotifications ¤

Bases: BaseModel

Job Email Notifications specifications

ATTRIBUTE DESCRIPTION
no_alert_for_skipped_runs

If True, don't send alert for skipped runs. (It's recommended to use the corresponding setting in the notification_settings configuration block).

TYPE: bool

on_duration_warning_threshold_exceededs

List of emails to notify when the duration of a run exceeds the threshold specified by the RUN_DURATION_SECONDS metric in the health block.

TYPE: list[str]

on_failures

List of emails to notify when the run fails.

TYPE: list[str]

on_starts

List of emails to notify when the run starts.

TYPE: list[str]

on_successes

List of emails to notify when the run completes successfully.

TYPE: list[str]


laktory.models.resources.databricks.job.JobHealthRule ¤

Bases: BaseModel

Job Health Rule specifications

ATTRIBUTE DESCRIPTION
metric

Metric to check. The only supported metric is RUN_DURATION_SECONDS (check Jobs REST API documentation for the latest information).

TYPE: str

op

Operation used to compare operands. Currently, following operators are supported: EQUAL_TO, GREATER_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, NOT_EQUAL.

TYPE: str

value

Value used to compare to the given metric.

TYPE: int
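
As a rough illustration of how a health rule reads, the sketch below compares an observed metric against the rule's value using the operator names listed for op. The `rule_triggered` function and the `observed` dictionary are assumptions made for this example; the actual evaluation is performed by Databricks, not by laktory.

```python
import operator

# Operator names copied from the `op` description above.
OPERATORS = {
    "EQUAL_TO": operator.eq,
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_OR_EQUAL": operator.ge,
    "LESS_THAN": operator.lt,
    "LESS_THAN_OR_EQUAL": operator.le,
    "NOT_EQUAL": operator.ne,
}


def rule_triggered(rule: dict, observed: dict) -> bool:
    """Hypothetical check: compare the observed metric to the rule value."""
    compare = OPERATORS[rule["op"]]
    return compare(observed[rule["metric"]], rule["value"])


# Rule: flag runs that last longer than one hour.
rule = {"metric": "RUN_DURATION_SECONDS", "op": "GREATER_THAN", "value": 3600}
print(rule_triggered(rule, {"RUN_DURATION_SECONDS": 4000}))
print(rule_triggered(rule, {"RUN_DURATION_SECONDS": 1200}))
```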


laktory.models.resources.databricks.job.JobHealth ¤

Bases: BaseModel

Job Health specifications

ATTRIBUTE DESCRIPTION
rules

Job health rules specifications

TYPE: list[JobHealthRule]


laktory.models.resources.databricks.job.JobNotificationSettings ¤

Bases: BaseModel

Job Notification Settings specifications

ATTRIBUTE DESCRIPTION
no_alert_for_canceled_runs

If True, don't send alert for canceled runs.

TYPE: bool

no_alert_for_skipped_runs

If True, don't send alert for skipped runs.

TYPE: bool


laktory.models.resources.databricks.job.JobParameter ¤

Bases: BaseModel

Job Parameter specifications

ATTRIBUTE DESCRIPTION
default

Default value of the parameter.

TYPE: str

name

The name of the defined parameter. May only contain alphanumeric characters, _, -, and .

TYPE: str
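
The naming constraint can be checked up front with a regular expression. The sketch below assumes "alphanumeric" means ASCII letters and digits; `is_valid_parameter_name` is a hypothetical helper, not a laktory function.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.\-]+")


def is_valid_parameter_name(name: str) -> bool:
    """Return True if the name uses only letters, digits, '_', '-' and '.'."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_parameter_name("env.name-1"))   # allowed characters only
print(is_valid_parameter_name("bad name"))     # contains a space
```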


laktory.models.resources.databricks.job.JobRunAs ¤

Bases: BaseModel

Job Run As specifications

ATTRIBUTE DESCRIPTION
service_principal_name

The application ID of an active service principal. Setting this field requires the servicePrincipal/user role.

TYPE: str

user_name

The email of an active workspace user. Non-admin users can only set this field to their own email.

TYPE: str


laktory.models.resources.databricks.job.JobSchedule ¤

Bases: BaseModel

Job Schedule specifications

ATTRIBUTE DESCRIPTION
quartz_cron_expression

A Cron expression using Quartz syntax that describes the schedule for a job. This field is required.

TYPE: str

timezone_id

A Java timezone ID. The schedule for a job will be resolved with respect to this timezone. See Java TimeZone for details. This field is required.

TYPE: str

pause_status

Indicate whether this schedule is paused or not. When the pause_status field is omitted and a schedule is provided, the server will default to using UNPAUSED as a value for pause_status.

TYPE: Union[Literal['PAUSED', 'UNPAUSED'], str, None]
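
A quick pre-deployment check of a schedule block might look like the sketch below. It only verifies that the two required fields are present and that the cron expression has the 6 or 7 space-separated fields used by Quartz syntax; it does not validate the expression itself. The helper `check_schedule` and the sample values are assumptions made for this example.

```python
def check_schedule(schedule: dict) -> list:
    """Hypothetical sanity check for a JobSchedule-like dictionary."""
    problems = []
    # Both fields are documented as required.
    for field in ("quartz_cron_expression", "timezone_id"):
        if not schedule.get(field):
            problems.append(f"missing required field: {field}")
    # Quartz expressions: sec min hour day-of-month month day-of-week [year].
    cron = schedule.get("quartz_cron_expression", "")
    if cron and len(cron.split()) not in (6, 7):
        problems.append("Quartz cron expressions have 6 or 7 fields")
    return problems


# Every day at 06:00 in the given timezone; pause_status omitted on purpose.
schedule = {
    "quartz_cron_expression": "0 0 6 * * ?",
    "timezone_id": "America/Toronto",
}
print(check_schedule(schedule))
print(check_schedule({"quartz_cron_expression": "0 6 * * *"}))
```

The second call passes a five-field Unix-style expression without a timezone, so both problems are reported.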


laktory.models.resources.databricks.job.JobTaskConditionTask ¤

Bases: BaseModel

Job Task Condition Task specifications

ATTRIBUTE DESCRIPTION
left

The left operand of the condition task. It could be a string value, job state, or a parameter reference.

TYPE: str

op

The string specifying the operation used to compare operands. This task does not require a cluster to execute and does not support retries or notifications.

TYPE: Literal['EQUAL_TO', 'GREATER_THAN', 'GREATER_THAN_OR_EQUAL', 'LESS_THAN', 'LESS_THAN_OR_EQUAL', 'NOT_EQUAL']

right

The right operand of the condition task. It could be a string value, job state, or parameter reference.

TYPE: str


laktory.models.resources.databricks.job.JobTaskDependsOn ¤

Bases: BaseModel

Job Task Depends On specifications

ATTRIBUTE DESCRIPTION
task_key

The name of the task this task depends on.

TYPE: str

outcome

Can only be specified on condition task dependencies. The outcome of the dependent task that must be met for this task to run.

TYPE: Literal['true', 'false']
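
The interaction between a condition task and the outcome of a dependency can be pictured as follows. This is a simplified sketch and not the Databricks implementation: it assumes that EQUAL_TO and NOT_EQUAL compare the operands as strings and that the ordering operators compare them as numbers. Both function names are hypothetical.

```python
def evaluate_condition(left: str, op: str, right: str) -> str:
    """Return 'true' or 'false', matching the values accepted by `outcome`."""
    if op == "EQUAL_TO":
        result = left == right
    elif op == "NOT_EQUAL":
        result = left != right
    else:
        # Assumption: ordering operators compare operands numerically.
        lhs, rhs = float(left), float(right)
        result = {
            "GREATER_THAN": lhs > rhs,
            "GREATER_THAN_OR_EQUAL": lhs >= rhs,
            "LESS_THAN": lhs < rhs,
            "LESS_THAN_OR_EQUAL": lhs <= rhs,
        }[op]
    return "true" if result else "false"


def dependency_met(depends_on: dict, condition_outcome: str) -> bool:
    """A dependency without `outcome` is met regardless of the branch taken."""
    expected = depends_on.get("outcome")
    return expected is None or expected == condition_outcome


outcome = evaluate_condition("prod", "EQUAL_TO", "prod")
print(dependency_met({"task_key": "check-env", "outcome": "true"}, outcome))
print(dependency_met({"task_key": "check-env", "outcome": "false"}, outcome))
```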


laktory.models.resources.databricks.job.JobTaskNotebookTask ¤

Bases: BaseModel

Job Task Notebook Task specifications

ATTRIBUTE DESCRIPTION
notebook_path

The path of the databricks.Notebook to be run in the Databricks workspace or remote repository. For notebooks stored in the Databricks workspace, the path must be absolute and begin with a slash. For notebooks stored in a remote repository, the path must be relative.

TYPE: str

base_parameters

Base parameters to be used for each run of this job. If the run is initiated by a call to run-now with parameters specified, the two parameters maps will be merged. If the same key is specified in base_parameters and in run-now, the value from run-now will be used. If the notebook takes a parameter that is not specified in the job’s base_parameters or the run-now override parameters, the default value from the notebook will be used. Retrieve these parameters in a notebook using dbutils.widgets.get.

TYPE: dict[str, Any]
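
The precedence described for base_parameters amounts to a layered dictionary merge: notebook defaults first, then base_parameters, then any run-now overrides. The sketch below illustrates that ordering; `effective_parameters` and the sample values are hypothetical, and the real merge happens on the Databricks side.

```python
def effective_parameters(notebook_defaults, base_parameters, run_now=None):
    """Later layers win: notebook defaults < base_parameters < run-now."""
    merged = dict(notebook_defaults)
    merged.update(base_parameters)
    merged.update(run_now or {})
    return merged


params = effective_parameters(
    notebook_defaults={"symbol": "AAPL", "period": "1d", "verbose": "false"},
    base_parameters={"symbol": "MSFT", "period": "5d"},
    run_now={"period": "1mo"},
)
print(params)
```

Here symbol comes from base_parameters, period from the run-now call, and verbose falls back to the notebook default.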

warehouse_id

The id of the SQL warehouse to execute this task. If a warehouse_id is specified, that SQL warehouse will be used to execute SQL commands inside the specified notebook.

TYPE: str

source

Location type of the notebook, can only be WORKSPACE or GIT. When set to WORKSPACE, the notebook will be retrieved from the local Databricks workspace. When set to GIT, the notebook will be retrieved from a Git repository defined in git_source. If the value is empty, the task will use GIT if git_source is defined and WORKSPACE otherwise.

TYPE: Literal['WORKSPACE', 'GIT']
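
The fallback rule for an empty source can be written as a small function. This is only a sketch of the behavior described above; `resolve_source` and its `git_source_defined` flag are assumptions made for illustration.

```python
from typing import Optional


def resolve_source(source: Optional[str], git_source_defined: bool) -> str:
    """Hypothetical helper mirroring the documented default for `source`."""
    if source:
        if source not in ("WORKSPACE", "GIT"):
            raise ValueError("source can only be WORKSPACE or GIT")
        return source
    # Empty value: use GIT if git_source is defined, WORKSPACE otherwise.
    return "GIT" if git_source_defined else "WORKSPACE"


print(resolve_source(None, git_source_defined=True))
print(resolve_source(None, git_source_defined=False))
print(resolve_source("WORKSPACE", git_source_defined=True))
```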


laktory.models.resources.databricks.job.JobTaskPipelineTask ¤

Bases: BaseModel

Job Task Pipeline specifications

ATTRIBUTE DESCRIPTION
pipeline_id

The pipeline's unique ID.

TYPE: str

full_refresh

Specifies if there should be full refresh of the pipeline.

TYPE: bool


laktory.models.resources.databricks.job.JobTaskRunJobTask ¤

Bases: BaseModel

Job Task Run Job Task specifications

ATTRIBUTE DESCRIPTION
job_id

ID of the job

TYPE: Union[int, str]

job_parameters

Job parameters for the task

TYPE: dict[str, Any]


laktory.models.resources.databricks.job.JobTaskSqlTaskQuery ¤

Bases: BaseModel

Job Task SQL Task Query specifications

ATTRIBUTE DESCRIPTION
query_id

Query ID

TYPE: str


laktory.models.resources.databricks.job.JobTaskSqlTaskAlertSubscription ¤

Bases: BaseModel

Job Task SQL Task Alert Subscription specifications

ATTRIBUTE DESCRIPTION
destination_id

TYPE: str

user_name

The email of an active workspace user. Non-admin users can only set this field to their own email.

TYPE: str


laktory.models.resources.databricks.job.JobTaskSQLTaskAlert ¤

Bases: BaseModel

Job Task SQL Task Alert specifications

ATTRIBUTE DESCRIPTION
alert_id

Identifier of the Databricks SQL Alert.

TYPE: str

subscriptions

A list of subscription blocks, each specifying one of the required fields: user_name for a user email or destination_id for an alert destination's identifier.

TYPE: list[JobTaskSqlTaskAlertSubscription]

pause_subscriptions

If True, subscriptions are paused.

TYPE: bool


laktory.models.resources.databricks.job.JobTaskSqlTaskDashboard ¤

Bases: BaseModel

Job Task SQL Task Dashboard specifications

ATTRIBUTE DESCRIPTION
dashboard_id

Identifier of the Databricks SQL Dashboard (databricks_sql_dashboard).

TYPE: str

custom_subject

Custom subject specifications

TYPE: list[JobTaskSqlTaskAlertSubscription]

subscriptions

Subscriptions specifications

TYPE: list[JobTaskSqlTaskAlertSubscription]


laktory.models.resources.databricks.job.JobTaskSqlTaskFile ¤

Bases: BaseModel

Job Task SQL Task File specifications

ATTRIBUTE DESCRIPTION
path

If source is GIT: Relative path to the file in the repository specified in the git_source block with SQL commands to execute. If source is WORKSPACE: Absolute path to the file in the workspace with SQL commands to execute.

TYPE: str

source

The source of the project. Possible values are WORKSPACE and GIT.

TYPE: Literal['WORKSPACE', 'GIT']


laktory.models.resources.databricks.job.JobTaskSQLTask ¤

Bases: BaseModel

Job Task SQL Task specifications

ATTRIBUTE DESCRIPTION
alert

Alert specifications

TYPE: JobTaskSQLTaskAlert

dashboard

Dashboard specifications

TYPE: JobTaskSqlTaskDashboard

file

File specifications

TYPE: JobTaskSqlTaskFile

parameters

Parameters specifications

TYPE: dict[str, Any]

query

Query specifications

TYPE: JobTaskSqlTaskQuery

warehouse_id

Warehouse id

TYPE: str


laktory.models.resources.databricks.job.JobTaskForEachTask ¤

Bases: BaseModel

For Each Task specifications

ATTRIBUTE DESCRIPTION
inputs

Array for the task to iterate on. This can be a JSON string or a reference to an array parameter. Laktory also supports a list input, which will be serialized.

TYPE: Union[str, list]

task

Task to run against the inputs list.

TYPE: JobTaskForEachTaskTask

concurrency

Controls the number of active iteration task runs. Default is 20, maximum allowed is 100.

TYPE: int
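
To make the inputs and concurrency rules concrete, the sketch below turns a list into a JSON string and rejects a concurrency above the documented maximum. It assumes the serialization is plain JSON; the helper names are hypothetical, laktory's internal implementation may differ, and the parameter reference string is only a placeholder.

```python
import json

MAX_CONCURRENCY = 100  # maximum stated in the concurrency description


def serialize_inputs(inputs):
    """Strings (JSON or a parameter reference) pass through; lists become JSON."""
    if isinstance(inputs, str):
        return inputs
    return json.dumps(inputs)


def check_concurrency(concurrency: int = 20) -> int:
    """Default of 20 mirrors the documented default."""
    if concurrency > MAX_CONCURRENCY:
        raise ValueError(f"concurrency cannot exceed {MAX_CONCURRENCY}")
    return concurrency


inputs = [{"id": 1, "name": "olivier"}, {"id": 2, "name": "kubic"}]
print(serialize_inputs(inputs))
print(serialize_inputs("{{job.parameters.names}}"))  # placeholder reference
print(check_concurrency())
```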


laktory.models.resources.databricks.job.JobTaskForEachTaskTask ¤

Bases: BaseModel

Job Task specifications

ATTRIBUTE DESCRIPTION
condition_task

Condition Task specifications

TYPE: JobTaskConditionTask

depends_ons

Depends On specifications

TYPE: list[JobTaskDependsOn]

description

Description of the task.

TYPE: str

email_notifications

Email Notifications specifications

TYPE: JobEmailNotifications

existing_cluster_id

Cluster id from one of the clusters available in the workspace

TYPE: str

health

Job Health specifications

TYPE: JobHealth

job_cluster_key

Identifier that can be referenced in task block, so that cluster is shared between tasks

TYPE: str

libraries

Cluster Library specifications

TYPE: list[ClusterLibrary]

max_retries

An optional maximum number of times to retry an unsuccessful run.

TYPE: int

min_retry_interval_millis

An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.

TYPE: int

notebook_task

Notebook Task specifications

TYPE: JobTaskNotebookTask

notification_settings

Notification Settings specifications

TYPE: JobNotificationSettings

pipeline_task

Pipeline Task specifications

TYPE: JobTaskPipelineTask

retry_on_timeout

If True, retry a job when it times out. The default behavior is to not retry on timeout.

TYPE: bool

run_if

An optional value indicating the condition that determines whether the task should be run once its dependencies have been completed. When omitted, defaults to ALL_SUCCESS.

TYPE: str

run_job_task

Run Job specifications

TYPE: JobTaskRunJobTask

sql_task

SQL Task specifications

TYPE: JobTaskSQLTask

task_key

A unique key for a given task.

TYPE: str

timeout_seconds

An optional timeout applied to each run of this job. The default behavior is to have no timeout.

TYPE: int


laktory.models.resources.databricks.job.JobTask ¤

Bases: JobTaskForEachTaskTask

Job Task specifications

ATTRIBUTE DESCRIPTION
for_each_task

For each task configuration

TYPE: JobTaskForEachTask


laktory.models.resources.databricks.job.JobTriggerFileArrival ¤

Bases: BaseModel

Job Trigger File Arrival

ATTRIBUTE DESCRIPTION
url

URL to be monitored for file arrivals. The path must point to the root or a subpath of the external location.

TYPE: str

min_time_between_triggers_seconds

If set, the trigger starts a run only after the specified amount of time passed since the last time the trigger fired. The minimum allowed value is 60 seconds.

TYPE: int

wait_after_last_change_seconds

If set, the trigger starts a run only after no file activity has occurred for the specified amount of time. This makes it possible to wait for a batch of incoming files to arrive before triggering a run. The minimum allowed value is 60 seconds.

TYPE: int
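
Both timing fields share the same 60-second minimum, so a configuration can be checked before deployment with a few lines of Python. The `check_file_arrival` helper and the sample URL are hypothetical and serve only to illustrate the constraint.

```python
MIN_SECONDS = 60  # minimum allowed value stated for both fields


def check_file_arrival(trigger: dict) -> list:
    """Return the names of timing fields that are set below the minimum."""
    too_small = []
    for field in (
        "min_time_between_triggers_seconds",
        "wait_after_last_change_seconds",
    ):
        value = trigger.get(field)
        # Unset fields are fine: both settings are optional.
        if value is not None and value < MIN_SECONDS:
            too_small.append(field)
    return too_small


trigger = {
    "url": "s3://my-bucket/landing/",  # placeholder location
    "min_time_between_triggers_seconds": 300,
    "wait_after_last_change_seconds": 30,
}
print(check_file_arrival(trigger))
```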


laktory.models.resources.databricks.job.JobTrigger ¤

Bases: BaseModel

Job Trigger

ATTRIBUTE DESCRIPTION
file_arrival

File Arrival specifications

TYPE: JobTriggerFileArrival

pause_status

Indicate whether this trigger is paused or not. When the pause_status field is omitted in the block, the server will default to using UNPAUSED as a value for pause_status.

TYPE: Union[Literal['PAUSED', 'UNPAUSED'], str]


laktory.models.resources.databricks.job.JobWebhookNotificationsOnDurationWarningThresholdExceeded ¤

Bases: BaseModel

Job Webhook Notifications On Duration Warning Threshold Exceeded specifications

ATTRIBUTE DESCRIPTION
id

Unique identifier

TYPE: str


laktory.models.resources.databricks.job.JobWebhookNotificationsOnFailure ¤

Bases: BaseModel

Job Webhook Notifications On Failure specifications

ATTRIBUTE DESCRIPTION
id

Unique identifier

TYPE: str


laktory.models.resources.databricks.job.JobWebhookNotificationsOnStart ¤

Bases: BaseModel

Job Webhook Notifications On Start specifications

ATTRIBUTE DESCRIPTION
id

Unique identifier

TYPE: str


laktory.models.resources.databricks.job.JobWebhookNotificationsOnSuccess ¤

Bases: BaseModel

Job Webhook Notifications On Success specifications

ATTRIBUTE DESCRIPTION
id

Unique identifier

TYPE: str


laktory.models.resources.databricks.job.JobWebhookNotifications ¤

Bases: BaseModel

Job Webhook Notifications specifications

ATTRIBUTE DESCRIPTION
on_duration_warning_threshold_exceededs

On duration warning threshold exceeded specifications

TYPE: list[JobWebhookNotificationsOnDurationWarningThresholdExceeded]

on_failures

On failure specifications

TYPE: list[JobWebhookNotificationsOnFailure]

on_starts

On starts specifications

TYPE: list[JobWebhookNotificationsOnStart]

on_successes

On successes specifications

TYPE: list[JobWebhookNotificationsOnSuccess]


laktory.models.resources.databricks.job.JobLookup ¤

Bases: ResourceLookup

ATTRIBUTE DESCRIPTION
id

The ID of the Databricks job

TYPE: str