Job
laktory.models.resources.databricks.Job
Bases: BaseModel, PulumiResource, TerraformResource
Databricks Job
| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| access_controls | Access Controls specifications |
| clusters | A list of job databricks.Cluster specifications that can be shared and reused by tasks of this job. Libraries cannot be declared in a shared job cluster. You must declare dependent libraries in task settings. |
| continuous | Continuous specifications |
| control_run_state | If |
| description | An optional description for the job. The maximum length is 1024 characters in UTF-8 encoding. |
| email_notifications | An optional set of email addresses notified when runs of this job begin, complete or fail. The default behavior is to not send any emails. This field is a block and is documented below. |
| format | |
| health | Health specifications |
| lookup_existing | Specifications for looking up an existing resource. Other attributes will be ignored. |
| max_concurrent_runs | An optional maximum allowed number of concurrent runs of the job. Defaults to 1. |
| max_retries | An optional maximum number of times to retry an unsuccessful run. A run is considered unsuccessful if it completes with a FAILED or INTERNAL_ERROR lifecycle state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. A run can have the following lifecycle states: PENDING, RUNNING, TERMINATING, TERMINATED, SKIPPED or INTERNAL_ERROR. |
| min_retry_interval_millis | An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried. |
| name | Name of the job |
| name_prefix | Prefix added to the job name |
| name_suffix | Suffix added to the job name |
| notification_settings | Notifications specifications |
| parameters | Parameters specifications |
| retry_on_timeout | An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout. |
| run_as | Run as specifications |
| schedule | Schedule specifications |
| tags | Tags as key-value pairs |
| tasks | Tasks specifications |
| timeout_seconds | An optional timeout applied to each run of this job. The default behavior is to have no timeout. |
| trigger | Trigger specifications |
| webhook_notifications | Webhook notifications specifications |
Examples:
```python
import io
from laktory import models

# Define job
job_yaml = '''
name: job-stock-prices
clusters:
  - name: main
    spark_version: 14.0.x-scala2.12
    node_type_id: Standard_DS3_v2

tasks:
  - task_key: ingest
    job_cluster_key: main
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
    libraries:
      - pypi:
          package: yfinance

  - task_key: pipeline
    depends_ons:
      - task_key: ingest
    pipeline_task:
      pipeline_id: 74900655-3641-49f1-8323-b8507f0e3e3b

access_controls:
  - group_name: account users
    permission_level: CAN_VIEW
  - group_name: role-engineers
    permission_level: CAN_MANAGE_RUN
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))

# Define job with for each task
job_yaml = '''
name: job-hello
tasks:
  - task_key: hello-loop
    for_each_task:
      inputs:
        - id: 1
          name: olivier
        - id: 2
          name: kubic
      task:
        task_key: hello-task
        notebook_task:
          notebook_path: /Workspace/Users/olivier.soucy@okube.ai/hello-world
          base_parameters:
            input: "{{input}}"
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```
References
| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| additional_core_resources | |
laktory.models.resources.databricks.job.JobCluster
Bases: Cluster

Job Cluster. Same attributes as laktory.models.Cluster, except for access_controls, is_pinned, libraries and no_wait, which are not allowed.
laktory.models.resources.databricks.job.JobContinuous
Bases: BaseModel

Job Continuous specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| pause_status | Indicate whether this continuous job is paused or not. When the pause_status field is omitted in the block, the server will default to using UNPAUSED as a value for pause_status. |
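For illustration, a minimal sketch of a continuous job declared with Laktory YAML; the job name, task key and pipeline id below are placeholders, not values from the source:

```python
import io
from laktory import models

# Hypothetical continuous job: the pipeline id is a placeholder
job_yaml = '''
name: job-stock-prices-continuous
continuous:
  pause_status: UNPAUSED
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: 00000000-0000-0000-0000-000000000000
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```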
laktory.models.resources.databricks.job.JobEmailNotifications
Bases: BaseModel

Job Email Notifications specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| no_alert_for_skipped_runs | If true, do not send email notifications for skipped runs. |
| on_duration_warning_threshold_exceededs | List of emails to notify when the duration of a run exceeds the threshold specified by the RUN_DURATION_SECONDS metric in the health block. |
| on_failures | List of emails to notify when the run fails. |
| on_starts | List of emails to notify when the run starts. |
| on_successes | List of emails to notify when the run completes successfully. |
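As a hedged example, email notifications might be attached to a job as follows; the addresses and task are placeholders and only the field names come from the table above:

```python
import io
from laktory import models

# Hypothetical job: notify a distribution list on start and on failure
job_yaml = '''
name: job-with-email-notifications
email_notifications:
  on_starts:
    - data-team@example.com
  on_failures:
    - data-team@example.com
  no_alert_for_skipped_runs: true
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```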
laktory.models.resources.databricks.job.JobHealthRule
Bases: BaseModel

Job Health Rule specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| metric | Metric to check. The only supported metric is RUN_DURATION_SECONDS (check the Jobs REST API documentation for the latest information). |
| op | Operation used to compare operands. Currently, the following operators are supported: EQUAL_TO, GREATER_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, NOT_EQUAL. |
| value | Value used to compare to the given metric. |
laktory.models.resources.databricks.job.JobHealth
Bases: BaseModel

Job Health specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| rules | Job health rules specifications |
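A minimal sketch of a health rule that flags runs longer than 30 minutes, using the metric, op and value fields documented above; the job name and task are placeholders:

```python
import io
from laktory import models

# Hypothetical job with a single health rule on run duration
job_yaml = '''
name: job-with-health-rule
health:
  rules:
    - metric: RUN_DURATION_SECONDS
      op: GREATER_THAN
      value: 1800
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```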
laktory.models.resources.databricks.job.JobNotificationSettings
laktory.models.resources.databricks.job.JobParameter
laktory.models.resources.databricks.job.JobRunAs
Bases: BaseModel

Job Run As specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| service_principal_name | The application ID of an active service principal. Setting this field requires the servicePrincipal/user role. |
| user_name | The email of an active workspace user. Non-admin users can only set this field to their own email. |
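For illustration, a job can be configured to run as a service principal; the application ID below is a placeholder:

```python
import io
from laktory import models

# Hypothetical job running under a service principal identity
job_yaml = '''
name: job-run-as-service-principal
run_as:
  service_principal_name: 00000000-0000-0000-0000-000000000000
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```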
laktory.models.resources.databricks.job.JobSchedule
Bases: BaseModel

Job Schedule specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| quartz_cron_expression | A Cron expression using Quartz syntax that describes the schedule for a job. This field is required. |
| timezone_id | A Java timezone ID. The schedule for a job will be resolved with respect to this timezone. See Java TimeZone for details. This field is required. |
| pause_status | Indicate whether this schedule is paused or not. When the pause_status field is omitted and a schedule is provided, the server will default to using UNPAUSED as a value for pause_status. |
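A minimal sketch of a scheduled job running daily at 06:00 Montreal time; the cron expression, timezone and task are illustrative values, not from the source:

```python
import io
from laktory import models

# Hypothetical daily schedule using a Quartz cron expression
job_yaml = '''
name: job-scheduled
schedule:
  quartz_cron_expression: "0 0 6 * * ?"
  timezone_id: America/Montreal
  pause_status: UNPAUSED
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```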
laktory.models.resources.databricks.job.JobTaskConditionTask
Bases: BaseModel

Job Task Condition Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| left | The left operand of the condition task. It could be a string value, job state, or a parameter reference. |
| op | The string specifying the operation used to compare operands. This task does not require a cluster to execute and does not support retries or notifications. |
| right | The right operand of the condition task. It could be a string value, job state, or parameter reference. |
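A hedged sketch of a condition task gating a downstream task; the parameter reference, operands and notebook path are placeholders. It also uses the outcome field of the depends-on block documented in the next section:

```python
import io
from laktory import models

# Hypothetical gate: run the publish task only when the condition evaluates to true
job_yaml = '''
name: job-with-condition
tasks:
  - task_key: check-env
    condition_task:
      left: "{{job.parameters.env}}"
      op: EQUAL_TO
      right: prod
  - task_key: publish
    depends_ons:
      - task_key: check-env
        outcome: "true"
    notebook_task:
      notebook_path: /jobs/publish.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```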
laktory.models.resources.databricks.job.JobTaskDependsOn
Bases: BaseModel

Job Task Depends On specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| task_key | The name of the task this task depends on. |
| outcome | Can only be specified on condition task dependencies. The outcome of the dependent task that must be met for this task to run. |
laktory.models.resources.databricks.job.JobTaskNotebookTask
Bases: BaseModel

Job Task Notebook Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| notebook_path | The path of the databricks.Notebook to be run in the Databricks workspace or remote repository. For notebooks stored in the Databricks workspace, the path must be absolute and begin with a slash. For notebooks stored in a remote repository, the path must be relative. |
| base_parameters | Base parameters to be used for each run of this job. If the run is initiated by a call to run-now with parameters specified, the two parameters maps will be merged. If the same key is specified in base_parameters and in run-now, the value from run-now will be used. If the notebook takes a parameter that is not specified in the job's base_parameters or the run-now override parameters, the default value from the notebook will be used. Retrieve these parameters in a notebook using dbutils.widgets.get. |
| warehouse_id | The id of the SQL warehouse to execute this task. If a warehouse_id is specified, that SQL warehouse will be used to execute SQL commands inside the specified notebook. |
| source | Location type of the notebook, can only be WORKSPACE or GIT. When set to WORKSPACE, the notebook will be retrieved from the local Databricks workspace. When set to GIT, the notebook will be retrieved from a Git repository defined in git_source. If the value is empty, the task will use GIT if git_source is defined and WORKSPACE otherwise. |
laktory.models.resources.databricks.job.JobTaskPipelineTask
laktory.models.resources.databricks.job.JobTaskRunJobTask
laktory.models.resources.databricks.job.JobTaskSqlTaskQuery
laktory.models.resources.databricks.job.JobTaskSqlTaskAlertSubscription
laktory.models.resources.databricks.job.JobTaskSQLTaskAlert
Bases: BaseModel

Job Task SQL Task Alert specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| alert_id | Identifier of the Databricks SQL Alert. |
| subscriptions | A list of subscription blocks, each consisting of one of the required fields. |
| pause_subscriptions | If true, the alert subscriptions are paused. |
laktory.models.resources.databricks.job.JobTaskSqlTaskDashboard
laktory.models.resources.databricks.job.JobTaskSqlTaskFile
Bases: BaseModel

Job Task SQL Task File specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| path | If source is |
| source | The source of the project. Possible values are WORKSPACE and GIT. |
laktory.models.resources.databricks.job.JobTaskSQLTask
Bases: BaseModel

Job Task SQL Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| alert | Alert specifications |
| dashboard | Dashboard specifications |
| file | File specifications |
| parameters | Parameters specifications |
| query | Query specifications |
| warehouse_id | Warehouse id |
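For illustration, a SQL task running a saved query on a SQL warehouse. The warehouse_id and query fields come from the table above; the nested query_id field name is an assumption (the JobTaskSqlTaskQuery attributes are not listed here) and both identifiers are placeholders:

```python
import io
from laktory import models

# Hypothetical SQL task: query_id field name assumed, ids are placeholders
job_yaml = '''
name: job-sql-report
tasks:
  - task_key: refresh-report
    sql_task:
      warehouse_id: 1a2b3c4d5e6f7a8b
      query:
        query_id: 00000000-0000-0000-0000-000000000000
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```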
laktory.models.resources.databricks.job.JobTaskForEachTask
Bases: BaseModel

For Each Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| inputs | Array for the task to iterate on. This can be a JSON string or a reference to an array parameter. Laktory also supports a list input, which will be serialized. |
| task | Task to run against the inputs list. |
| concurrency | Controls the number of active iteration task runs. Default is 20, maximum allowed is 100. |
laktory.models.resources.databricks.job.JobTaskForEachTaskTask
Bases: BaseModel

Job Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| condition_task | Condition Task specifications |
| depends_ons | Depends On specifications |
| description | Description specifications |
| email_notifications | Email Notifications specifications |
| existing_cluster_id | Cluster id from one of the clusters available in the workspace |
| health | Job Health specifications |
| job_cluster_key | Identifier that can be referenced in task block, so that cluster is shared between tasks |
| libraries | Cluster Library specifications |
| max_retries | An optional maximum number of times to retry an unsuccessful run. |
| min_retry_interval_millis | An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried. |
| notebook_task | Notebook Task specifications |
| notification_settings | Notification Settings specifications |
| pipeline_task | Pipeline Task specifications |
| retry_on_timeout | If true, retry the task when it times out. |
| run_if | An optional value indicating the condition that determines whether the task should be run once its dependencies have been completed. When omitted, defaults to ALL_SUCCESS. |
| run_job_task | Run Job specifications |
| sql_task | SQL Task specifications |
| task_key | A unique key for a given task. |
| timeout_seconds | An optional timeout applied to each run of this job. The default behavior is to have no timeout. |
laktory.models.resources.databricks.job.JobTask
Bases: JobTaskForEachTaskTask

Job Task specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| for_each_task | For each task configuration |
laktory.models.resources.databricks.job.JobTriggerFileArrival
Bases: BaseModel

Job Trigger File Arrival

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| url | URL of the storage location to be monitored for file arrivals. |
| min_time_between_triggers_seconds | If set, the trigger starts a run only after the specified amount of time has passed since the last time the trigger fired. The minimum allowed value is 60 seconds. |
| wait_after_last_change_seconds | If set, the trigger starts a run only after no file activity has occurred for the specified amount of time. This makes it possible to wait for a batch of incoming files to arrive before triggering a run. The minimum allowed value is 60 seconds. |
laktory.models.resources.databricks.job.JobTrigger
Bases: BaseModel

Job Trigger

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| file_arrival | File Arrival specifications |
| pause_status | Indicate whether this trigger is paused or not. When the pause_status field is omitted in the block, the server will default to using UNPAUSED as a value for pause_status. |
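A minimal sketch of a file-arrival trigger; the storage URL, task and notebook path are placeholders:

```python
import io
from laktory import models

# Hypothetical trigger: start a run when new files land in the monitored location
job_yaml = '''
name: job-file-arrival
trigger:
  pause_status: UNPAUSED
  file_arrival:
    url: abfss://landing@myaccount.dfs.core.windows.net/stock_prices/
    min_time_between_triggers_seconds: 60
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```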
laktory.models.resources.databricks.job.JobWebhookNotificationsOnDurationWarningThresholdExceeded
laktory.models.resources.databricks.job.JobWebhookNotificationsOnFailure
laktory.models.resources.databricks.job.JobWebhookNotificationsOnStart
laktory.models.resources.databricks.job.JobWebhookNotificationsOnSuccess
laktory.models.resources.databricks.job.JobWebhookNotifications
Bases: BaseModel

Job Webhook Notifications specifications

| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| on_duration_warning_threshold_exceededs | Warnings threshold exceeded specifications |
| on_failures | On failure specifications |
| on_starts | On starts specifications |
| on_successes | On successes specifications |
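As a hedged example, webhook notifications might be declared as below; each on_failures entry is assumed to take the id of a notification destination (the sub-model attributes are not listed above) and the id, task and notebook path are placeholders:

```python
import io
from laktory import models

# Hypothetical webhook notification: `id` field name assumed, value is a placeholder
job_yaml = '''
name: job-with-webhooks
webhook_notifications:
  on_failures:
    - id: 00000000-0000-0000-0000-000000000000
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: /jobs/ingest_stock_prices.py
'''
job = models.resources.databricks.Job.model_validate_yaml(io.StringIO(job_yaml))
```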