# DLT Pipeline

## laktory.models.resources.databricks.DLTPipeline

Bases: `BaseModel`, `PulumiResource`, `TerraformResource`

Databricks Delta Live Tables (DLT) Pipeline
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `access_controls` | Pipeline access controls |
| `allow_duplicate_names` | If `False`, deployment will fail if the name conflicts with that of another pipeline. |
| `catalog` | Name of the unity catalog storing the pipeline tables |
| `channel` | Name of the release channel for Spark version used by DLT pipeline. |
| `clusters` | Clusters to run the pipeline. If none is specified, pipelines will automatically select a default cluster configuration for the pipeline. |
| `configuration` | List of values to apply to the entire pipeline. Elements must be formatted as key:value pairs. |
| `continuous` | If `True`, the pipeline runs continuously. |
| `development` | If `True`, the pipeline runs in development mode. |
| `edition` | Name of the product edition |
| `libraries` | Specifies pipeline code (notebooks) and required artifacts. |
| `name` | Pipeline name |
| `name_prefix` | Prefix added to the DLT pipeline name |
| `name_suffix` | Suffix added to the DLT pipeline name |
| `notifications` | Notifications specifications |
| `photon` | If `True`, Photon engine is enabled. |
| `serverless` | If `True`, the pipeline runs on serverless compute. |
| `storage` | A location on DBFS or cloud storage where output data and metadata required for pipeline execution are stored. By default, tables are stored in a subdirectory of this location. Changing this parameter forces recreation of the pipeline. (Conflicts with `catalog`) |
| `target` | The name of a database (in either the Hive metastore or in a UC catalog) for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI. |
Examples:

Assuming the pipeline configuration is defined in YAML:

```python
import io
from laktory import models

# Define pipeline
pipeline_yaml = '''
name: pl-stock-prices
catalog: dev
target: finance

clusters:
  - name: default
    node_type_id: Standard_DS3_v2
    autoscale:
      min_workers: 1
      max_workers: 2

libraries:
  - notebook:
      path: /pipelines/dlt_brz_template.py
  - notebook:
      path: /pipelines/dlt_slv_template.py
  - notebook:
      path: /pipelines/dlt_gld_stock_performances.py

access_controls:
  - group_name: account users
    permission_level: CAN_VIEW
  - group_name: role-engineers
    permission_level: CAN_RUN
'''

pipeline = models.resources.databricks.DLTPipeline.model_validate_yaml(
    io.StringIO(pipeline_yaml)
)
```
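For comparison, the same model can also be built directly from keyword arguments. A minimal sketch, assuming standard pydantic validation semantics (laktory models are pydantic models); the field names mirror the YAML keys above:

```python
# A minimal sketch, assuming standard pydantic validation: nested dicts
# are coerced into the corresponding spec models (e.g. PipelineLibrary).
pipeline = models.resources.databricks.DLTPipeline(
    name="pl-stock-prices",
    catalog="dev",
    target="finance",
    libraries=[
        {"notebook": {"path": "/pipelines/dlt_brz_template.py"}},
    ],
)
print(pipeline.name)  # pl-stock-prices
```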
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `resource_type_id` | `dlt` |
| `additional_core_resources` | permissions |
### laktory.models.resources.databricks.dltpipeline.PipelineLibraryFile

### laktory.models.resources.databricks.dltpipeline.PipelineLibraryNotebook

### laktory.models.resources.databricks.dltpipeline.PipelineLibrary

Bases: `BaseModel`

Pipeline Library specifications
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `file` | File specifications |
| `notebook` | Notebook specifications |
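A library entry selects a single source, either a `file` or a `notebook`. As an illustration, a minimal sketch assuming the module path shown in the heading above and standard pydantic validation:

```python
# A minimal sketch, assuming the module path shown in the heading above.
from laktory.models.resources.databricks.dltpipeline import PipelineLibrary

# The nested dict is validated into the notebook specification, mirroring
# the `libraries` entries of the YAML example earlier on this page.
library = PipelineLibrary.model_validate(
    {"notebook": {"path": "/pipelines/dlt_brz_template.py"}}
)
```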
### laktory.models.resources.databricks.dltpipeline.PipelineNotifications
### laktory.models.resources.databricks.dltpipeline.PipelineCluster

Bases: `Cluster`

Pipeline Cluster. Same attributes as `laktory.models.Cluster`, except for

* `autotermination_minutes`
* `cluster_id`
* `data_security_mode`
* `enable_elastic_disk`
* `idempotency_token`
* `is_pinned`
* `libraries`
* `no_wait`
* `node_type_id`
* `runtime_engine`
* `single_user_name`
* `spark_version`

which are not allowed.
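As an illustration, a pipeline cluster is declared using only the remaining, allowed attributes. A minimal sketch, assuming the module path shown in the heading above and that `autoscale` and `spark_conf` are inherited unchanged from `laktory.models.Cluster`:

```python
# A minimal sketch, assuming the module path shown in the heading above
# and that `autoscale`/`spark_conf` behave as on laktory.models.Cluster.
from laktory.models.resources.databricks.dltpipeline import PipelineCluster

cluster = PipelineCluster(
    name="default",
    autoscale={"min_workers": 1, "max_workers": 2},
    # Disallowed attributes (e.g. `spark_version`, `cluster_id`) are
    # managed by DLT and must be omitted.
    spark_conf={"spark.sql.shuffle.partitions": "8"},
)
```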