Variables
When declaring models in Laktory, it's not always practical, desirable, or even possible to hardcode certain properties. For instance, in a pipeline declaration, the catalog name might depend on the environment where the pipeline is deployed. Similarly, the job declaration below hardcodes the dev environment in its name:
name: my-job-dev
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: 7dee2833-8287-4d50-8985-4a418e14f8c3
...
User-defined variables
Declaration
User-defined variables, or vars, are declared in a model using the ${vars.variable_name} syntax.
We can rewrite the job declaration by replacing the hardcoded environment dev with a variable named env, expressed as ${vars.env}:
name: my-job-${vars.env}
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: 7dee2833-8287-4d50-8985-4a418e14f8c3
...
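To make the substitution concrete, here is a minimal sketch of how a ${vars.variable_name} expression could be replaced in a declaration loaded as a plain dictionary. It is not Laktory's implementation: the `VAR_PATTERN` regular expression, the `substitute_vars` helper, and the choice to leave unknown variables untouched are all assumptions made for illustration.

```python
import re

# Matches ${vars.name}. Hypothetical pattern for this sketch, not Laktory's parser.
VAR_PATTERN = re.compile(r"\$\{vars\.([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_vars(value, variables):
    """Recursively replace ${vars.name} in strings nested in dicts and lists."""
    if isinstance(value, str):
        # Assumption: a variable with no known value is left as-is.
        return VAR_PATTERN.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value
        )
    if isinstance(value, dict):
        return {k: substitute_vars(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_vars(v, variables) for v in value]
    # Numbers, booleans, None, etc. are returned unchanged.
    return value


job = {
    "name": "my-job-${vars.env}",
    "tasks": [{"task_key": "pipeline"}],
}
resolved = substitute_vars(job, {"env": "dev"})
print(resolved["name"])  # my-job-dev
```

Walking the whole structure, rather than only the name field, means the same variable can appear in any string property of the declaration.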
Value Definition
There are several ways to define the value of these variables.
Environment variable
Laktory will search for a corresponding environment variable named VARIABLE_NAME, such as ENV in this case.
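The lookup described above can be sketched in a few lines. This is an illustration only: the `resolve_var` helper is hypothetical, the upper-casing rule is inferred from the env / ENV example, and the precedence shown here (explicit values first, environment second) is an assumption rather than documented Laktory behavior.

```python
import os


def resolve_var(name, variables=None):
    """Return the value for a variable, falling back to the environment."""
    variables = variables or {}
    if name in variables:
        # Assumption of this sketch: explicitly defined values win.
        return variables[name]
    # Fall back to the upper-cased environment variable, e.g. env -> ENV.
    return os.environ.get(name.upper())


os.environ["ENV"] = "dev"  # set here only to keep the sketch self-contained
print(resolve_var("env"))  # dev
```

In practice the environment variable would be set by the shell or the CI/CD system running the deployment, not from within Python.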
Model variables
Once a model is instantiated, variables can be directly assigned to the model:
from laktory import models

with open("my-job.yaml", "r") as fp:
    job = models.Job.model_validate_yaml(fp)

job.variables = {"env": "dev"}
Stack variables
If the model is part of a stack, variables can be defined at the stack level:
from laktory import models

with open("my-job.yaml", "r") as fp:
    job = models.Job.model_validate_yaml(fp)

stack = models.Stack(
    name="my-stack", resources={"jobs": {"my-job": job}}, variables={"env": "dev"}
)
Pipeline Nodes Variables
When using SQL statements to define transformations for a DataFrame, it’s often necessary to reference the output of other pipeline nodes or the output of the previous transformer node. Laktory supports this by using {df} to reference the previous node’s DataFrame and {nodes.node_name} to reference the DataFrame from other specific pipeline nodes.
Here is an example:
SELECT
    *
FROM
    {df}
UNION
SELECT * FROM {nodes.node_01}
UNION
SELECT * FROM {nodes.node_02}
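As a rough illustration of what these placeholders stand for, the sketch below renders such a statement by swapping {df} and {nodes.node_name} for view names. The `render_sql` helper and the view names are hypothetical and only show the idea; how Laktory actually registers DataFrames and resolves the references is not described here.

```python
import re


def render_sql(sql, df_view, node_views):
    """Replace {df} and {nodes.<name>} placeholders with view names."""

    def replace_node(match):
        # Raises KeyError when the statement references an unknown node.
        return node_views[match.group(1)]

    sql = re.sub(r"\{nodes\.([A-Za-z0-9_]+)\}", replace_node, sql)
    return sql.replace("{df}", df_view)


sql = "SELECT * FROM {df} UNION SELECT * FROM {nodes.node_01}"
rendered = render_sql(sql, "previous_df", {"node_01": "node_01_view"})
print(rendered)
# SELECT * FROM previous_df UNION SELECT * FROM node_01_view
```

Failing loudly on an unknown node name is a design choice of this sketch: a misspelled reference surfaces before the query runs.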
Resource variables
What if the value of a variable needs to be the output of another deployed resource? Laktory supports resource variables to address this need.
Declaration
Resource variables are declared using the notation ${resources.resource_name.resource_output}. For instance, our earlier job example could dynamically reference a deployed pipeline:
name: my-job-${vars.env}
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: ${resources.my-pipeline.id}
...
Here, my-pipeline is the resource name and id is the referenced output.
Value Definition
Unlike user-defined variables, values for resource variables are populated automatically by Laktory. The available outputs correspond to those generated by the selected Infrastructure-as-Code backend (Pulumi or Terraform). For resource x to be available, it must be deployed as part of the current stack.
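Because a resource variable only resolves when the referenced resource belongs to the current stack, it can help to list the references a declaration contains. The following sketch extracts the resource name and output from the ${resources.resource_name.resource_output} notation. The `RESOURCE_PATTERN` expression, the `find_resource_refs` helper, and the set of characters allowed in names are assumptions for illustration, not part of Laktory.

```python
import re

# Hypothetical pattern: resource names may contain hyphens, outputs may not.
RESOURCE_PATTERN = re.compile(
    r"\$\{resources\.([A-Za-z0-9_-]+)\.([A-Za-z0-9_]+)\}"
)


def find_resource_refs(text):
    """Return (resource_name, resource_output) pairs referenced in text."""
    return RESOURCE_PATTERN.findall(text)


refs = find_resource_refs("pipeline_id: ${resources.my-pipeline.id}")
print(refs)  # [('my-pipeline', 'id')]
```

Comparing the extracted names against the resources declared in the stack is one way to catch a dangling reference before deployment.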
Variable injection
Variable values are typically injected during deployment, just after serialization (model_dump). However, injection can also be triggered manually by invoking the job.inject_vars() method.