Variables

When declaring models in Laktory, it's not always practical, desirable, or even possible to hardcode certain properties. For instance, in a pipeline declaration, the catalog name might depend on the environment where the pipeline is deployed.

name: my-job-dev
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: 7dee2833-8287-4d50-8985-4a418e14f8c3

...

In many cases, you’ll want to reuse the same configuration file across different environments, but with different values for properties such as the job name and the pipeline ID. Laktory introduces model variables to solve this problem.

User-defined variables

Declaration

User-defined variables, or vars, are declared in a model using the ${vars.variable_name} syntax.

We can rewrite the job declaration by replacing the hardcoded environment name dev with a variable named env, written as ${vars.env}:

name: my-job-${vars.env}
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: 7dee2833-8287-4d50-8985-4a418e14f8c3

...

Value Definition

There are several ways to define the value of these variables.

Environment variable

Laktory searches for an environment variable whose name is the variable name in upper case (VARIABLE_NAME). For ${vars.env}, that is ENV.
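The lookup described above can be pictured with a small, self-contained sketch. It is not Laktory's implementation: the regular expression, the `resolve_from_env` helper, and the choice to leave unresolved placeholders untouched are all assumptions made for illustration.

```python
import os
import re

# Illustrative pattern for ${vars.some_name}; an assumption, not Laktory's own parser.
_VAR_PATTERN = re.compile(r"\$\{vars\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_from_env(text):
    """Replace each ${vars.name} with the value of the NAME environment variable."""

    def _lookup(match):
        env_name = match.group(1).upper()
        # Assumption: keep the placeholder as-is when the variable is not set.
        return os.environ.get(env_name, match.group(0))

    return _VAR_PATTERN.sub(_lookup, text)


os.environ["ENV"] = "dev"
print(resolve_from_env("my-job-${vars.env}"))  # prints "my-job-dev"
```

In practice, the environment variable would be set outside the Python process, for example in the shell or the CI job that runs the deployment.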

Model variables

Once a model is instantiated, variables can be directly assigned to the model:

main.py
from laktory import models

with open("my-job.yaml", "r") as fp:
    job = models.Job.model_validate_yaml(fp)

job.variables = {"env": "dev"}

Stack variables

If the model is part of a stack, variables can be defined at the stack level:

main.py
from laktory import models

with open("my-job.yaml", "r") as fp:
    job = models.Job.model_validate_yaml(fp)

stack = models.Stack(
    name="my-stack", resources={"jobs": {"my-job": job}}, variables={"env": "dev"}
)

Pipeline Nodes Variables

When using SQL statements to define transformations for a DataFrame, it’s often necessary to reference the output of other pipeline nodes or the output of the previous transformer node. Laktory supports this by using {df} to reference the previous node’s DataFrame and {nodes.node_name} to reference the DataFrame from other specific pipeline nodes.

Here is an example:

SELECT
    *
FROM
    {df}
UNION
    SELECT * FROM {nodes.node_01}
UNION
    SELECT * FROM {nodes.node_02}
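To make the placeholder mechanics concrete, here is a minimal sketch of how such a statement could be rendered into plain SQL. The `render_sql` helper and the view names are hypothetical and only illustrate the substitution idea; Laktory's actual handling of {df} and {nodes.node_name} may differ.

```python
import re


def render_sql(sql, df_view, node_views):
    """Replace {df} and {nodes.<name>} placeholders with view names.

    df_view and node_views are hypothetical inputs used for illustration.
    """

    def _node(match):
        name = match.group(1)
        if name not in node_views:
            raise KeyError(f"unknown pipeline node: {name}")
        return node_views[name]

    # Substitute node references first, then the previous node's DataFrame.
    sql = re.sub(r"\{nodes\.([A-Za-z0-9_]+)\}", _node, sql)
    return sql.replace("{df}", df_view)


query = "SELECT * FROM {df} UNION SELECT * FROM {nodes.node_01}"
print(render_sql(query, "previous_df", {"node_01": "node_01_view"}))
# prints "SELECT * FROM previous_df UNION SELECT * FROM node_01_view"
```

Raising an error on an unknown node name is a design choice of this sketch: a misspelled reference fails early instead of producing invalid SQL.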

Resource variables

What if the value of a variable needs to be the output of another deployed resource? Laktory supports resource variables to address this need.

Declaration

Resource variables are declared using the notation ${resources.resource_name.resource_output}. For instance, our earlier job example could dynamically reference a deployed pipeline:

name: my-job-${vars.env}
tasks:
  - task_key: pipeline
    pipeline_task:
      pipeline_id: ${resources.my-pipeline.id}
...

Here, the static pipeline ID is replaced by a dynamic reference to the pipeline resource my-pipeline.

Value Definition

Unlike user-defined variables, values for resource variables are populated automatically by Laktory. The available outputs correspond to those generated by the selected Infrastructure-as-Code backend (Pulumi or Terraform). For the outputs of a resource to be available, that resource must be deployed as part of the current stack.

Variable injection

API Documentation

laktory.models.BaseModel.inject_vars

Variable values are typically injected during deployment, just after serialization (model_dump). However, injection can also be triggered manually by calling the model's inject_vars() method, for example job.inject_vars().
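The idea of injecting values after serialization can be sketched without Laktory itself. The `inject` function below is a hypothetical stand-in, not the library's `inject_vars`: it assumes the serialized model is a nested structure of dicts, lists, and strings, and it only handles ${vars.name} placeholders.

```python
def inject(value, variables):
    """Recursively replace ${vars.name} placeholders in a serialized model."""
    if isinstance(value, str):
        for name, replacement in variables.items():
            value = value.replace("${vars." + name + "}", str(replacement))
        return value
    if isinstance(value, dict):
        return {key: inject(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [inject(item, variables) for item in value]
    # Numbers, booleans, and None are returned unchanged.
    return value


# Stand-in for the output of a model dump of the job declared earlier.
dumped = {
    "name": "my-job-${vars.env}",
    "tasks": [
        {
            "task_key": "pipeline",
            "pipeline_task": {"pipeline_id": "7dee2833-8287-4d50-8985-4a418e14f8c3"},
        }
    ],
}

injected = inject(dumped, {"env": "dev"})
print(injected["name"])  # prints "my-job-dev"
```

Working on the serialized form rather than on the model means the original declaration keeps its placeholders, so the same object could be injected again with different values.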