MemoryDataSource

laktory.models.datasources.MemoryDataSource

Bases: BaseDataSource

Data source backed by an in-memory DataFrame, generally used in the context of a data pipeline. The DataFrame is either built from serialized `data` or supplied directly as `df`.

ATTRIBUTE	TYPE	DESCRIPTION
`data`	`Union[dict[str, list[Any]], list[dict[str, Any]]]`	Serialized data used to build the input DataFrame
`df`	`Any`	Input DataFrame

Examples:

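import polars as pl
from pyspark.sql import SparkSession

from laktory import models

# Active Spark session used by the Spark example below
spark = SparkSession.builder.getOrCreate()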

data = {
    "symbol": ["AAPL", "GOOGL"],
    "price": [200.0, 205.0],
    "tstamp": ["2023-09-01", "2023-09-01"],
}

# Spark DataFrame built from the serialized `data` dict
source = models.MemoryDataSource(
    data=data,
    dataframe_backend="SPARK",
)
df = source.read(spark=spark)
print(df.laktory.show_string())
'''
+-----+------+----------+
|price|symbol|    tstamp|
+-----+------+----------+
|200.0|  AAPL|2023-09-01|
|205.0| GOOGL|2023-09-01|
+-----+------+----------+
'''

# Polars DataFrame passed directly as `df`
source = models.MemoryDataSource(
    df=pl.DataFrame(data),
)
df = source.read()
print(df.to_pandas())
'''
  symbol  price      tstamp
0   AAPL  200.0  2023-09-01
1  GOOGL  205.0  2023-09-01
'''
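
The `data` attribute is typed to accept two layouts: a column-oriented mapping (`dict[str, list[Any]]`, as used above) or a row-oriented list of records (`list[dict[str, Any]]`). The sketch below only illustrates how the same table looks in each layout. The helpers `rows_to_columns` and `columns_to_rows` are hypothetical names introduced for this illustration, are not part of laktory, and assume every row carries the same keys.

```python
from typing import Any


# Hypothetical helpers for illustration only; not part of laktory.
def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Convert row-oriented records into a column-oriented mapping.

    Assumes every row has the same keys.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def columns_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Convert a column-oriented mapping into row-oriented records."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


# Column-oriented layout: one list of values per column name
column_data = {
    "symbol": ["AAPL", "GOOGL"],
    "price": [200.0, 205.0],
    "tstamp": ["2023-09-01", "2023-09-01"],
}

# Row-oriented layout: one dict per record
row_data = columns_to_rows(column_data)
print(row_data[0])
# {'symbol': 'AAPL', 'price': 200.0, 'tstamp': '2023-09-01'}
```

Based on the type annotation, either `column_data` or `row_data` should be acceptable as the `data` argument; which one is more convenient likely depends on whether the upstream values arrive as columns or as records.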