MemoryDataSource
laktory.models.datasources.MemoryDataSource
Bases: BaseDataSource
Data source using an in-memory DataFrame, generally used in the context of a data pipeline.
ATTRIBUTE | DESCRIPTION |
---|---|
data | Serialized data to build input DataFrame |
df | Input DataFrame |
Examples:
import polars as pl
from laktory import models
data = {
    "symbol": ["AAPL", "GOOGL"],
    "price": [200.0, 205.0],
    "tstamp": ["2023-09-01", "2023-09-01"],
}
# Spark from dict
# (`spark` is assumed to be an existing SparkSession)
source = models.MemoryDataSource(
    data=data,
    dataframe_backend="SPARK",
)
df = source.read(spark=spark)
print(df.laktory.show_string())
'''
+-----+------+----------+
|price|symbol|    tstamp|
+-----+------+----------+
|200.0|  AAPL|2023-09-01|
|205.0| GOOGL|2023-09-01|
+-----+------+----------+
'''
# Polars from df
source = models.MemoryDataSource(
    df=pl.DataFrame(data),
)
df = source.read()
print(df.to_pandas())
'''
  symbol  price      tstamp
0   AAPL  200.0  2023-09-01
1  GOOGL  205.0  2023-09-01
'''
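The `data` dict used above is column-oriented: each key is a column name and each list holds that column's values. As a rough illustration of how such a mapping corresponds to the rows printed in the outputs, here is a minimal sketch in plain Python. It does not use laktory, and `columns_to_rows` is a hypothetical helper introduced only for this example, not part of the library.

```python
from typing import Any


def columns_to_rows(data: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper (not part of laktory): pivot columns into row records."""
    # Every column must hold the same number of values to form complete rows.
    lengths = {len(values) for values in data.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of values")
    names = list(data)
    # zip(*columns) walks the columns in parallel, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*data.values())]


data = {
    "symbol": ["AAPL", "GOOGL"],
    "price": [200.0, 205.0],
    "tstamp": ["2023-09-01", "2023-09-01"],
}
rows = columns_to_rows(data)
print(rows[0])
```

Each resulting record matches one row of the DataFrame outputs shown in the examples, which can help when checking that serialized data has consistent column lengths before handing it to a data source.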