Kedro Snowflake data classes
kedro-snowflake natively supports Kedro’s official SnowparkTableDataSet and adds a few new classes that make it easier to use Snowflake with Kedro. These classes can be found in the kedro_snowflake.datasets.native module.
For details on usage, see the API Reference below.
API Reference
- class kedro_snowflake.datasets.native.SnowflakeStageFileDataSet(stage: str, filepath: str, dataset: str | dict, filepath_arg: str = 'filepath', database: str | None = None, schema: str | None = None, credentials: Dict[str, Any] | None = None)
Dataset that integrates with most of the standard Kedro file-based datasets. It stores and loads data in a Snowflake stage through any underlying dataset, e.g. pandas.CSVDataSet.
Args
- stage: Name of the Snowflake stage. Must start with @.
- filepath: Path to the file in the Snowflake stage.
- dataset: Configuration of the underlying dataset. It can be either a string with only the dataset name (e.g. "pandas.CSVDataSet") or a dictionary with the same structure as you would use in the Kedro catalog.yml.
- filepath_arg: Name of the argument in the underlying dataset that accepts the filepath (default is filepath).
- database: Name of the Snowflake database. If not specified, will attempt to load from the credentials.
- schema: Name of the Snowflake schema. If not specified, will attempt to load from the credentials.
- credentials: Credentials to use to load/save data from Snowflake. Can be used instead of schema/database, in the same fashion as in the kedro_datasets.snowflake.snowpark_dataset.SnowparkTableDataSet.
Example
Example of a catalog.yml entry:
    preprocessed_shuttles:
      type: kedro_snowflake.datasets.native.SnowflakeStageFileDataSet
      stage: "@KEDRO_SNOWFLAKE_TEMP_DATA_STAGE"
      filepath: data/02_intermediate/preprocessed_shuttles.csv
      credentials: snowflake
      dataset:
        type: pandas.CSVDataSet
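The argument rules above (stage must start with @, dataset is either a string or a dictionary) can be illustrated with a small standalone sketch. The helper normalize_dataset_config is hypothetical and is not part of kedro-snowflake. It also assumes that a bare string is shorthand for a dictionary with a single "type" key, which the documentation implies but does not state.

```python
from typing import Any, Dict, Union


def normalize_dataset_config(
    stage: str, dataset: Union[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Hypothetical helper (not part of kedro-snowflake).

    Checks the documented stage rule and returns the underlying dataset
    configuration as a dictionary.
    """
    # Documented rule: the stage name must start with "@".
    if not stage.startswith("@"):
        raise ValueError(f"stage must start with '@', got {stage!r}")

    # Assumption: a bare string such as "pandas.CSVDataSet" is treated as
    # shorthand for {"type": "pandas.CSVDataSet"}.
    if isinstance(dataset, str):
        return {"type": dataset}
    if isinstance(dataset, dict):
        # Copy so the caller's dictionary is left untouched.
        return dict(dataset)
    raise TypeError("dataset must be a string or a dictionary")


# The catalog entry from the example, written as a Python dictionary.
entry = {
    "type": "kedro_snowflake.datasets.native.SnowflakeStageFileDataSet",
    "stage": "@KEDRO_SNOWFLAKE_TEMP_DATA_STAGE",
    "filepath": "data/02_intermediate/preprocessed_shuttles.csv",
    "credentials": "snowflake",
    "dataset": {"type": "pandas.CSVDataSet"},
}

config = normalize_dataset_config(entry["stage"], entry["dataset"])
print(config)
```

Under this assumption, writing dataset: pandas.CSVDataSet in the catalog would be equivalent to the nested form shown in the example entry; the dictionary form is only needed when the underlying dataset takes extra arguments.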