Kedro Snowflake data classes

kedro-snowflake natively supports Kedro’s official SnowparkTableDataSet and adds a few new classes to make it easier to use Snowflake with Kedro.

Both of these can be found under the kedro_snowflake.datasets.native module.

For details on usage, see the API Reference below.

API Reference

class kedro_snowflake.datasets.native.SnowflakeStageFileDataSet(stage: str, filepath: str, dataset: str | dict, filepath_arg: str = 'filepath', database: str | None = None, schema: str | None = None, credentials: Dict[str, Any] | None = None)

Dataset that integrates with most of the standard Kedro file-based datasets. It lets you save data to and load data from a Snowflake stage through any underlying dataset, e.g. pandas.CSVDataSet.

Args

- stage: Name of the Snowflake stage. Must start with @.
- filepath: Path to the file in the Snowflake stage.
- dataset: Configuration of the underlying dataset. It can be either a string with only the dataset name (e.g. “pandas.CSVDataSet”) or a dictionary with the same structure as you would use in the Kedro catalog.yml.
- filepath_arg: Name of the argument in the underlying dataset that accepts the filepath (default is filepath).
- database: Name of the Snowflake database. If not specified, will attempt to load from the credentials.
- schema: Name of the Snowflake schema. If not specified, will attempt to load from the credentials.
- credentials: Credentials to use to load/save data from Snowflake. Can be used instead of schema/database, in the same fashion as in kedro_datasets.snowflake.snowpark_dataset.SnowparkTableDataSet.
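The two accepted forms of the dataset argument can be sketched as catalog entries. This is an illustration rather than an excerpt from the project: the entry names, stage name, and file paths are hypothetical, and the load_args/save_args values are ordinary pandas.CSVDataSet options (passed on to pandas.read_csv and DataFrame.to_csv) chosen only as an example.

```yaml
# String form: only the name of the underlying dataset (hypothetical entry).
raw_companies:
  type: kedro_snowflake.datasets.native.SnowflakeStageFileDataSet
  stage: "@MY_STAGE"                     # hypothetical stage; must start with @
  filepath: data/01_raw/companies.csv    # hypothetical path inside the stage
  credentials: snowflake
  dataset: pandas.CSVDataSet

# Dictionary form: same structure as a regular catalog.yml entry (hypothetical entry).
raw_reviews:
  type: kedro_snowflake.datasets.native.SnowflakeStageFileDataSet
  stage: "@MY_STAGE"
  filepath: data/01_raw/reviews.csv
  credentials: snowflake
  dataset:
    type: pandas.CSVDataSet
    load_args:
      sep: ";"        # forwarded to pandas.read_csv
    save_args:
      index: false    # forwarded to DataFrame.to_csv
```

The dictionary form is the one to reach for when the underlying dataset needs its own options; the string form is enough when the defaults are acceptable.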

Example

Example of a catalog.yml entry:

preprocessed_shuttles:
  type: kedro_snowflake.datasets.native.SnowflakeStageFileDataSet
  stage: "@KEDRO_SNOWFLAKE_TEMP_DATA_STAGE"
  filepath: data/02_intermediate/preprocessed_shuttles.csv
  credentials: snowflake
  dataset:
    type: pandas.CSVDataSet
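
The entry above refers to a credentials key named snowflake. A minimal sketch of a matching credentials.yml entry is shown below, assuming the same connection parameters that SnowparkTableDataSet documents for its credentials. All values are placeholders, and the exact set of supported keys should be checked against the installed versions of kedro-datasets and kedro-snowflake.

```yaml
# conf/local/credentials.yml (placeholder values; key names are an assumption
# based on the SnowparkTableDataSet credentials format)
snowflake:
  account: "<account_identifier>"
  user: "<user_name>"
  password: "<password>"
  warehouse: "<warehouse_name>"
  database: "<database_name>"   # used when database is not set on the dataset
  schema: "<schema_name>"       # used when schema is not set on the dataset
```

Keeping database and schema here means the catalog entry does not need to repeat them, which matches the fallback described for the database and schema arguments.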