{ "cells": [ { "cell_type": "markdown", "id": "c0861835-b42d-404b-9dc2-cd161cd3a87a", "metadata": {}, "source": [ "# Calibrator Plugins\n", "\n", "This tutorial demonstrates how to distribute calibrators as plugins via `calisim`'s extensible plugin system." ] }, { "cell_type": "code", "execution_count": 12, "id": "e007315e-eac3-4b2b-9e37-c39b2fc47213", "metadata": {}, "outputs": [], "source": [ "from calisim.data_model import (\n", "\tDistributionModel,\n", "\tParameterDataType,\n", "\tParameterSpecification,\n", ")\n", "from calisim.example_models import LotkaVolterraModel\n", "from calisim.optimisation import OptimisationMethod, OptimisationMethodModel\n", "from calisim.optimisation.optuna_wrapper import OptunaOptimisation as OptunaOptimisationBase\n", "from calisim.statistics import MeanSquaredError\n", "import optuna  # Needed for optuna.create_study() in the extended class below.\n", "import optuna.samplers as opt_samplers\n", "from importlib.metadata import entry_points\n", "import numpy as np\n", "import pandas as pd\n", "\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "id": "44ee89cc-71c2-4111-8520-efa30af68ff6", "metadata": {}, "source": [ "## Registering Plugins\n", "\n", "First, suppose that you have extended the [Optuna](https://optuna.readthedocs.io/en/stable/) wrapper within `calisim` by creating an `ExtendedOptunaOptimisation` class. As an example, we add support for the `NSGAIIISampler` sampling algorithm within the `specify()` method." 
] }, { "cell_type": "code", "execution_count": 13, "id": "27e4cead-6594-42aa-b0d4-fba743f1ca9a", "metadata": {}, "outputs": [], "source": [ "class ExtendedOptunaOptimisation(OptunaOptimisationBase):\n", "\n", "    def specify(self) -> None:\n", "        \"\"\"Specify the parameters of the model calibration procedure.\"\"\"\n", "        sampler_name = self.specification.method\n", "        supported_samplers = dict(\n", "            tpes=opt_samplers.TPESampler,\n", "            cmaes=opt_samplers.CmaEsSampler,\n", "            nsgaii=opt_samplers.NSGAIISampler,\n", "            nsgaiii=opt_samplers.NSGAIIISampler,  # Adding support for NSGAIIISampler.\n", "            qmc=opt_samplers.QMCSampler,\n", "            gp=opt_samplers.GPSampler,\n", "        )\n", "        sampler_class = supported_samplers.get(sampler_name, None)\n", "        if sampler_class is None:\n", "            # Adjacent f-strings are concatenated into a single error message.\n", "            raise ValueError(\n", "                f\"Unsupported Optuna sampler: {sampler_name}. \"\n", "                f\"Supported Optuna samplers are {', '.join(supported_samplers)}\"\n", "            )\n", "        sampler_kwargs = self.specification.method_kwargs\n", "        if sampler_kwargs is None:\n", "            sampler_kwargs = {}\n", "        self.sampler = sampler_class(**sampler_kwargs)\n", "\n", "        self.study = optuna.create_study(\n", "            sampler=self.sampler,\n", "            study_name=self.specification.experiment_name,\n", "            directions=self.specification.directions,\n", "            storage=self.specification.storage,\n", "            load_if_exists=True,\n", "        )" ] }, { "cell_type": "markdown", "id": "8d8b0f64-32e5-49e0-93f6-2c043c751125", "metadata": {}, "source": [ "We can register `ExtendedOptunaOptimisation` as a new engine called `optuna_extended` under the `OptimisationMethod` calibration class. 
To do this, declare `ExtendedOptunaOptimisation` as an entry point in your `pyproject.toml` file.\n", "\n", "```toml\n", "[project.entry-points.\"calisim.external.optimisation\"]\n", "optuna_extended = \"calisim.optimisation.optuna_wrapper:ExtendedOptunaOptimisation\"\n", "```\n", "\n", "If you are using Poetry, you may need to add the following to your `pyproject.toml` instead:\n", "\n", "```toml\n", "[tool.poetry.plugins.\"calisim.external.optimisation\"]\n", "optuna_extended = \"calisim.optimisation.optuna_wrapper:ExtendedOptunaOptimisation\"\n", "```\n", "\n", "This assumes that the module path for `ExtendedOptunaOptimisation` is `calisim.optimisation.optuna_wrapper`; the module path will vary depending on where your calibrator plugin is located. Entry points are read from installed package metadata, so reinstall your package after editing `pyproject.toml`. Let's check that `ExtendedOptunaOptimisation` was successfully registered as a plugin." ] }, { "cell_type": "code", "execution_count": 14, "id": "6499892a-0cee-49ff-9519-fe7a922210e3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>name</th>\n", "      <th>module</th>\n", "      <th>target</th>\n", "      <th>group</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>optuna_extended</td>\n", "      <td>calisim.optimisation.optuna_wrapper</td>\n", "      <td>ExtendedOptunaOptimisation</td>\n", "      <td>calisim.external.optimisation</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " name module \\\n", "0 optuna_extended calisim.optimisation.optuna_wrapper \n", "\n", " target group \n", "0 ExtendedOptunaOptimisation calisim.external.optimisation " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List every entry point registered under the calisim optimisation plugin group.\n", "pd.DataFrame([\n", "    {\n", "        \"name\": entrypoint.name,\n", "        \"module\": entrypoint.value.split(\":\")[0],\n", "        \"target\": entrypoint.value.split(\":\")[1],\n", "        \"group\": entrypoint.group,\n", "    }\n", "    for entrypoint in entry_points().select(group=\"calisim.external.optimisation\")\n", "])" ] }, { "cell_type": "markdown", "id": "51eaf5b3-c247-44a7-902c-dfef1e4796d4", "metadata": {}, "source": [ "The `optuna_extended` plugin, which points to the `ExtendedOptunaOptimisation` class, was successfully registered under the `calisim.external.optimisation` group.\n", "\n", "Plugins for other `calisim` modules are registered in the same way under their own groups. For instance, a plugin for the `sensitivity` module would be included under the `calisim.external.sensitivity` group.\n", "\n", "## Performing Calibration\n", "\n", "Let's set up a calibration workflow using the `optuna_extended` engine, with the `LotkaVolterraModel` example model as our target." ] }, { "cell_type": "code", "execution_count": 15, "id": "a11ed7e8-de7d-4196-a50e-a8446a875a21", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
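<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>year</th>\n", "      <th>lynx</th>\n", "      <th>hare</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>1900.0</td>\n", "      <td>4.0</td>\n", "      <td>30.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>1</th>\n", "      <td>1901.0</td>\n", "      <td>6.1</td>\n", "      <td>47.2</td>\n", "    </tr>\n", "    <tr>\n", "      <th>2</th>\n", "      <td>1902.0</td>\n", "      <td>9.8</td>\n", "      <td>70.2</td>\n", "    </tr>\n", "    <tr>\n", "      <th>3</th>\n", "      <td>1903.0</td>\n", "      <td>35.2</td>\n", "      <td>77.4</td>\n", "    </tr>\n", "    <tr>\n", "      <th>4</th>\n", "      <td>1904.0</td>\n", "      <td>59.4</td>\n", "      <td>36.3</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>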
" ], "text/plain": [ " year lynx hare\n", "0 1900.0 4.0 30.0\n", "1 1901.0 6.1 47.2\n", "2 1902.0 9.8 70.2\n", "3 1903.0 35.2 77.4\n", "4 1904.0 59.4 36.3" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = LotkaVolterraModel()\n", "observed_data = model.get_observed_data()\n", "observed_data.head(5)" ] }, { "cell_type": "markdown", "id": "90f89733-2e5c-4578-bf0d-dfd69e6ae07e", "metadata": {}, "source": [ "We'll define our parameter specification containing parameter names, data types, distribution types, and bounds." ] }, { "cell_type": "code", "execution_count": 16, "id": "1911ff4d-e475-43f7-8306-7cb945fb25b2", "metadata": {}, "outputs": [], "source": [ "parameter_spec = ParameterSpecification(\n", "\tparameters=[\n", "\t\tDistributionModel(\n", "\t\t\tname=\"alpha\",\n", "\t\t\tdistribution_name=\"uniform\",\n", "\t\t\tdistribution_args=[0.45, 0.55],\n", "\t\t\tdata_type=ParameterDataType.CONTINUOUS,\n", "\t\t),\n", "\t\tDistributionModel(\n", "\t\t\tname=\"beta\",\n", "\t\t\tdistribution_name=\"uniform\",\n", "\t\t\tdistribution_args=[0.02, 0.03],\n", "\t\t\tdata_type=ParameterDataType.CONTINUOUS,\n", "\t\t),\n", "\t]\n", ")" ] }, { "cell_type": "markdown", "id": "b589395b-b2a3-42e3-af70-74f0dd8856ae", "metadata": {}, "source": [ "We'll define our objective function. We aim to minimise the discrepancy between simulated and observed data using the `MeanSquaredError` metric as our loss." 
] }, { "cell_type": "code", "execution_count": 17, "id": "2e307768-2eef-4dea-a997-f23adcb6e5d5", "metadata": {}, "outputs": [], "source": [ "def objective(\n", "\tparameters: dict, simulation_id: str, observed_data: np.ndarray | None, t: pd.Series\n", ") -> float | list[float]:\n", "\t# Fixed simulation settings; only alpha and beta are calibrated.\n", "\tsimulation_parameters = dict(h0=34.0, l0=5.9, t=t, gamma=0.84, delta=0.026)\n", "\n", "\tfor k in [\"alpha\", \"beta\"]:\n", "\t\tsimulation_parameters[k] = parameters[k]\n", "\n", "\t# Compare simulated and observed lynx counts.\n", "\tsimulated_data = model.simulate(simulation_parameters).lynx.values\n", "\tmetric = MeanSquaredError()\n", "\tdiscrepancy = metric.calculate(observed_data, simulated_data)\n", "\treturn discrepancy" ] }, { "cell_type": "markdown", "id": "73780471-6ebb-4cf7-ad75-20315d0f0d72", "metadata": {}, "source": [ "We'll define the specification for the `Optuna` optimisation procedure, making use of the `NSGAIIISampler` sampling algorithm. To do this, we'll set `method` to `nsgaiii`." ] }, { "cell_type": "code", "execution_count": 18, "id": "497b92b6-5b82-462a-86e7-adc05ab777c1", "metadata": {}, "outputs": [], "source": [ "specification = OptimisationMethodModel(\n", "\texperiment_name=\"optuna_extended_optimisation\",\n", "\tparameter_spec=parameter_spec,\n", "\tobserved_data=observed_data.lynx.values,\n", "\tmethod=\"nsgaiii\",\n", "\tdirections=[\"minimize\"],\n", "\tn_iterations=100,\n", "\tcalibration_func_kwargs=dict(t=observed_data.year),\n", ")" ] }, { "cell_type": "markdown", "id": "e7b79e37-31fe-4f70-9677-d136efdb9e8a", "metadata": {}, "source": [ "Next, let's set up the calibration workflow by instantiating an `OptimisationMethod` calibrator with `engine` set to `optuna_extended`, and confirm that it resolves to our plugin class.
" ] }, { "cell_type": "code", "execution_count": 19, "id": "1eaffe8d-4baa-41f0-8942-0243ca404298", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "calisim.optimisation.optuna_wrapper.ExtendedOptunaOptimisation" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "calibrator = OptimisationMethod(\n", "\tcalibration_func=objective, specification=specification, engine=\"optuna_extended\"\n", ")\n", "type(calibrator.implementation)" ] }, { "cell_type": "markdown", "id": "0be12b52-8971-48e5-8404-7602fc3ad5e3", "metadata": {}, "source": [ "Finally, let's check that our calibrator is using the `NSGAIIISampler` sampling algorithm." ] }, { "cell_type": "code", "execution_count": 20, "id": "ff13ac25-2a58-48e2-bc69-7f07cf9c2204", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "optuna.samplers._nsgaiii._sampler.NSGAIIISampler" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "calibrator.specify()\n", "type(calibrator.implementation.sampler)" ] }, { "cell_type": "markdown", "id": "6a547813-2541-4009-83f2-a9ed6a06ab6a", "metadata": {}, "source": [ "Excellent. We can see that the `OptimisationMethod` calibration workflow is using the `optuna_extended` engine and `ExtendedOptunaOptimisation` class as defined by our newly installed plugin within the `calisim.external.optimisation` group." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.19" } }, "nbformat": 4, "nbformat_minor": 5 }