{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c0861835-b42d-404b-9dc2-cd161cd3a87a",
   "metadata": {},
   "source": [
    "# Calibrator Plugins\n",
    "\n",
    "The following tutorial demonstrates how one may distribute calibrators as plugins via `calisim's` extensible plugin system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "e007315e-eac3-4b2b-9e37-c39b2fc47213",
   "metadata": {},
   "outputs": [],
   "source": [
    "from calisim.data_model import (\n",
    "\tDistributionModel,\n",
    "\tParameterDataType,\n",
    "\tParameterSpecification,\n",
    ")\n",
    "from calisim.example_models import LotkaVolterraModel\n",
    "from calisim.optimisation import OptimisationMethod, OptimisationMethodModel\n",
    "from calisim.optimisation.optuna_wrapper import OptunaOptimisation as OptunaOptimisationBase\n",
    "from calisim.statistics import MeanSquaredError\n",
    "import optuna.samplers as opt_samplers\n",
    "from importlib.metadata import entry_points\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44ee89cc-71c2-4111-8520-efa30af68ff6",
   "metadata": {},
   "source": [
    "## Registering Plugins\n",
    "\n",
    "First, let us suppose that you have extended the [Optuna](https://optuna.readthedocs.io/en/stable/) wrapper within `calisim` and created the `ExtendedOptunaOptimisation` class. For instance, let's add support for the `NSGAIIISampler` sampling algorithm within the `specify()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "27e4cead-6594-42aa-b0d4-fba743f1ca9a",
   "metadata": {},
   "outputs": [],
   "source": [
    "class ExtendedOptunaOptimisation(OptunaOptimisationBase):\n",
    "\n",
    "    def specify(self) -> None:\n",
    "        \"\"\"Specify the parameters of the model calibration procedure.\"\"\"\n",
    "        sampler_name = self.specification.method\n",
    "        supported_samplers = dict(\n",
    "            tpes=opt_samplers.TPESampler,\n",
    "            cmaes=opt_samplers.CmaEsSampler,\n",
    "            nsgaii=opt_samplers.NSGAIISampler,\n",
    "            nsgaiii=opt_samplers.NSGAIIISampler, # Adding support for NSGAIIISampler.\n",
    "            qmc=opt_samplers.QMCSampler,\n",
    "            gp=opt_samplers.GPSampler,\n",
    "        )\n",
    "        sampler_class = supported_samplers.get(sampler_name, None)\n",
    "        if sampler_class is None:\n",
    "            raise ValueError(\n",
    "                f\"Unsupported Optuna sampler: {sampler_name}.\",\n",
    "                f\"Supported Optuna samplers are {', '.join(supported_samplers)}\",\n",
    "            )\n",
    "        sampler_kwargs = self.specification.method_kwargs\n",
    "        if sampler_kwargs is None:\n",
    "            sampler_kwargs = {}\n",
    "        self.sampler = sampler_class(**sampler_kwargs)\n",
    "        \n",
    "        self.study = optuna.create_study(\n",
    "            sampler=self.sampler,\n",
    "            study_name=self.specification.experiment_name,\n",
    "            directions=self.specification.directions,\n",
    "            storage=self.specification.storage,\n",
    "            load_if_exists=True,\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d8b0f64-32e5-49e0-93f6-2c043c751125",
   "metadata": {},
   "source": [
    "We can register `ExtendedOptunaOptimisation` as a new engine called `optuna_extended` under the `OptimisationMethod` calibration class. To do this, we must include `ExtendedOptunaOptimisation` as a plugin by modifying your `pyproject.toml` file.\n",
    "\n",
    "```pyproject.toml\n",
    "[project.entry-points.\"calisim.external.optimisation\"]\n",
    "optuna_extended = \"calisim.optimisation.optuna_wrapper:ExtendedOptunaOptimisation\"\n",
    "```\n",
    "\n",
    "If you are using Poetry, you may need to add the following within your `pyproject.toml` instead:\n",
    "\n",
    "```pyproject.toml\n",
    "[tool.poetry.plugins.\"calisim.external.optimisation\"]\n",
    "optuna_extended = \"calisim.optimisation.optuna_wrapper:ExtendedOptunaOptimisation\"\n",
    "```\n",
    "\n",
    "This assumes that the module path for `ExtendedOptunaOptimisation` is `calisim.optimisation.optuna_wrapper`. Naturally, the module path will vary depending on where your calibrator plugin is located. Let's check that `ExtendedOptunaOptimisation` was successfully added as a plugin."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "6499892a-0cee-49ff-9519-fe7a922210e3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>module</th>\n",
       "      <th>target</th>\n",
       "      <th>group</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>optuna_extended</td>\n",
       "      <td>calisim.optimisation.optuna_wrapper</td>\n",
       "      <td>ExtendedOptunaOptimisation</td>\n",
       "      <td>calisim.external.optimisation</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              name                               module  \\\n",
       "0  optuna_extended  calisim.optimisation.optuna_wrapper   \n",
       "\n",
       "                       target                          group  \n",
       "0  ExtendedOptunaOptimisation  calisim.external.optimisation  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame([ \n",
    "    { \n",
    "        \"name\": entrypoint.name, \n",
    "        \"module\": entrypoint.value.split(\":\")[0],\n",
    "        \"target\": entrypoint.value.split(\":\")[1],  \n",
    "        \"group\": entrypoint.group \n",
    "    }\n",
    "    for entrypoint in entry_points().select(group=\"calisim.external.optimisation\") \n",
    "])"
   ]
  },
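  {
   "cell_type": "markdown",
   "id": "plugin-load-sketch-md",
   "metadata": {},
   "source": [
    "The listing above reads entry point metadata only; nothing has been imported yet. As a minimal sketch of what resolving a plugin name might look like, the cell below uses the standard library's `EntryPoint.load()` to import the registered class. The `load_plugin` helper is hypothetical and written for illustration; it is not part of `calisim`, whose own discovery logic may differ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "plugin-load-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "from importlib.metadata import entry_points\n",
    "\n",
    "\n",
    "def load_plugin(name: str, group: str = \"calisim.external.optimisation\"):\n",
    "    \"\"\"Hypothetical helper (not part of calisim): import the object behind an entry point.\"\"\"\n",
    "    for entrypoint in entry_points().select(group=group, name=name):\n",
    "        # EntryPoint.load() imports the module and returns the named attribute.\n",
    "        return entrypoint.load()\n",
    "    return None  # No plugin is registered under this name.\n",
    "\n",
    "\n",
    "# Resolves to the plugin class if it is installed, otherwise None.\n",
    "plugin_class = load_plugin(\"optuna_extended\")\n",
    "print(plugin_class)"
   ]
  },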
  {
   "cell_type": "markdown",
   "id": "51eaf5b3-c247-44a7-902c-dfef1e4796d4",
   "metadata": {},
   "source": [
    "It looks like the `optuna_extended` plugin associated with the `ExtendedOptunaOptimisation` class was successfully added under the `calisim.external.optimisation` group. \n",
    "\n",
    "One can similarly add other plugins under different `calisim` modules. For instance, a plugin for the `sensitivity` module would be included under the `calisim.external.sensitivity` group. And so on.\n",
    "\n",
    "## Performing Calibration\n",
    "\n",
    "Let's run a calibration workflow using the `optuna_extended` engine. Let's try calibrating an example model: `LotkaVolterraModel`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "a11ed7e8-de7d-4196-a50e-a8446a875a21",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>year</th>\n",
       "      <th>lynx</th>\n",
       "      <th>hare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1900.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>30.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1901.0</td>\n",
       "      <td>6.1</td>\n",
       "      <td>47.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1902.0</td>\n",
       "      <td>9.8</td>\n",
       "      <td>70.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1903.0</td>\n",
       "      <td>35.2</td>\n",
       "      <td>77.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1904.0</td>\n",
       "      <td>59.4</td>\n",
       "      <td>36.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     year  lynx  hare\n",
       "0  1900.0   4.0  30.0\n",
       "1  1901.0   6.1  47.2\n",
       "2  1902.0   9.8  70.2\n",
       "3  1903.0  35.2  77.4\n",
       "4  1904.0  59.4  36.3"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model = LotkaVolterraModel()\n",
    "observed_data = model.get_observed_data()\n",
    "observed_data.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90f89733-2e5c-4578-bf0d-dfd69e6ae07e",
   "metadata": {},
   "source": [
    "We'll define our parameter specification containing parameter names, data types, distribution types, and bounds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "1911ff4d-e475-43f7-8306-7cb945fb25b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "parameter_spec = ParameterSpecification(\n",
    "\tparameters=[\n",
    "\t\tDistributionModel(\n",
    "\t\t\tname=\"alpha\",\n",
    "\t\t\tdistribution_name=\"uniform\",\n",
    "\t\t\tdistribution_args=[0.45, 0.55],\n",
    "\t\t\tdata_type=ParameterDataType.CONTINUOUS,\n",
    "\t\t),\n",
    "\t\tDistributionModel(\n",
    "\t\t\tname=\"beta\",\n",
    "\t\t\tdistribution_name=\"uniform\",\n",
    "\t\t\tdistribution_args=[0.02, 0.03],\n",
    "\t\t\tdata_type=ParameterDataType.CONTINUOUS,\n",
    "\t\t),\n",
    "\t]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b589395b-b2a3-42e3-af70-74f0dd8856ae",
   "metadata": {},
   "source": [
    "We'll define our objective function. We aim to minimise the discrepancy between simulated and observed data using the `MeanSquaredError` metric as our loss."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "2e307768-2eef-4dea-a997-f23adcb6e5d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective(\n",
    "\tparameters: dict, simulation_id: str, observed_data: np.ndarray | None, t: pd.Series\n",
    ") -> float | list[float]:\n",
    "\tsimulation_parameters = dict(h0=34.0, l0=5.9, t=t, gamma=0.84, delta=0.026)\n",
    "\n",
    "\tfor k in [\"alpha\", \"beta\"]:\n",
    "\t\tsimulation_parameters[k] = parameters[k]\n",
    "\n",
    "\tsimulated_data = model.simulate(simulation_parameters).lynx.values\n",
    "\tmetric = MeanSquaredError()\n",
    "\tdiscrepancy = metric.calculate(observed_data, simulated_data)\n",
    "\treturn discrepancy"
   ]
  },
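  {
   "cell_type": "markdown",
   "id": "mse-sketch-md",
   "metadata": {},
   "source": [
    "For reference, the loss used above can be sketched in plain NumPy. This assumes that `MeanSquaredError` follows the standard definition, the mean of the squared residuals between observed and simulated values; `calisim`'s own implementation is not shown in this tutorial and may differ in detail. The `mean_squared_error` function and the toy values below are illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "mse-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def mean_squared_error(observed, simulated) -> float:\n",
    "    \"\"\"Illustrative stand-in (not calisim's implementation): mean of squared residuals.\"\"\"\n",
    "    observed = np.asarray(observed, dtype=float)\n",
    "    simulated = np.asarray(simulated, dtype=float)\n",
    "    return float(np.mean((observed - simulated) ** 2))\n",
    "\n",
    "\n",
    "# Toy values for illustration only; a perfect fit would give a loss of zero.\n",
    "toy_loss = mean_squared_error([4.0, 6.1, 9.8], [5.0, 6.1, 7.8])\n",
    "print(toy_loss)"
   ]
  },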
  {
   "cell_type": "markdown",
   "id": "73780471-6ebb-4cf7-ad75-20315d0f0d72",
   "metadata": {},
   "source": [
    "We'll define the specification for the `Optuna` optimisation procedure, making use of the `NSGAIIISampler` sampling algorithm. To do this, we'll set `method` to `nsgaiii`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "497b92b6-5b82-462a-86e7-adc05ab777c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "specification = OptimisationMethodModel(\n",
    "\texperiment_name=\"optuna_extended_optimisation\",\n",
    "\tparameter_spec=parameter_spec,\n",
    "\tobserved_data=observed_data.lynx.values,\n",
    "\tmethod=\"nsgaiii\",\n",
    "\tdirections=[\"minimize\"],\n",
    "\tn_iterations=100,\n",
    "\tcalibration_func_kwargs=dict(t=observed_data.year),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7b79e37-31fe-4f70-9677-d136efdb9e8a",
   "metadata": {},
   "source": [
    "Finally, let's run the calibration workflow by instantiating an `OptimisationMethod` calibrator. Note that the `engine` will be set to `optuna_extended`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "1eaffe8d-4baa-41f0-8942-0243ca404298",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "calisim.optimisation.optuna_wrapper.ExtendedOptunaOptimisation"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "calibrator = OptimisationMethod(\n",
    "\tcalibration_func=objective, specification=specification, engine=\"optuna_extended\"\n",
    ")\n",
    "type(calibrator.implementation)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0be12b52-8971-48e5-8404-7602fc3ad5e3",
   "metadata": {},
   "source": [
    "Finally, let's check that our calibrator is using the `NSGAIIISampler` sampling algorithm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "ff13ac25-2a58-48e2-bc69-7f07cf9c2204",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "optuna.samplers._nsgaiii._sampler.NSGAIIISampler"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "calibrator.specify()\n",
    "type(calibrator.implementation.sampler)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a547813-2541-4009-83f2-a9ed6a06ab6a",
   "metadata": {},
   "source": [
    "Excellent. We can see that the `OptimisationMethod` calibration workflow is using the `optuna_extended` engine and `ExtendedOptunaOptimisation` class as defined by our newly installed plugin within the `calisim.external.optimisation` group."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.19"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}