179 changes: 171 additions & 8 deletions ressources/demo/PEPit_demo_extracting_a_proof.ipynb
@@ -47,7 +47,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 1,
"id": "6ae4cd3b-b120-47b0-a37c-7d5a124bc0a6",
"metadata": {
"tags": []
@@ -64,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 2,
"id": "89818bda-47bd-4929-831d-e05706b2a27c",
"metadata": {
"tags": []
@@ -93,7 +93,7 @@
},
{
"cell_type": "code",
"execution_count": 55,
"execution_count": 3,
"id": "26925393-8cef-46d8-b1a9-291aade9e3dd",
"metadata": {
"tags": []
@@ -910,7 +910,7 @@
},
{
"cell_type": "code",
"execution_count": 120,
"execution_count": 4,
"id": "8c4fa25f-ee3e-4ad9-ad7c-9753296287a5",
"metadata": {
"tags": []
@@ -1587,18 +1587,181 @@
"## 6. Using symbolic regression (PySR) <a class=\"anchor\" id=\"example6\"></a>"
]
},
{
"cell_type": "markdown",
"id": "c90bea86",
"metadata": {},
"source": [
"Up to this point, two complementary approaches have been used to obtain the Lagrange multipliers needed to reconstruct a proof:\n",
"\n",
"1. **By inspection** — guess the functional form from numerical data \n",
"2. **Symbolic exploitation of the LMI** — exploit the LMI to solve for the multipliers symbolically \n",
"\n",
"This section focuses on automating the “inspection” step in (1).\n",
"\n",
"The task can be framed as **symbolic regression**: learning a function from measurements without restricting it to a prescribed functional form.\n",
"\n",
"Symbolic regression is **NP-hard** in general, but effective heuristics are often available in practice. This demo uses [PySR](https://github.com/MilesCranmer/PySR). \n",
"Install it first (note: PySR’s backend is implemented in the Julia programming language)."
]
},
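For reference, here is a minimal install sketch, assuming a recent PySR release in which the Julia backend is set up automatically (via juliacall) the first time the package is imported; the exact behavior may differ between versions:

```python
# Install PySR from PyPI (from a shell):
#   pip install -U pysr
# Recent versions download and configure the Julia backend automatically
# (via juliacall) the first time the package is imported.

import pysr
print(pysr.__version__)  # sanity check that the installation is usable
```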
{
"cell_type": "markdown",
"id": "3ca45317",
"metadata": {},
"source": [
"First, imagine that you do not know the closed form of $\\lambda_1(\\gamma)$ given at the end of Section 3. The next cell automatically learns the closed form of $\\lambda_1(\\gamma)$ using measurements from PEPit:"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "1300c9e0-4d69-4183-a57e-00c5a1ccaa0b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython\n"
]
}
],
"source": [
"import numpy as np\n",
"from pysr import PySRRegressor\n",
"\n",
"# seed for reproducibility\n",
"np.random.seed(42)\n",
"\n",
"X = []\n",
"y = []\n",
"mu = 0.1\n",
"L = 1\n",
"gamma_list = np.linspace(0.01, 1.99, 20)\n",
"\n",
"for gamma in gamma_list:\n",
" pepit_tau, list_of_constraints = wc_gradient_descent_function_values_sparse_proof(mu, L, gamma, verbose=0)\n",
" l1 = list_of_constraints[2]._dual_variable_value\n",
" \n",
" X.append([np.sqrt(pepit_tau)])\n",
" y.append(l1)\n",
"\n",
"model = PySRRegressor(\n",
" maxsize=10,\n",
" niterations=10,\n",
" binary_operators=[\"+\", \"-\", \"*\"],\n",
" verbosity=0,\n",
" progress=False,\n",
" deterministic=True, # Just to maintain consistency in outputs\n",
" parallelism='serial',\n",
")\n",
" \n",
"result = model.fit(np.array(X), np.array(y), variable_names=[\"rho\"])"
]
},
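Beyond the single best expression (printed in the next cell), a fitted `PySRRegressor` also keeps the whole Pareto front of candidate equations, trading off complexity against loss. A small inspection sketch, assuming a recent PySR version (exact column names may vary slightly between releases):

```python
# The fitted regressor exposes its Pareto front as a pandas DataFrame:
# one row per candidate equation, with its complexity and fitting loss.
print(model.equations_[["complexity", "loss", "equation"]])
```

`model.get_best()`, used below, selects one row from this table (by default, the one with the best complexity/loss trade-off according to PySR's internal score).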
{
"cell_type": "code",
"execution_count": 6,
"id": "51806a09",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1.0 - rho) * rho\n"
]
}
],
"source": [
"print(model.get_best().equation)"
]
},
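As a quick sanity check, the learned expression can be evaluated on the data it was fitted to. A sketch, reusing the `X` and `y` arrays built in the fitting cell above (each entry of `X` is rho = sqrt(pepit_tau), and `y` holds the corresponding multipliers):

```python
import numpy as np

# Evaluate the learned expression (1 - rho) * rho on the measured data
# and compare it with the multipliers returned by PEPit.
rho = np.array(X).ravel()
lam_learned = (1.0 - rho) * rho
print(np.max(np.abs(lam_learned - np.array(y))))  # expected to be close to zero
```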
{
"cell_type": "markdown",
"id": "293091d7",
"metadata": {},
"source": [
"This recovers the expression $\\lambda_1(\\gamma)= (1-\\rho(\\gamma))\\rho(\\gamma)$. This was a simple, univariate function, which is where symbolic regression works best.\n",
"\n",
"The next cell uses the flexibility of the PySR API in order to learn the convergence rate of gradient descent."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "59855db5",
"metadata": {},
"outputs": [],
"source": []
"source": [
"import itertools\n",
"np.random.seed(42)\n",
"\n",
"X = []\n",
"y = []\n",
"\n",
"# Use different values for L and mu to generate dataset\n",
"L_values = [1, 2, 10]\n",
"mu_values = [0.1, 0.9]\n",
"\n",
"for L_val in L_values:\n",
" for mu_val in mu_values:\n",
" \n",
" # Generate uniform points between 0 and 2/L (at which point we do not have convergence)\n",
" limit = 2.0 / L_val\n",
" gammas = np.linspace(0.01, limit - 0.01, 5)\n",
" \n",
" for g_val in gammas:\n",
" pepit_tau, _ = wc_gradient_descent_function_values_sparse_proof(mu_val, L_val, g_val, verbose=0)\n",
" X.append([mu_val, L_val, g_val])\n",
" y.append(pepit_tau)\n",
"\n",
"# Increased maxsize to allow for the expression complexity\n",
"model = PySRRegressor(\n",
" niterations=200,\n",
" binary_operators=[\"+\", \"-\", \"*\", \"max\"], # PySR supports many operators, such as max, min, abs, etc\n",
" unary_operators=[\"square\"],\n",
" maxsize=15,\n",
" verbosity=0,\n",
" progress=False,\n",
" deterministic=True,\n",
" parallelism='serial',\n",
")\n",
"\n",
"result = model.fit(np.array(X), np.array(y), variable_names=[\"mu\", \"L\", \"g\"])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d8f49ee1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"square(max(1.0000066 - (g * mu), (g * L) + -0.99999446))\n"
]
}
],
"source": [
"print(model.get_best().equation)"
]
},
{
"cell_type": "markdown",
"id": "7e518abc",
"metadata": {},
"source": [
"PySR was able to effectively learn this convergence rate, which is a function of 3 different variables. This did, however, require guiding it to use the max and square operators. In other problems, a larger number of operators should be used, which does decrease the speed of convergence of the heuristic. This is why this type of approach works best for problems with relatively simple closed forms. When it works, it can save a lot of time."
]
}
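Up to constants that are numerically close to 1, the learned expression appears to correspond to max(1 - gamma*mu, gamma*L - 1)^2. A quick numerical check, as a sketch reusing the `X` and `y` arrays built in the fitting cell above:

```python
import numpy as np

# Compare the learned expression, with its constants rounded to 1,
# against the worst-case rates measured with PEPit.
X_arr, y_arr = np.array(X), np.array(y)
mu_v, L_v, g_v = X_arr[:, 0], X_arr[:, 1], X_arr[:, 2]
rate = np.maximum(1.0 - g_v * mu_v, g_v * L_v - 1.0) ** 2
print(np.max(np.abs(rate - y_arr)))  # expected to be small
```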
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "pepit",
"language": "python",
"name": "python3"
},
@@ -1612,7 +1775,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.14.2"
}
},
"nbformat": 4,