update math to math dollar
souzatharsis committed Dec 30, 2024
1 parent f596e7a commit 55183cb
Showing 8 changed files with 11 additions and 9 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file not shown.
4 changes: 2 additions & 2 deletions tamingllms/_build/html/_sources/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
4 changes: 3 additions & 1 deletion tamingllms/_build/html/notebooks/alignment.html
@@ -494,7 +494,9 @@ <h4><a class="toc-backref" href="#id255" role="doc-backlink"><span class="sectio
</ol>
<p>At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:</p>
<div class="math notranslate nohighlight">
-\[\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]\]</div>
+\[
+\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]
+\]</div>
<p>where,</p>
<ul class="simple">
<li><p><span class="math notranslate nohighlight">\(\pi_\theta\)</span> represents the language model,</p></li>
2 changes: 1 addition & 1 deletion tamingllms/_build/jupyter_execute/markdown/intro.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "39750f33",
"id": "50f8c7d6",
"metadata": {},
"source": [
"(intro)=\n",
4 changes: 2 additions & 2 deletions tamingllms/_build/jupyter_execute/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
2 changes: 1 addition & 1 deletion tamingllms/_config.yml
@@ -37,7 +37,7 @@ sphinx:
config:
html_context:
default_mode: light
-#mathjax_path: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
+mathjax_path: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
bibtex_reference_style: author_year
html_theme: 'press' #insipid
#html_logo: '_static/logo_w.png'
4 changes: 2 additions & 2 deletions tamingllms/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
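Aside: the notebook cells touched above state the DPO objective only in math. As a purely illustrative sketch (not part of this commit), the same loss could be written in PyTorch as below, assuming each argument is a batch of summed log-probabilities for the preferred completion y_w and rejected completion y_l under the policy (pi_theta) and reference (pi_ref) models; the function name, argument names, and the default beta are hypothetical.

```python
# Illustrative sketch of the DPO loss from the equation shown in the diff above.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w: torch.Tensor,
             policy_logp_l: torch.Tensor,
             ref_logp_w: torch.Tensor,
             ref_logp_l: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # log pi_theta(y_w|x) / pi_ref(y_w|x): the "preferred" term
    preferred = policy_logp_w - ref_logp_w
    # log pi_theta(y_l|x) / pi_ref(y_l|x): the "rejected" term
    rejected = policy_logp_l - ref_logp_l
    # -log sigma(beta * (preferred - rejected)), averaged over the batch
    return -F.logsigmoid(beta * (preferred - rejected)).mean()
```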
