update math to math dollar
souzatharsis committed Dec 30, 2024
1 parent f596e7a commit 55183cb
Showing 8 changed files with 11 additions and 9 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file not shown.
4 changes: 2 additions & 2 deletions tamingllms/_build/html/_sources/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
4 changes: 3 additions & 1 deletion tamingllms/_build/html/notebooks/alignment.html
@@ -494,7 +494,9 @@ <h4><a class="toc-backref" href="#id255" role="doc-backlink"><span class="sectio
</ol>
<p>At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:</p>
<div class="math notranslate nohighlight">
-\[\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]\]</div>
+\[
+\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]
+\]</div>
<p>where,</p>
<ul class="simple">
<li><p><span class="math notranslate nohighlight">\(\pi_\theta\)</span> represents the language model,</p></li>
2 changes: 1 addition & 1 deletion tamingllms/_build/jupyter_execute/markdown/intro.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "39750f33",
"id": "50f8c7d6",
"metadata": {},
"source": [
"(intro)=\n",
4 changes: 2 additions & 2 deletions tamingllms/_build/jupyter_execute/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
2 changes: 1 addition & 1 deletion tamingllms/_config.yml
@@ -37,7 +37,7 @@ sphinx:
config:
html_context:
default_mode: light
-#mathjax_path: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
+mathjax_path: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
bibtex_reference_style: author_year
html_theme: 'press' #insipid
#html_logo: '_static/logo_w.png'
4 changes: 2 additions & 2 deletions tamingllms/notebooks/alignment.ipynb
@@ -257,9 +257,9 @@
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in the following equation:\n",
"\n",
"```{math}\n",
"$$\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"$$\n",
"\n",
"where, \n",
"- $\\pi_\\theta$ represents the language model,\n",
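Aside: the notebook cells touched above state the DPO objective only in math. As a purely illustrative sketch (not part of this commit), the same loss could be written in PyTorch as below, assuming each argument is a batch of summed log-probabilities for the preferred completion y_w and rejected completion y_l under the policy (pi_theta) and reference (pi_ref) models; the function name, argument names, and the default beta are hypothetical.

```python
# Illustrative sketch of the DPO loss from the equation shown in the diff above.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w: torch.Tensor,
             policy_logp_l: torch.Tensor,
             ref_logp_w: torch.Tensor,
             ref_logp_l: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # log pi_theta(y_w|x) / pi_ref(y_w|x): the "preferred" term
    preferred = policy_logp_w - ref_logp_w
    # log pi_theta(y_l|x) / pi_ref(y_l|x): the "rejected" term
    rejected = policy_logp_l - ref_logp_l
    # -log sigma(beta * (preferred - rejected)), averaged over the batch
    return -F.logsigmoid(beta * (preferred - rejected)).mean()
```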
