
Commit

corrections to intro and structured output
souzatharsis committed Dec 29, 2024
1 parent 617c41f commit 35c3aa8
Showing 44 changed files with 1,528 additions and 1,486 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file modified tamingllms/_build/.doctrees/markdown/intro.doctree
Binary file modified tamingllms/_build/.doctrees/markdown/preface.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/cost.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/evals.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/input.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/local.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/safety.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/structured_output.doctree
2 changes: 1 addition & 1 deletion tamingllms/_build/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: aab59d8e9e025d27e423c71d7418e3df
config: b72ab834db8f31605e361f26c8f2a57b
tags: 645f666f9bcd5a90fca523b33c5a78b7
35 changes: 14 additions & 21 deletions tamingllms/_build/html/_sources/markdown/intro.md
@@ -22,24 +22,24 @@ I am always doing that which I cannot do, in order that I may learn how to do it

## Core Challenges We'll Address

In recent years, Large Language Models (LLMs) have emerged as a transformative force in technology, promising to revolutionize how we build products and interact with computers. From ChatGPT to GitHub Copilot and Claude Artifacts these systems have captured the public imagination and sparked a gold rush of AI-powered applications. However, beneath the surface of this technological revolution lies a complex landscape of challenges that practitioners must navigate.
In recent years, Large Language Models (LLMs) have emerged as a transformative force in technology, promising to revolutionize how we build products and interact with computers. From ChatGPT and Llama to GitHub Copilot and Claude Artifacts, these systems have captured the public imagination and sparked a gold rush of AI-powered applications. However, beneath the surface of this technological revolution lies a complex landscape of challenges that software developers and tech leaders must navigate.

This book focuses on bringing awareness to key LLM limitations and harnessing open source solutions to overcome them for building robust AI-powered products. It offers a critical perspective on implementation challenges, backed by practical and reproducible Python examples. While many resources cover the capabilities of LLMs, this book specifically addresses the hidden complexities and pitfalls that engineers and technical product managers face when building LLM-powered applications while offering a comprehensive guide on how to leverage battle-tested open source tools and solutions.
This book focuses on bringing awareness to key LLM limitations and harnessing open source solutions to overcome them for building robust AI-powered products. It offers a critical perspective on implementation challenges, backed by practical and reproducible Python examples. While many resources cover the capabilities of LLMs, this book specifically addresses the hidden complexities and pitfalls that engineers and technical leaders face when building LLM-powered applications, while offering a comprehensive guide on how to leverage battle-tested open source tools and solutions.


Throughout this book, we'll tackle the following (non-exhaustive) list of critical challenges:

1. **Structural (un)Reliability**: LLMs struggle to maintain consistent output formats, complicating their integration into larger systems and making error handling more complex (a minimal validation sketch follows this list).

2. **Format, Size and Length Constraints**: LLMs are sensitive to input data format and size requiring careful management strategies to handle long-form unstructured content effectively.
2. **Input Data Management**: LLMs are sensitive to input data format, operate on stale data, and struggle with long contexts, requiring careful input data management and retrieval strategies.

3. **Testing Complexity**: Traditional software testing methodologies break down when dealing with non-deterministic and generative systems, requiring new approaches.

4. **Safety and Alignment**: LLMs can generate harmful, biased, or inappropriate content, requiring robust safeguards and monitoring systems to ensure safe deployment.

5. **Cost Optimization**: The computational and financial costs of operating LLM-based systems can quickly become prohibitive without careful management, and optimization.
5. **Vendor Lock-in**: Cloud-based LLM providers can create significant dependencies and lock-in through their proprietary APIs and infrastructure, making it difficult to switch providers or self-host solutions.

6. **Vendor Lock-in**: Cloud-based LLM providers can create significant dependencies and lock-in through their proprietary APIs and infrastructure, making it difficult to switch providers or self-host solutions.
6. **Cost Optimization**: The computational and financial costs of operating LLM-based systems can quickly become prohibitive without careful management and optimization.
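
To make the first challenge concrete, here is a minimal sketch of one common mitigation: validate the model's raw output against an explicit schema and retry on failure. It is illustrative only, not code from the book; `call_llm` is a hypothetical placeholder for whichever LLM client you use, and the sketch assumes `pydantic` (v2) is installed.

```python
from pydantic import BaseModel, ValidationError


class Invoice(BaseModel):
    customer: str
    total_usd: float


def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for an LLM API call; returns the raw model text."""
    raise NotImplementedError


def extract_invoice(prompt: str, max_retries: int = 3) -> Invoice:
    # LLMs do not reliably emit schema-compliant JSON, so validate instead of trusting the output.
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            return Invoice.model_validate_json(raw)
        except ValidationError:
            # Feed the constraint back and retry rather than letting bad output fail downstream.
            prompt += "\nReturn ONLY valid JSON with keys: customer (string), total_usd (number)."
    raise RuntimeError(f"No schema-compliant output after {max_retries} attempts")
```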


## A Practical Approach
Expand All @@ -48,9 +48,8 @@ This book takes a hands-on approach to these challenges, with a focus on accessi
All examples and code are:

- Fully reproducible and documented, allowing readers to replicate results exactly
- Designed to run on consumer-grade hardware without requiring expensive hardware
- Designed to run on consumer-grade hardware without requiring expensive resources
- Available as open source Python notebooks that can be modified and extended
- Built using free and open source tools accessible to everyone
- Structured to minimize computational costs while maintaining effectiveness

## An Open Source Approach
Expand All @@ -71,23 +70,21 @@ In keeping with these open source principles, this book itself is open source an
- Share their own experiences and solutions with the community
- Propose new chapters or sections that address emerging challenges

The repository can be found at https://github.com/souzatharsis/tamingllms. Whether you've found a typo, have a better solution to share, or want to contribute an entirely new section, your contributions are welcome.
The repository can be found at https://github.com/souzatharsis/tamingllms. Whether you've found a typo, have a better solution to share, or want to contribute new material, your contributions are welcome. Please feel free to open an issue in the book repository.


## A Note on Perspective

While this book takes a critical look at LLM limitations, our goal is not to discourage their use but to enable more robust and reliable implementations. By understanding these challenges upfront, you'll be better equipped to build systems that leverage LLMs effectively while avoiding common pitfalls.

The current discourse around LLMs tends toward extremes—either uncritical enthusiasm or wholesale dismissal. This book takes a different approach:
The current discourse around LLMs tends toward extremes - either uncritical enthusiasm or wholesale dismissal. This book takes a different approach:

- **Practical Implementation Focus**: Rather than theoretical capabilities, we examine practical challenges and their solutions.
- **Code-First Learning**: Every concept is illustrated with executable Python examples, enabling immediate practical application.
- **Critical Analysis**: We provide a balanced examination of both capabilities and limitations, helping readers make informed decisions about LLM integration.

## Who This Book Is For

This book is intended for Software Developers taking their first steps with Large Language Models. It provides critical insights into the practical challenges of LLM implementation, along with guidance on leveraging open source tools and frameworks to avoid common pitfalls that could derail projects. The goal is to help developers understand and address these challenges early, before they become costly problems too late in the software development lifecycle.

This book is designed for:

- Software/AI Engineers building LLM-powered applications
Expand All @@ -96,7 +93,6 @@ This book is designed for:
- Open Source advocates and/or developers building LLM Applications
- Anyone seeking to understand the practical challenges of working with LLMs


Typical job roles:

- Software/AI Engineers building AI-powered platforms
Expand All @@ -112,8 +108,9 @@ Reader motivation:
- Requirement to optimize costs and performance
- Need to ensure safety and reliability in LLM-powered systems

## Outcomes
The goal is to help readers understand and address these challenges early, before they become costly problems later in the software development lifecycle.

## Outcomes

After reading this book, the reader will understand critical LLM limitations and their implications, and will have practical experience with recommended open source tools and frameworks to help navigate common LLM pitfalls. The reader will be able to:

Expand All @@ -130,10 +127,8 @@ To make the most of this book, you should have:

- Basic Python programming experience
- Basic knowledge of LLMs and their capabilities
- Introductory experience with LangChain (e.g. Chat Models and Prompt Templates)
- Access to and basic knowledge of LLM APIs (OpenAI, Anthropic, or similar)
- A desire to build reliable, production-grade LLM-powered products

- Access to and basic knowledge of LLM APIs (Mistral, OpenAI, Anthropic, or similar)
- A desire to build reliable LLM-based applications

## Setting Up Your Environment

Expand All @@ -152,7 +147,6 @@ cd tamingllms/notebooks
python -m venv taming-llms-env
source taming-llms-env/bin/activate # On Windows, use: taming-llms-env\Scripts\activate
```

We will try to make each Chapter as self-contained as possible, including all necessary installs as we go through the examples.
Feel free to use your preferred package manager (e.g. `pip`) to install the dependencies; we used `poetry` to manage dependencies and virtual environments.
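For example, either of the following would work; this is a sketch that assumes the repository ships a `pyproject.toml` (for `poetry`) and a `requirements.txt` (for `pip`) at its root:

```bash
# Using poetry: resolve and install the project dependencies
poetry install

# Or using pip, assuming a requirements.txt is provided
pip install -r requirements.txt
```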

Expand All @@ -177,8 +171,7 @@ Now that your environment is set up, let's begin our exploration of LLM challeng

## About the Author

Dr. Tharsis Souza is a computer scientist and product leader specializing in AI-based products. He is a Lecturer at Columbia University's Master of Science program in Applied Analytics, (*incoming*) Head of Product, Equities at Citadel, and former Senior VP at Two Sigma Investments. He also enjoys mentoring under-represented students & working professionals to help create a more diverse global AI ecosystem.
Tharsis Souza (Ph.D. in Computer Science, UCL, University of London) is a computer scientist and product leader specializing in AI-based products. He is a Lecturer at Columbia University's Master of Science program in Applied Analytics, (*incoming*) Head of Product, Equities at Citadel, and former Senior VP at Two Sigma Investments. He mentors under-represented students & working professionals to help create a more diverse global AI ecosystem.

With over 15 years of experience delivering technology products across startups and Fortune 500 companies, Dr. Souza is also an author of numerous scholarly publications and a frequent speaker at academic and business conferences. Grounded on academic background and drawing from practical experience building and scaling up products powered by language models at early-stage startups, major institutions as well as advising non-profit organizations, and contributing to open source projects, he brings a unique perspective on bridging the gap between LLMs promised potential and their practical implementation challenges to enable the next generation of AI-powered products.
With over 15 years of experience delivering technology products across startups and Fortune 500 companies, he is also an author of numerous scholarly publications and a frequent speaker at academic and business conferences. Grounded in an academic background and drawing on practical experience building and scaling products powered by language models at early-stage startups and major institutions, as well as contributing to open source projects, he brings a unique perspective on bridging the gap between LLMs' promised potential and their practical implementation challenges to enable the next generation of AI-powered products.

Dr. Tharsis holds a Ph.D. in Computer Science from UCL, University of London following an M.Phil. and M.Sc. in Computer Science and a B.Sc. in Computer Engineering.
2 changes: 1 addition & 1 deletion tamingllms/_build/html/_sources/notebooks/alignment.ipynb
@@ -2582,7 +2582,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]\n",
"\n",
"[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/\n",
4 changes: 2 additions & 2 deletions tamingllms/_build/html/_sources/notebooks/cost.ipynb
@@ -6,6 +6,7 @@
"source": [
"(cost)=\n",
"# The Falling Cost Paradox\n",
"\n",
"```{epigraph}\n",
"It is a confusion of ideas to suppose that the economical use of fuel is equivalent to diminished consumption. <br>\n",
"The very contrary is the truth. \n",
Expand All @@ -14,10 +15,9 @@
"```\n",
"```{contents}\n",
"```\n",
"\n",
"```{note}\n",
"This Chapter is Work-in-Progress.\n",
"```"
"```\n"
]
},
{