diff --git a/cookbooks/Nomadic_Prompt_Optimization_Report.html b/cookbooks/Nomadic_Prompt_Optimization_Report.html
new file mode 100644
index 0000000..f67272a
--- /dev/null
+++ b/cookbooks/Nomadic_Prompt_Optimization_Report.html
@@ -0,0 +1,254 @@
+ The RL Prompt Optimizer employs a reinforcement learning framework to iteratively improve prompts used for language model evaluations.
+ At each episode, the agent selects an action to modify the current prompt based on the state representation, which encodes features of the prompt.
+ The agent receives rewards based on a multi-metric evaluation of the model's responses, encouraging the development of prompts that elicit high-quality answers.
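+ A minimal sketch of one such episode, assuming tabular Q-learning with epsilon-greedy exploration; the action set and the encode_state / mutate_prompt / evaluate helpers are illustrative placeholders, not taken from the report:
+
+ import random
+ from collections import defaultdict
+
+ ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2      # learning rate, discount, exploration rate
+ ACTIONS = ["add_instruction", "add_example", "rephrase", "shorten"]
+ q_table = defaultdict(float)               # (state, action) -> Q-value
+
+ def encode_state(prompt: str) -> tuple:
+     # Placeholder featurization: bucketed length and line count.
+     return (len(prompt) // 50, prompt.count("\n"))
+
+ def mutate_prompt(prompt: str, action: str) -> str:
+     # Placeholder edit; a real optimizer would apply a template per action.
+     return prompt + f"\n[{action}]"
+
+ def evaluate(prompt: str) -> float:
+     # Placeholder reward; the report uses a weighted multi-metric score.
+     return random.random()
+
+ def run_episode(prompt: str) -> tuple[str, float]:
+     state = encode_state(prompt)
+     # Epsilon-greedy selection over the prompt-edit actions.
+     if random.random() < EPSILON:
+         action = random.choice(ACTIONS)
+     else:
+         action = max(ACTIONS, key=lambda a: q_table[(state, a)])
+     new_prompt = mutate_prompt(prompt, action)
+     reward = evaluate(new_prompt)
+     next_state = encode_state(new_prompt)
+     # Standard Q-learning update toward the bootstrapped target.
+     best_next = max(q_table[(next_state, a)] for a in ACTIONS)
+     q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next
+                                          - q_table[(state, action)])
+     return new_prompt, reward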
+ The reward is computed from four evaluation metrics with the following weights:
+ weights = {
+ "faithfulness": 0.4, # Context adherence
+ "correctness": 0.3, # Response accuracy
+ "relevance": 0.2, # Query relevance
+ "clarity": 0.1 # Comprehensibility
+ }
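+ Assuming the scalar reward is the weighted sum of these per-metric scores (a natural reading of the weights above, though the exact aggregation is not shown here), a sketch:
+
+ def combined_reward(scores: dict[str, float]) -> float:
+     # Weighted sum of per-metric scores, each assumed normalized to [0, 1].
+     return sum(weights[m] * scores[m] for m in weights)
+
+ # Example: combined_reward({"faithfulness": 0.8, "correctness": 0.7,
+ #                           "relevance": 0.6, "clarity": 0.9}) -> 0.74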
+ | Metric                   | Value         |
+ |--------------------------|---------------|
+ | Best Score Achieved      | 0.733         |
+ | Average Convergence Time | 50.0 episodes |
+ | Mean Q-Value             | 0.175         |
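+ These statistics can presumably be derived from the training history and the Q-table; a sketch under the assumption that history records per-episode rewards and that convergence is measured as the first episode reaching the best score:
+
+ def summarize(history: list[dict], q_table: dict) -> dict:
+     # history entries are assumed to look like {"episode": i, "reward": r}.
+     best = max(h["reward"] for h in history)
+     convergence = next(h["episode"] for h in history if h["reward"] >= best)
+     mean_q = sum(q_table.values()) / len(q_table) if q_table else 0.0
+     return {"best_score": best, "convergence_episode": convergence,
+             "mean_q": mean_q}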
+ The following interactive visualizations illustrate various aspects of the RL prompt optimization process:
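+ A reward-per-episode curve is a typical example; a minimal Plotly sketch of how such a chart might be generated (the report's actual figures are embedded in the HTML, so this only shows the general shape of the code):
+
+ import plotly.graph_objects as go
+
+ def reward_curve(history: list[dict]) -> go.Figure:
+     # Plot the per-episode weighted reward over the training run.
+     episodes = [h["episode"] for h in history]
+     rewards = [h["reward"] for h in history]
+     fig = go.Figure(go.Scatter(x=episodes, y=rewards, mode="lines",
+                                name="reward"))
+     fig.update_layout(title="Reward per Episode", xaxis_title="Episode",
+                       yaxis_title="Weighted reward")
+     return fig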
+