# Fair Forge

## Docs

- [Generators API](https://fairforge.alquimia.ai/api-reference/generators.md): API reference for Fair Forge generators
- [Metrics API](https://fairforge.alquimia.ai/api-reference/metrics.md): API reference for all Fair Forge metrics
- [API Reference Overview](https://fairforge.alquimia.ai/api-reference/overview.md): Complete API reference for Fair Forge
- [Runners API](https://fairforge.alquimia.ai/api-reference/runners.md): API reference for Fair Forge runners
- [Schemas API](https://fairforge.alquimia.ai/api-reference/schemas.md): API reference for Fair Forge data schemas
- [Architecture](https://fairforge.alquimia.ai/core-concepts/architecture.md): Understanding Fair Forge's core architecture and design patterns
- [Dataset & Batch](https://fairforge.alquimia.ai/core-concepts/dataset-batch.md): Understanding the core data structures in Fair Forge
- [Retriever](https://fairforge.alquimia.ai/core-concepts/retriever.md): Loading conversation data into Fair Forge with full-dataset and streaming modes
- [Streaming Retrievers](https://fairforge.alquimia.ai/core-concepts/retriever-streaming.md): Process large datasets without loading everything into memory using stream_sessions and stream_batches
- [Statistical Modes](https://fairforge.alquimia.ai/core-concepts/statistical-modes.md): Pluggable Frequentist and Bayesian statistical strategies across Fair Forge metrics
- [AWS Lambda](https://fairforge.alquimia.ai/examples/aws-lambda.md): Deploy Fair Forge as serverless functions
- [Jupyter Notebooks](https://fairforge.alquimia.ai/examples/jupyter-notebooks.md): Interactive examples for each Fair Forge metric
- [Examples Overview](https://fairforge.alquimia.ai/examples/overview.md): Complete examples for Fair Forge usage
- [Attributions](https://fairforge.alquimia.ai/explainability/attributions.md): Compute and interpret token attributions for language models
- [Overview](https://fairforge.alquimia.ai/explainability/overview.md): Understand model decisions with token attribution analysis
- [BaseGenerator](https://fairforge.alquimia.ai/generators/base-generator.md): The main class for generating synthetic test datasets
- [Context Loaders](https://fairforge.alquimia.ai/generators/context-loaders.md): Load and chunk documentation for test generation
- [Generators Overview](https://fairforge.alquimia.ai/generators/overview.md): Generate synthetic test datasets from your documentation
- [Selection Strategies](https://fairforge.alquimia.ai/generators/strategies.md): Control how chunks are selected for test generation
- [Installation](https://fairforge.alquimia.ai/installation.md): Install Fair Forge and its dependencies
- [Introduction](https://fairforge.alquimia.ai/introduction.md): Fair Forge is a comprehensive performance-measurement library for evaluating AI models and assistants
- [Agentic](https://fairforge.alquimia.ai/metrics/agentic.md): Evaluate AI agent responses with pass@K metrics, tool correctness, and pluggable statistical modes
- [BestOf](https://fairforge.alquimia.ai/metrics/best-of.md): Tournament-style evaluation to compare multiple AI assistants
- [Bias](https://fairforge.alquimia.ai/metrics/bias.md): Detect bias across protected attributes using guardian models with pluggable statistical modes
- [Context](https://fairforge.alquimia.ai/metrics/context.md): Evaluate how well AI responses align with provided context, with session-level aggregation and pluggable statistical modes
- [Conversational](https://fairforge.alquimia.ai/metrics/conversational.md): Evaluate dialogue quality using Grice's Maxims with session-level aggregation and pluggable statistical modes
- [Humanity](https://fairforge.alquimia.ai/metrics/humanity.md): Analyze emotional depth and human-likeness of AI responses
- [Metrics Overview](https://fairforge.alquimia.ai/metrics/overview.md): Overview of all available metrics in Fair Forge
- [Prompt Evaluator](https://fairforge.alquimia.ai/metrics/prompt-evaluator.md): Score a system prompt using distributional signals: consistency, semantic entropy, and optional reference similarity
- [Regulatory](https://fairforge.alquimia.ai/metrics/regulatory.md): Evaluate AI responses against regulatory compliance using RAG-based retrieval, reranking, and pluggable statistical modes
- [Toxicity](https://fairforge.alquimia.ai/metrics/toxicity.md): Measure toxic language with clustering and demographic group profiling
- [Vision](https://fairforge.alquimia.ai/metrics/vision.md): Evaluate Vision Language Model (VLM) scene descriptions using semantic similarity against human-annotated ground truth
- [GEPA](https://fairforge.alquimia.ai/prompt-optimizer/gepa.md): Iteratively improve system prompts by learning from failures
- [MIPROv2](https://fairforge.alquimia.ai/prompt-optimizer/miprov2.md): Optimize instruction and few-shot examples together using Bayesian search
- [Prompt Optimizer](https://fairforge.alquimia.ai/prompt-optimizer/overview.md): Automatically improve AI agent system prompts using GEPA and MIPROv2
- [Quickstart](https://fairforge.alquimia.ai/quickstart.md): Get started with Fair Forge in minutes
- [AlquimiaRunner](https://fairforge.alquimia.ai/runners/alquimia-runner.md): Execute tests against Alquimia AI agents
- [Custom Runners](https://fairforge.alquimia.ai/runners/custom-runners.md): Create runners for any AI system
- [Runners Overview](https://fairforge.alquimia.ai/runners/overview.md): Execute test datasets against AI systems

## OpenAPI Specs

- [openapi](https://fairforge.alquimia.ai/api-reference/openapi.json)