MLX: expose topP / topK / minP / repetitionPenalty via CustomGenerati…#168

Open
SpiraMira wants to merge 3 commits into huggingface:main from SpiraMira:feature/mlx-sampler-params


@SpiraMira

…onOptions

The MLX backend hardcoded sampling parameters in toGenerateParameters / toStructuredGenerateParameters (topP: 1.0, repetitionPenalty: nil, topK/minP at defaults) and never read GenerationOptions.sampling, so callers could only tune temperature and maximumResponseTokens. MLXLMCommon.GenerateParameters already supports the full set.

Add topP / topK / minP / repetitionPenalty / repetitionContextSize to MLXLanguageModel.CustomGenerationOptions (all optional, default nil → existing behavior unchanged) and forward them in both parameter mappers, preserving each path's prior defaults via custom?.field ?? <previous default>.
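The forwarding pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the PR's actual code: `SketchCustomOptions`, `SketchGenerateParameters`, and `makeParameters` are hypothetical stand-ins for `MLXLanguageModel.CustomGenerationOptions`, `MLXLMCommon.GenerateParameters`, and the two mappers. Only `topP: 1.0` and `repetitionPenalty: nil` come from the description; the other fallback values are assumptions chosen for the example.

```swift
// Hypothetical stand-in for MLXLanguageModel.CustomGenerationOptions.
// Every field is optional; nil means "keep the previous default".
struct SketchCustomOptions {
    var topP: Float? = nil
    var topK: Int? = nil
    var minP: Float? = nil
    var repetitionPenalty: Float? = nil
    var repetitionContextSize: Int? = nil
}

// Hypothetical stand-in for MLXLMCommon.GenerateParameters.
struct SketchGenerateParameters {
    var topP: Float
    var topK: Int
    var minP: Float
    var repetitionPenalty: Float?
    var repetitionContextSize: Int
}

// Forward each custom field, falling back to the value the mapper used before.
func makeParameters(custom: SketchCustomOptions?) -> SketchGenerateParameters {
    SketchGenerateParameters(
        topP: custom?.topP ?? 1.0,                    // previous hardcoded value
        topK: custom?.topK ?? 0,                      // assumed default for this sketch
        minP: custom?.minP ?? 0.0,                    // assumed default for this sketch
        repetitionPenalty: custom?.repetitionPenalty, // previous value was nil
        repetitionContextSize: custom?.repetitionContextSize ?? 20 // assumed default
    )
}
```

Passing `nil` (or a custom block with every field left at `nil`) reproduces the earlier behavior, which is what keeps the change backward compatible.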

Implements #165.

Per @pcuenca: looking into bridging issues with the (new) FoundationModels.

SpiraMira and others added 3 commits June 30, 2026 03:42
…onOptions

The MLX backend hardcoded sampling parameters in toGenerateParameters /
toStructuredGenerateParameters (topP: 1.0, repetitionPenalty: nil, topK/minP at
defaults) and never read GenerationOptions.sampling, so callers could only tune
temperature and maximumResponseTokens. MLXLMCommon.GenerateParameters already
supports the full set.

Add topP / topK / minP / repetitionPenalty / repetitionContextSize to
MLXLanguageModel.CustomGenerationOptions (all optional, default nil → existing
behavior unchanged) and forward them in both parameter mappers, preserving each
path's prior defaults via `custom?.field ?? <previous default>`.

Implements huggingface#165.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Reads the core GenerationOptions.sampling (SamplingMode) in toGenerateParameters /
toStructuredGenerateParameters so top-p/top-k/greedy set via the standard sampling
surface reach MLX, not only the custom block. Precedence: custom block wins, then
sampling-derived, then existing default. Seed is not forwarded (no per-call seed in
MLXLMCommon.GenerateParameters). Adds derivation + precedence tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ty gated)

toFoundationModels() adopts the OS 27 GenerationOptions initializer (adds toolCallingMode)
when built with the Xcode 27 SDK, gated by #if compiler(>=6.4) + #available(macOS/iOS/visionOS 27).
Falls back to the existing 26 construction otherwise; deployment floor stays 26. Self-contained
on this branch (no FM-parity mapping dependency).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SpiraMira commented Jul 2, 2026

Author

@pcuenca and @james-333i - my attempt at assuaging Pedro’s concerns...

Additional changes

1. Bridge GenerationOptions.sampling into the MLX sampler parameters

The previous change exposed topP/topK/minP/repetitionPenalty on MLXLanguageModel.CustomGenerationOptions, but the MLX backend read those only from that custom block — it ignored the core GenerationOptions.sampling (SamplingMode) that Apple FoundationModels uses. This meant a caller had to express top-p/top-k twice, in two different shapes.

This commit makes GenerationOptions.sampling a unified sampling surface: toGenerateParameters / toStructuredGenerateParameters now also derive MLX params from SamplingMode:

  • .greedy → temperature 0 (argmax)
  • .random(top: k) → topK
  • .random(probabilityThreshold: p) → topP

Precedence is custom block → sampling-derived → existing default, so anything already setting the custom block is unchanged. The SamplingMode seed is not forwarded (MLXLMCommon.GenerateParameters has no per-call seed). Adds unit tests for the derivation and the precedence.
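The derivation and precedence rule can be illustrated with a small sketch. Everything here is hypothetical: `SketchSamplingMode` only mirrors the three cases listed above, `SketchResolved` and `resolve` are invented names, and the fallbacks (`topP` of 1.0, `topK` left as `nil` to mean "backend default") are assumptions. How the real mappers treat temperature alongside `.greedy` may differ; the sketch simply lets `.greedy` force it to 0.

```swift
// Hypothetical stand-in mirroring the SamplingMode cases named above.
enum SketchSamplingMode {
    case greedy
    case randomTopK(Int)
    case randomTopP(Double)
}

// Hypothetical container for the resolved sampler settings.
struct SketchResolved: Equatable {
    var temperature: Float
    var topK: Int?   // nil stands for "leave the backend default" in this sketch
    var topP: Float
}

func resolve(
    temperature: Float,
    sampling: SketchSamplingMode?,
    customTopK: Int?,
    customTopP: Float?
) -> SketchResolved {
    // Step 1: derive candidate values from the sampling mode.
    var derivedTemperature: Float? = nil
    var derivedTopK: Int? = nil
    var derivedTopP: Float? = nil
    switch sampling {
    case .some(.greedy): derivedTemperature = 0          // argmax
    case .some(.randomTopK(let k)): derivedTopK = k
    case .some(.randomTopP(let p)): derivedTopP = Float(p)
    case .none: break
    }
    // Step 2: custom block wins, then sampling-derived, then the default.
    return SketchResolved(
        temperature: derivedTemperature ?? temperature,
        topK: customTopK ?? derivedTopK,
        topP: customTopP ?? derivedTopP ?? 1.0
    )
}
```

For example, `resolve(temperature: 0.7, sampling: .randomTopP(0.9), customTopK: nil, customTopP: 0.5)` yields a `topP` of 0.5, because the custom value takes priority over the one derived from the sampling mode.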

2. OS 27 toolCallingMode support in the Apple FoundationModels adapter

toFoundationModels() now adopts the OS 27 GenerationOptions initializer (which adds toolCallingMode) when built against the Xcode 27 SDK, guarded by #if compiler(>=6.4) + #available(macOS 27/iOS 27/visionOS 27). On older toolchains it falls back to the existing 26 initializer, so the deployment floor stays 26 and it still builds on Xcode 26. (toolCallingMode is passed nil for now — this is the hook for wiring tool-calling later.)
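The two-level gate (compile-time toolchain check, then runtime OS check) might look something like the sketch below. It does not import FoundationModels and does not reproduce the real initializers: `SketchOptions` and `toSketchOptions` are hypothetical, `toolCallingMode` is modeled as an optional `String` purely as a placeholder, and `usedNewInitializer` exists only to make the chosen path visible.

```swift
// Hypothetical stand-in for the adapter's result; not the FoundationModels type.
struct SketchOptions: Equatable {
    var temperature: Double?
    var toolCallingMode: String?   // placeholder type, assumed for this sketch
    var usedNewInitializer: Bool   // illustration only
}

func toSketchOptions(temperature: Double?) -> SketchOptions {
    #if compiler(>=6.4)
    // Only a toolchain at or above this version compiles the block, so older
    // toolchains never see a reference to the newer initializer.
    if #available(macOS 27, iOS 27, visionOS 27, *) {
        // Runtime check: take the newer path only on a new enough OS.
        return SketchOptions(
            temperature: temperature,
            toolCallingMode: nil,  // passed as nil for now, as in the PR description
            usedNewInitializer: true
        )
    }
    #endif
    // Fallback: the existing construction, used on older toolchains and older OSes.
    return SketchOptions(
        temperature: temperature,
        toolCallingMode: nil,
        usedNewInitializer: false
    )
}
```

The `#if compiler` check keeps the source building on older toolchains, while `#available` keeps the binary safe to run on OS 26, which is why the deployment floor does not have to move.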

NOTE: I am unable to test the OS 27 support.
