Skip to content

Two code heaps#129369

Draft
AndyAyersMS wants to merge 6 commits into
dotnet:mainfrom
AndyAyersMS:two-code-heaps
Draft

Two code heaps#129369
AndyAyersMS wants to merge 6 commits into
dotnet:mainfrom
AndyAyersMS:two-code-heaps

Conversation

@AndyAyersMS

Copy link
Copy Markdown
Member

No description provided.

Squash-merge of amanasifkhalid:two-code-heaps#a676f29 onto current main.
Single WIP commit, 13 months stale.

Adds a per-LoaderAllocator separation between optimized (Tier1+) and
non-optimized (Tier0/MinOpts/Instrumented) code heaps:

* New CodeHeapRequestInfo::m_isOptimizedCode flag + IsOptimizedCode/
  SetOptimizedCode accessors (PascalCase, matching HEAD's existing
  accessor convention).
* Set via !pMD->IsJitOptimizationDisabled() in AllocCode (so Tier1,
  fully-optimized non-tiered, etc. land in optimized heaps; Tier0,
  Tier0-Instrumented, MinOpts in regular).
* RangeSection::RANGE_SECTION_OPTIMIZEDCODE flag (assigned 0x20 since
  HEAD took 0x10 for VIRTUALIP); stamped on heaps in NewCodeHeap.
* New LoaderAllocator cache pointers m_pLastUsedOptimizedCodeHeap and
  m_pLastUsedDynamicOptimizedCodeHeap; optimized-code allocations use
  the new caches.
* CanUseCodeHeap rejects a heap up front if its OPTIMIZEDCODE flag
  doesn't match the request (moved from aman's tail-of-function
  retVal-based check to a top-of-function guard since HEAD's version
  returns directly from each success branch rather than using retVal).
* m_isOptimizedCode initialized to false in the
  CodeHeapRequestInfo(MethodDesc*, LoaderAllocator*, BYTE*, BYTE*)
  ctor (the others delegate to it).

Aman-side fixups during merge:
* aman's lowercase getRequestSize/setRequestSize -> HEAD's PascalCase
  GetRequestSize/SetRequestSize.
* aman's inline-defined ctors in codeman.h dropped; HEAD's separate
  ctor declarations + definitions in codeman.cpp are used and a single
  initializer added for m_isOptimizedCode.
* aman's pInfo->m_pAllocator-> direct field access rewritten as
  pInfo->GetAllocator()->... (m_pAllocator is private in HEAD).
* aman's RANGE_SECTION_OPTIMIZEDCODE value 0x10 collides with HEAD's
  new RANGE_SECTION_VIRTUALIP=0x10; moved to 0x20.

Status carryovers from the original WIP:
* The dynamic (LCG) optimized-heap path remains commented out -
  m_pLastUsedDynamicOptimizedCodeHeap is declared and zero-initialized
  but never read/written.
* CanUseCodeHeap does a FindCodeRange lookup per allocation; could be
  optimized by storing the flag directly on HeapList.
* No tests, no telemetry, no policy knob.

Not yet built.
…he slot

LCG / DynamicMethod methods are not tiering-eligible: MethodDesc::
IsJitOptimizationDisabled() returns the same answer for every LCG method
within a single process (the per-method branch is gated on !IsNoMetadata()
which is false for LCG; the chunk-wide branch is process- or module-global).
There's no way to end up with mixed-optimization LCG code in one process,
so splitting the dynamic code heap by optimization level would create at
most one pool of each kind anyway.

Cleanup:

* In AllocCode, also gate SetOptimizedCode() on !IsDynamicDomain() so LCG
  requests never get the optimized classification. The dynamic code heap
  is therefore never tagged with RANGE_SECTION_OPTIMIZEDCODE.

* Drop the m_pLastUsedDynamicOptimizedCodeHeap field on LoaderAllocator
  and its initialization, plus the two commented-out blocks in
  AllocCodeWorker that would have used it.

* Strengthen the NewCodeHeap assertion: with the gating above, optimized
  code is mutually exclusive with both interpreter and dynamic-domain.
  Asserting both makes the invariant explicit.

* Add a TODO note in CanUseCodeHeap explaining why the optimized-flag
  check is harmless for interpreter/LCG (they never carry the flag) and
  pointing at the FindCodeRange-lookup-per-allocation optimization for
  future work (caching the bit on HeapList itself).

Build: build.cmd -s clr.runtime+clr.corelib -c Checked -> 0 errors,
0 warnings (1m23s).
… off)

Adds a new INTERNAL retail config knob:
  DOTNET_SeparateOptimizedCodeHeaps  (default 0)

When set to a non-zero value, JIT'd methods that are not optimization-
disabled get their own per-LoaderAllocator code heap, separate from
Tier0/MinOpts/Tier0Instrumented code. When 0 (the default), all JIT'd
code shares a single heap per LoaderAllocator, matching pre-change
behavior. LCG and interpreter heaps are unaffected either way.

In AllocCode, SetOptimizedCode() is now gated on three conditions:
  * !requestInfo.IsDynamicDomain()  (LCG isn't tiering-eligible)
  * !pMD->IsJitOptimizationDisabled()  (the actual opt classification)
  * DOTNET_SeparateOptimizedCodeHeaps != 0  (opt-in)

Build clean; smoke tests pass in both modes:
  default       -> ArrBoundBinaryOp PASSED, smoke exit=100
  opt-in (=1)   -> ArrBoundBinaryOp PASSED, smoke exit=100
Previously CanUseCodeHeap consulted ExecutionManager::FindCodeRange on
every cache check to read the RangeSection's RANGE_SECTION_OPTIMIZEDCODE
bit. The flag is set once at heap creation in NewCodeHeap and never
changes, so cache it directly on the HeapList instead.

* HeapList gets a new trailing bool isOptimizedCode field. Trailing
  placement means FakeHeapList/cDAC layout assertions are undisturbed
  (none of the mirrored consumers read this field).
* NewCodeHeap stamps the cached flag from the same expression that
  builds the RangeSection flags.
* Both HeapList constructors (LoaderCodeHeap::CreateCodeHeap and
  HostCodeHeap::InitializeHeapList) zero-initialize the field so
  NewCodeHeap's subsequent assignment is always meaningful, even on
  the path where the heap is created without the optimized flag.
* CanUseCodeHeap drops the FindCodeRange lookup and reads
  pCodeHeap->isOptimizedCode directly.

Build clean; smoke + ArrBoundBinaryOp PASSED in both default and
DOTNET_SeparateOptimizedCodeHeaps=1 modes.
Predicate (!= 0 enables) unchanged; only the default value moves.
@dotnet-policy-service

Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an “optimized code” classification for JIT code heap allocations and adds a separate per-LoaderAllocator “last used heap” cache intended to keep optimized and non-optimized JIT’d code in different heaps, gated by a new runtime config switch.

Changes:

  • Adds a new CodeHeapRequestInfo flag and a RangeSection flag (RANGE_SECTION_OPTIMIZEDCODE) plus a cached HeapList::isOptimizedCode bit to prevent heap/request mismatches.
  • Adds LoaderAllocator::m_pLastUsedOptimizedCodeHeap and updates AllocCodeWorker to use it when IsOptimizedCode() is set.
  • Adds a new runtime config knob SeparateOptimizedCodeHeaps.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/coreclr/vm/loaderallocator.hpp Adds a new per-LoaderAllocator cache slot for “optimized” code heaps.
src/coreclr/vm/loaderallocator.cpp Initializes the new optimized code heap cache pointer.
src/coreclr/vm/dynamicmethod.cpp Initializes the new HeapList::isOptimizedCode field for host code heaps.
src/coreclr/vm/codeman.h Adds CodeHeapRequestInfo::m_isOptimizedCode, HeapList::isOptimizedCode, and the new RangeSection flag bit.
src/coreclr/vm/codeman.cpp Tags new code heaps with the optimized flag, selects the optimized heap cache in allocation, and gates “optimized heap” routing on a new config switch.
src/coreclr/inc/clrconfigvalues.h Adds the SeparateOptimizedCodeHeaps config entry.

Comment on lines 521 to 525
#ifdef FEATURE_PGO
RETAIL_CONFIG_STRING_INFO(INTERNAL_PGODataPath, W("PGODataPath"), "Read/Write PGO data from/to the indicated file.")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_ReadPGOData, W("ReadPGOData"), 0, "Read PGO data")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_WritePGOData, W("WritePGOData"), 0, "Write PGO data")
RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TieredPGO, W("TieredPGO"), 1, "Instrument Tier0 code and make counts available to Tier1")
Comment on lines +3264 to +3266
if (!requestInfo.IsDynamicDomain()
&& !pMD->IsJitOptimizationDisabled()
&& CLRConfig::GetConfigValue(CLRConfig::INTERNAL_SeparateOptimizedCodeHeaps) != 0)
Comment thread src/coreclr/vm/codeman.cpp Outdated
Comment on lines +3259 to +3263
// Optionally route optimized (Tier1+) code to its own per-LoaderAllocator
// heap. LCG (dynamic-domain) methods are excluded because they are not
// tiering-eligible: every LCG method within a single process uses the same
// JIT optimization level, so splitting their heap would create at most one
// pool of each kind anyway. Gated off by default (DOTNET_SeparateOptimizedCodeHeaps).
* Move INTERNAL_SeparateOptimizedCodeHeaps declaration outside the
  #ifdef FEATURE_PGO block. Browser-wasm and iOS simulator builds
  disable FEATURE_PGO, so the unconditional reference from codeman.cpp
  was breaking those builds:
    src/coreclr/vm/codeman.cpp:3266:49: error: no member named
    'INTERNAL_SeparateOptimizedCodeHeaps' in 'CLRConfig'

* Exclude interpreted requests from SetOptimizedCode(). When
  FEATURE_INTERPRETER is on, AllocCode<InterpreterCodeHeader> sets
  requestInfo.SetInterpreted() earlier; the new gate then allowed
  SetOptimizedCode() to also be set, which violates the mutual-exclusion
  invariant asserted in NewCodeHeap (_ASSERTE(!pInfo->IsInterpreted())
  inside the IsOptimizedCode() branch).

* Reword the comment: the split is keyed off
  MethodDesc::IsJitOptimizationDisabled() (attributes / global debug
  flags / minopts), not the current compilation tier. The previous
  'Tier1+' wording was misleading.

Build clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants