Two code heaps#129369
Draft
AndyAyersMS wants to merge 6 commits into
Draft
Conversation
Squash-merge of amanasifkhalid:two-code-heaps#a676f29 onto current main. Single WIP commit, 13 months stale. Adds a per-LoaderAllocator separation between optimized (Tier1+) and non-optimized (Tier0/MinOpts/Instrumented) code heaps: * New CodeHeapRequestInfo::m_isOptimizedCode flag + IsOptimizedCode/ SetOptimizedCode accessors (PascalCase, matching HEAD's existing accessor convention). * Set via !pMD->IsJitOptimizationDisabled() in AllocCode (so Tier1, fully-optimized non-tiered, etc. land in optimized heaps; Tier0, Tier0-Instrumented, MinOpts in regular). * RangeSection::RANGE_SECTION_OPTIMIZEDCODE flag (assigned 0x20 since HEAD took 0x10 for VIRTUALIP); stamped on heaps in NewCodeHeap. * New LoaderAllocator cache pointers m_pLastUsedOptimizedCodeHeap and m_pLastUsedDynamicOptimizedCodeHeap; optimized-code allocations use the new caches. * CanUseCodeHeap rejects a heap up front if its OPTIMIZEDCODE flag doesn't match the request (moved from aman's tail-of-function retVal-based check to a top-of-function guard since HEAD's version returns directly from each success branch rather than using retVal). * m_isOptimizedCode initialized to false in the CodeHeapRequestInfo(MethodDesc*, LoaderAllocator*, BYTE*, BYTE*) ctor (the others delegate to it). Aman-side fixups during merge: * aman's lowercase getRequestSize/setRequestSize -> HEAD's PascalCase GetRequestSize/SetRequestSize. * aman's inline-defined ctors in codeman.h dropped; HEAD's separate ctor declarations + definitions in codeman.cpp are used and a single initializer added for m_isOptimizedCode. * aman's pInfo->m_pAllocator-> direct field access rewritten as pInfo->GetAllocator()->... (m_pAllocator is private in HEAD). * aman's RANGE_SECTION_OPTIMIZEDCODE value 0x10 collides with HEAD's new RANGE_SECTION_VIRTUALIP=0x10; moved to 0x20. Status carryovers from the original WIP: * The dynamic (LCG) optimized-heap path remains commented out - m_pLastUsedDynamicOptimizedCodeHeap is declared and zero-initialized but never read/written. * CanUseCodeHeap does a FindCodeRange lookup per allocation; could be optimized by storing the flag directly on HeapList. * No tests, no telemetry, no policy knob. Not yet built.
…he slot LCG / DynamicMethod methods are not tiering-eligible: MethodDesc:: IsJitOptimizationDisabled() returns the same answer for every LCG method within a single process (the per-method branch is gated on !IsNoMetadata() which is false for LCG; the chunk-wide branch is process- or module-global). There's no way to end up with mixed-optimization LCG code in one process, so splitting the dynamic code heap by optimization level would create at most one pool of each kind anyway. Cleanup: * In AllocCode, also gate SetOptimizedCode() on !IsDynamicDomain() so LCG requests never get the optimized classification. The dynamic code heap is therefore never tagged with RANGE_SECTION_OPTIMIZEDCODE. * Drop the m_pLastUsedDynamicOptimizedCodeHeap field on LoaderAllocator and its initialization, plus the two commented-out blocks in AllocCodeWorker that would have used it. * Strengthen the NewCodeHeap assertion: with the gating above, optimized code is mutually exclusive with both interpreter and dynamic-domain. Asserting both makes the invariant explicit. * Add a TODO note in CanUseCodeHeap explaining why the optimized-flag check is harmless for interpreter/LCG (they never carry the flag) and pointing at the FindCodeRange-lookup-per-allocation optimization for future work (caching the bit on HeapList itself). Build: build.cmd -s clr.runtime+clr.corelib -c Checked -> 0 errors, 0 warnings (1m23s).
… off) Adds a new INTERNAL retail config knob: DOTNET_SeparateOptimizedCodeHeaps (default 0) When set to a non-zero value, JIT'd methods that are not optimization- disabled get their own per-LoaderAllocator code heap, separate from Tier0/MinOpts/Tier0Instrumented code. When 0 (the default), all JIT'd code shares a single heap per LoaderAllocator, matching pre-change behavior. LCG and interpreter heaps are unaffected either way. In AllocCode, SetOptimizedCode() is now gated on three conditions: * !requestInfo.IsDynamicDomain() (LCG isn't tiering-eligible) * !pMD->IsJitOptimizationDisabled() (the actual opt classification) * DOTNET_SeparateOptimizedCodeHeaps != 0 (opt-in) Build clean; smoke tests pass in both modes: default -> ArrBoundBinaryOp PASSED, smoke exit=100 opt-in (=1) -> ArrBoundBinaryOp PASSED, smoke exit=100
Previously CanUseCodeHeap consulted ExecutionManager::FindCodeRange on every cache check to read the RangeSection's RANGE_SECTION_OPTIMIZEDCODE bit. The flag is set once at heap creation in NewCodeHeap and never changes, so cache it directly on the HeapList instead. * HeapList gets a new trailing bool isOptimizedCode field. Trailing placement means FakeHeapList/cDAC layout assertions are undisturbed (none of the mirrored consumers read this field). * NewCodeHeap stamps the cached flag from the same expression that builds the RangeSection flags. * Both HeapList constructors (LoaderCodeHeap::CreateCodeHeap and HostCodeHeap::InitializeHeapList) zero-initialize the field so NewCodeHeap's subsequent assignment is always meaningful, even on the path where the heap is created without the optimized flag. * CanUseCodeHeap drops the FindCodeRange lookup and reads pCodeHeap->isOptimizedCode directly. Build clean; smoke + ArrBoundBinaryOp PASSED in both default and DOTNET_SeparateOptimizedCodeHeaps=1 modes.
Predicate (!= 0 enables) unchanged; only the default value moves.
Contributor
|
Tagging subscribers to this area: @agocke |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR introduces an “optimized code” classification for JIT code heap allocations and adds a separate per-LoaderAllocator “last used heap” cache intended to keep optimized and non-optimized JIT’d code in different heaps, gated by a new runtime config switch.
Changes:
- Adds a new
CodeHeapRequestInfoflag and aRangeSectionflag (RANGE_SECTION_OPTIMIZEDCODE) plus a cachedHeapList::isOptimizedCodebit to prevent heap/request mismatches. - Adds
LoaderAllocator::m_pLastUsedOptimizedCodeHeapand updatesAllocCodeWorkerto use it whenIsOptimizedCode()is set. - Adds a new runtime config knob
SeparateOptimizedCodeHeaps.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/loaderallocator.hpp | Adds a new per-LoaderAllocator cache slot for “optimized” code heaps. |
| src/coreclr/vm/loaderallocator.cpp | Initializes the new optimized code heap cache pointer. |
| src/coreclr/vm/dynamicmethod.cpp | Initializes the new HeapList::isOptimizedCode field for host code heaps. |
| src/coreclr/vm/codeman.h | Adds CodeHeapRequestInfo::m_isOptimizedCode, HeapList::isOptimizedCode, and the new RangeSection flag bit. |
| src/coreclr/vm/codeman.cpp | Tags new code heaps with the optimized flag, selects the optimized heap cache in allocation, and gates “optimized heap” routing on a new config switch. |
| src/coreclr/inc/clrconfigvalues.h | Adds the SeparateOptimizedCodeHeaps config entry. |
Comment on lines
521
to
525
| #ifdef FEATURE_PGO | ||
| RETAIL_CONFIG_STRING_INFO(INTERNAL_PGODataPath, W("PGODataPath"), "Read/Write PGO data from/to the indicated file.") | ||
| RETAIL_CONFIG_DWORD_INFO(INTERNAL_ReadPGOData, W("ReadPGOData"), 0, "Read PGO data") | ||
| RETAIL_CONFIG_DWORD_INFO(INTERNAL_WritePGOData, W("WritePGOData"), 0, "Write PGO data") | ||
| RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TieredPGO, W("TieredPGO"), 1, "Instrument Tier0 code and make counts available to Tier1") |
Comment on lines
+3264
to
+3266
| if (!requestInfo.IsDynamicDomain() | ||
| && !pMD->IsJitOptimizationDisabled() | ||
| && CLRConfig::GetConfigValue(CLRConfig::INTERNAL_SeparateOptimizedCodeHeaps) != 0) |
Comment on lines
+3259
to
+3263
| // Optionally route optimized (Tier1+) code to its own per-LoaderAllocator | ||
| // heap. LCG (dynamic-domain) methods are excluded because they are not | ||
| // tiering-eligible: every LCG method within a single process uses the same | ||
| // JIT optimization level, so splitting their heap would create at most one | ||
| // pool of each kind anyway. Gated off by default (DOTNET_SeparateOptimizedCodeHeaps). |
* Move INTERNAL_SeparateOptimizedCodeHeaps declaration outside the
#ifdef FEATURE_PGO block. Browser-wasm and iOS simulator builds
disable FEATURE_PGO, so the unconditional reference from codeman.cpp
was breaking those builds:
src/coreclr/vm/codeman.cpp:3266:49: error: no member named
'INTERNAL_SeparateOptimizedCodeHeaps' in 'CLRConfig'
* Exclude interpreted requests from SetOptimizedCode(). When
FEATURE_INTERPRETER is on, AllocCode<InterpreterCodeHeader> sets
requestInfo.SetInterpreted() earlier; the new gate then allowed
SetOptimizedCode() to also be set, which violates the mutual-exclusion
invariant asserted in NewCodeHeap (_ASSERTE(!pInfo->IsInterpreted())
inside the IsOptimizedCode() branch).
* Reword the comment: the split is keyed off
MethodDesc::IsJitOptimizationDisabled() (attributes / global debug
flags / minopts), not the current compilation tier. The previous
'Tier1+' wording was misleading.
Build clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.