Summary
Runtime schema generation via OSW.fetch_schema can write a syntactically invalid osw/model/entity.py and then importlib.reload() it, raising SyntaxError and breaking the process (and every subsequent import osw.core, since osw.core imports osw.model.entity).
The generated offending line:
risk_assessment: RiskAssessmentProcess | None = Field(
default_factory=lambda :RiskAssessmentProcess.parse_obj(<object object at 0x000001...>),
options={'$comment': 'hide inherited property', 'hidden': True},
)
<object object at 0x...> is a Python sentinel repr'd into source code, which is not valid Python.
How it is triggered
It is hit during lazy relation resolution (oold), not only on explicit load_entity:
- An entity with a relation (e.g. a Process with tool/input/output) is loaded.
- Accessing the relation attribute triggers oold __getattribute__ -> _resolve -> the OSW resolver backend OswDefaultBackend.resolve (osw/core.py).
- That backend calls osw_obj.load_entity(OSW.LoadEntityParam(titles=request.iris)) without model_to_use and with autofetch_schema defaulting to True.
- In load_entity, for each category it does if not hasattr(model, cls_name): if param.autofetch_schema: self.fetch_schema(...) where model is osw.model.entity.
- fetch_schema -> _fetch_schema regenerates osw/model/entity.py and calls importlib.reload(model) -> SyntaxError.
Because the resolver passes no model_to_use, any class that is registered in a different module (e.g. a separately published cloud.* model package) is not found via hasattr(osw.model.entity, cls_name), so resolution always falls into the regeneration path.
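The lookup gap described above can be reproduced in isolation. The following is a minimal sketch, not osw code: the two stub modules stand in for osw.model.entity and a separately published model package, and find_model_class is a hypothetical helper introduced only for illustration.

```python
import types
from typing import Iterable, Optional


def find_model_class(cls_name: str, modules: Iterable[types.ModuleType]) -> Optional[type]:
    """Return the first class named `cls_name` found in `modules`, else None."""
    for module in modules:
        candidate = getattr(module, cls_name, None)
        if isinstance(candidate, type):
            return candidate
    return None


# Stand-ins (assumptions): one module playing the role of osw.model.entity,
# one playing the role of an external model package.
entity_module = types.ModuleType("entity_stub")
external_module = types.ModuleType("external_stub")


class RiskAssessmentProcess:
    """Placeholder for a generated model class."""


external_module.RiskAssessmentProcess = RiskAssessmentProcess

# Checking a single module misses the class, which is the condition that
# sends load_entity into the regeneration path.
found_in_entity = hasattr(entity_module, "RiskAssessmentProcess")

# Checking every registered module finds it.
found_anywhere = find_model_class(
    "RiskAssessmentProcess", [entity_module, external_module]
)

print(found_in_entity, found_anywhere)
```

Whether osw keeps a registry of such external modules is not stated in this report; the list passed to find_model_class is assumed.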
Stacktrace
oold/model/v1/__init__.py:569 in __getattribute__ -> self._resolve(iris)
oold/model/v1/__init__.py:632 in _resolve -> resolver.resolve(...)
osw/core.py:138 in resolve -> osw_obj.load_entity(OSW.LoadEntityParam(titles=request.iris))
osw/core.py:1204 in load_entity -> self.fetch_schema(...)
osw/core.py:464 -> 1004 in _fetch_schema -> importlib.reload(model) # osw.model.entity
importlib ... source_to_code
File ".../osw/model/entity.py", line 7429
risk_assessment: RiskAssessmentProcess | None = Field(default_factory=lambda :RiskAssessmentProcess.parse_obj(<object object at 0x...>), options={'$comment': 'hide inherited property', 'hidden': True})
SyntaxError: invalid syntax
datamodel-code-generator also warns while emitting it:
datamodel_code_generator/parser/base.py: UserWarning: Failed to format code:
InvalidInput("Cannot parse for target version Python 3.10: ... RiskAssessmentProcess.parse_obj(<object object at 0x...>) ... ParseError: bad input"). Emitting unformatted output.
i.e. the generator fails to parse its own output but writes it anyway, and osw then imports it.
Root cause (likely)
A hidden/inherited property (risk_assessment: RiskAssessmentProcess, {'hidden': True}) carries a default that is a sentinel object. When the model is regenerated, that sentinel default is rendered into a default_factory=lambda: RiskAssessmentProcess.parse_obj(<object object>) instead of a valid expression. The result is committed to osw/model/entity.py despite the formatter raising InvalidInput.
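To illustrate the difference between a default that can be rendered into source and one that cannot, here is a small sketch. render_default is a hypothetical helper, not part of datamodel-code-generator or osw, and falling back to None is an assumption about what a safe rendering would be.

```python
import ast


def render_default(value: object) -> str:
    """Return source text for `value`.

    Falls back to the text 'None' when repr(value) is not a Python literal,
    which is the case for a bare object() sentinel.
    """
    text = repr(value)
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return "None"
    return text


sentinel = object()  # repr looks like "<object object at 0x...>"

rendered_sentinel = render_default(sentinel)
rendered_options = render_default({"hidden": True})

print(rendered_sentinel)
print(rendered_options)
```

ast.literal_eval only accepts literals and containers of literals, so this check is stricter than "parses as an expression"; a real generator may need to allow other default forms.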
Suggestions
- Don't write generated source if formatting/parse fails (fail fast instead of emitting unparseable code), or validate with ast.parse before replacing entity.py.
- Render sentinel/undefined defaults for hidden properties safely (e.g. None) rather than repr'ing the sentinel object.
- Possibly skip fetch_schema regeneration when a usable model class is already importable from a registered external module.
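The first suggestion could look roughly like the sketch below. write_if_valid is a hypothetical helper and the file name is only a stand-in; how _fetch_schema actually writes entity.py is not shown in this report.

```python
import ast
import os
import tempfile
from pathlib import Path


def write_if_valid(target: Path, source: str) -> None:
    """Replace `target` with `source` only if `source` parses as Python."""
    # Raises SyntaxError before anything is written, so `target` stays untouched.
    ast.parse(source, filename=str(target))

    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(source)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "entity.py"
    write_if_valid(target, "x = 1\n")
    try:
        # Same shape as the broken generated line: a sentinel repr in source.
        write_if_valid(target, "x = f(<object object at 0x0>)\n")
    except SyntaxError:
        pass
    kept = target.read_text(encoding="utf-8")

print(kept)
```

With a guard like this the previous, importable entity.py survives a failed regeneration, so importlib.reload and later imports of osw.core keep working while the generation error is reported.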
Environment
- osw 1.1.2
- datamodel-code-generator 0.51.0
- pydantic 2.13.4
- Python 3.11.6 (Windows)