How REX Validates OpenMP Semantic Analysis With `checkOmpAnalyzing`
checkOmpAnalyzing is the narrow semantic-analysis layer inside REX’s OpenMP test stack. Unlike the broad OpenMP_tests AST-only corpus, it does not diff generated source. Unlike lowering tests, it does not inspect runtime artifacts. It runs the frontend with -rose:openmp:analyzing -rose:skipfinalCompileStep, then queries the constructed OpenMP AST directly and asserts semantic-analysis facts such as: default schedules are represented as schedule(static) without inventing modifiers or chunk sizes, dynamic or guided schedules synthesize chunk size 1, and target parallel for gains the expected implicit map clause. This makes it the cheapest layer for catching analysis drift that is too semantic for frontend output diffs but too early to involve lowering.

The previous post in this series covered the broad OpenMP_tests frontend corpus: parseOmp in ast_only mode, hundreds of C and C++ cases, a mixed OpenMP/OpenACC slice, and a separate Fortran slice, all aimed at one question:
can the frontend build and unparse OpenMP ASTs correctly across a large corpus?
That still leaves a gap.
Some OpenMP regressions are too semantic for an AST-only text diff to catch cleanly, but they are still much earlier than lowering.
A schedule clause can be parsed and unparsed correctly while still carrying the wrong defaulted meaning in the AST.
A target parallel for can look structurally fine while still missing a synthesized implicit map clause that later stages rely on.
Those are not parser failures. They are not basic AST-construction failures either. They are semantic-analysis failures.
REX’s answer to that gap is the focused driver checkOmpAnalyzing.C.
This post is about that layer alone: what it runs, what it checks, and why it deserves to sit between the broad frontend corpus and the later lowering tests.
Figure 1. checkOmpAnalyzing is narrower than the AST-only corpus and earlier than lowering. It exists to check semantic-analysis invariants directly on the OpenMP AST.
Why This Layer Exists Separately
The distinction from the previous post is crucial.
The broad OpenMP_tests corpus asks:
- did the frontend survive the input?
- did it build an OpenMP AST?
- did the unparser emit the expected directive-bearing lines?
That is a strong layer, but it still treats the AST mostly as something that eventually becomes output text.
checkOmpAnalyzing asks a different question:
- after the OpenMP semantic-analysis pass has run, does the AST now carry the specific semantic facts that later stages depend on?
That means it can catch failures such as:
- a default loop schedule not being represented explicitly enough,
- a dynamic schedule failing to acquire the right default chunk size,
- a Fortran do loop not receiving the same default-schedule treatment as the C for case,
- or a target offload construct failing to synthesize an implicit map entry for a referenced array.
Those failures are too semantic for a lightweight frontend output diff and too early to drag in lowering.
That is exactly why this layer exists.
What Actually Runs
The CMake registration is intentionally small and targeted.
First, the harness defines the analysis-mode flags: -rose:openmp:analyzing together with -rose:skipfinalCompileStep.
Then it registers four focused tests, one per --rex-check mode: the default schedule for a C for loop, the default schedule for a Fortran do loop, default chunk synthesis for a dynamic schedule, and implicit map synthesis for a target construct.
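The post does not reproduce the CMake text itself. A registration of roughly this shape is plausible; the variable name, test names, and argument order below are illustrative guesses, not REX’s actual CMakeLists:

```cmake
# Illustrative sketch only: names and structure are assumptions.
set(OMP_ANALYZING_FLAGS -rose:openmp:analyzing -rose:skipfinalCompileStep)

add_test(NAME schedule-default-for
         COMMAND checkOmpAnalyzing --rex-check=schedule-default-for
                 ${OMP_ANALYZING_FLAGS} ompfor-default.c)
```

The point of the sketch is the shape: one add_test per semantic property, all sharing the same analysis-mode flags.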
That list already tells you what kind of layer this is.
This is not a wide corpus. It is a focused semantic regression harness. Each case exists because a specific semantic-analysis property matters to later passes.
The driver itself is also thin, but in a very different way from parseOmp.
It accepts a custom flag, --rex-check, which selects the semantic property to assert.
It strips that flag from the argument list before calling the standard ROSE frontend.
Then it dispatches to one specific AST query routine for the selected check.
The important thing is what does not happen here:
- no rose_* text diff,
- no lowering,
- no runtime code generation,
- no execution.
This layer passes or fails based entirely on semantic assertions over the AST.
Figure 2. The analyzer harness is intentionally direct. Run semantic analysis, query the AST, assert one narrow property, and return a test result without involving lowering or runtime artifacts.
Why -rose:openmp:analyzing Matters
The flag choice here is the whole point of the layer.
-rose:openmp:ast_only from the previous post stops after building the OpenMP AST and then validates how that AST unparses.
-rose:openmp:analyzing goes further. It runs the OpenMP semantic-analysis stage that enriches or normalizes the AST in ways later passes depend on.
That means the AST being queried here is no longer just “the parser and constructor’s first draft.” It is the AST after analysis has had a chance to apply OpenMP-specific defaults and semantic interpretations.
That is exactly why checks like default schedule handling belong here instead of in the parser or AST-only layers.
The additional -rose:skipfinalCompileStep is also important. This layer does not care whether the backend would compile generated output. It cares only about whether the analyzed AST holds the right semantic facts. Skipping the final compile step keeps the feedback loop short and makes failures easier to attribute.
The Core Design: Direct AST Assertions
The strongest thing about checkOmpAnalyzing is that it does not translate its question into text if text is not the right comparison surface.
Instead, it queries the AST directly.
For example, the shared helper for schedule-default checks starts by finding all loop nodes of the relevant kind in the AST. Then, for each loop, it collects the attached schedule clauses and asserts on their fields.
This is the right way to test semantic-analysis behavior.
If the question is “did analysis attach exactly one schedule clause with the right defaulted meaning?”, then the correct test surface is the AST node and its fields, not the unparsed string.
That is the key design principle of this layer:
ask the AST what it means, not what it happens to print.
Check 1: Default Schedule For C for
The first focused regression is schedule-default-for, driven by the input file ompfor-default.c.
Notice what is not present: there is no explicit schedule(...) clause in the source.
The analyzer check expects the semantic-analysis pass to make the default schedule behavior explicit in the AST as one SgOmpScheduleClause with:
- kind static,
- modifier1 unspecified,
- and no synthesized chunk size.
That expectation is encoded directly in the driver’s assertions over the clause fields.
This is an excellent example of why the layer exists.
An AST-only output diff might still look acceptable even if the schedule semantics inside the AST were slightly wrong. But lowering and later analyses would care very much. So the semantic-analysis layer checks the AST fields directly.
Check 2: Default Schedule For Fortran do
The second schedule-default check is the Fortran analog, driven by fortran/ompdo-default.f.
The test reuses the same helper but switches the queried node kind from V_SgOmpForStatement to V_SgOmpDoStatement.
That detail matters.
It shows that the semantic-analysis layer is not only checking “OpenMP in general.” It is checking that the language-unified OpenMP pipeline applies equivalent semantic defaults across the different AST forms used for different base languages.
This is exactly the kind of regression a compiler can accumulate quietly if it only tests one language deeply and the other superficially.
Check 3: Dynamic Schedule Chunk Synthesis
The third focused regression is schedule-dynamic-for, driven by ompfor4.c.
Here the source does specify a schedule kind, but it does not specify a chunk size.
The analyzer check expects semantic analysis to synthesize the default chunk size 1 for dynamic or guided schedules.
It also insists that the schedule modifiers remain unspecified.
This is a perfect semantic-analysis test case because it exercises a default that is meaningful to later stages but not explicit in the source text.
If the analyzer forgets this defaulting step, the bug may not show up clearly in a frontend output diff, but it will absolutely matter to code generation and execution.
Check 4: Implicit Target Map Synthesis
The fourth focused regression is different in character. It is not about schedule defaults; it is about implicit mapping semantics for target offloading.
The input analyzing_target_implicit_map.c is intentionally simple.
There is no explicit map(...) clause in the source. The analyzer layer expects that semantic analysis will synthesize the implicit mapping information needed for the referenced array a.
The check works by finding SgOmpTargetParallelForStatement nodes and then inspecting their SgOmpMapClause contents.
Then it walks the mapped expressions and verifies that a is present.
Again, this is precisely the kind of property that is too semantic for a text-only frontend diff and too early to wait for lowering to reveal.
Why This Layer Is Narrow On Purpose
Compared with the hundreds of cases in the broad AST-only corpus, checkOmpAnalyzing looks tiny.
That is not a weakness. It is the whole design.
This layer is not trying to be a corpus. It is trying to be a set of high-value semantic assertions over analysis outcomes that are subtle enough to need direct AST inspection.
That is why the suite has only a few checks:
- each one is concrete,
- each one is cheap to run,
- and each one maps directly onto a semantic-analysis rule that later stages rely on.
In other words, this is not breadth testing. It is semantic spot-checking at exactly the right layer.
Figure 3. The analyzer layer stays narrow on purpose. Each --rex-check mode corresponds to one explicit semantic-analysis property over the OpenMP AST.
What This Layer Catches That The AST-Only Corpus Does Not
This boundary is worth stating directly.
The AST-only corpus asks:
- did the frontend build the AST?
- did the unparser emit the expected directive-bearing lines?
The analyzer layer asks:
- did semantic analysis enrich that AST with the correct defaults and implicit semantic facts?
That difference matters because some bugs are invisible or ambiguous at the text level:
- the AST may print a reasonable loop construct while still lacking the correct synthesized schedule information,
- a target construct may still unparse without the AST carrying the implicit map clause later passes need,
- a Fortran do construct may still appear valid while differing subtly from the C for case in its analyzed state.
Those are exactly the failures checkOmpAnalyzing is designed to catch.
What This Layer Catches Earlier Than Lowering
It is equally important to understand why this layer exists before lowering.
Once lowering starts, semantic-analysis mistakes become harder to see in isolation because they are mixed with:
- outlining,
- runtime argument construction,
- helper emission,
- and all the structural rewriting that offloading introduces.
If a synthesized implicit map is missing, lowering may eventually fail or generate the wrong runtime code. But by then you are debugging too late.
checkOmpAnalyzing catches that missing map at the moment it matters most:
- after analysis should have created it,
- before lowering can hide the root cause behind secondary symptoms.
That is the exact advantage of a layered compiler test stack.
Why This Layer Fits REX’s Architecture So Well
This suite works well in REX because the OpenMP pipeline is already staged:
- preserved directive text,
- OpenMPIR,
- SgOmp* AST nodes,
- semantic analysis,
- lowering,
- runtime glue,
- execution.
checkOmpAnalyzing is simply the test layer that matches one of those stages honestly.
It does not pretend AST construction and semantic analysis are the same thing. It does not wait for lowering to reveal earlier mistakes. It does not collapse semantic-analysis validation into a text-diff workflow that was designed for a different layer.
That is good compiler engineering.
The Real Value Of checkOmpAnalyzing
The deepest value of this layer is not that it has four tests. It is that it validates the compiler’s semantic-analysis stage in the way that stage deserves to be validated:
- direct AST inspection,
- narrow semantic questions,
- cheap execution,
- and failures that point to one stage instead of three.
That is what makes the suite valuable.
Parser tests are too early. AST-only output diffs are too textual. Lowering tests are later and noisier.
checkOmpAnalyzing occupies the exact middle ground for analysis-driven regressions:
small, sharp, and tied directly to semantic meaning inside the AST.
That is why this layer deserves its own post, and it is why the broader OpenMP test stack in REX makes more sense once you see this focused checkpoint sitting in the middle of it.