How REX Validates OpenMP Semantic Analysis With `checkOmpAnalyzing`
checkOmpAnalyzing is the narrow semantic-analysis layer inside REX’s OpenMP test stack. Unlike the broad OpenMP_tests AST-only corpus, it does not diff generated source. Unlike lowering tests, it does not inspect runtime artifacts. It runs the frontend with -rose:openmp:analyzing -rose:skipfinalCompileStep, then queries the constructed OpenMP AST directly and asserts semantic-analysis facts such as: default schedules are represented as schedule(static) without inventing modifiers or chunk sizes, dynamic or guided schedules synthesize chunk size 1, and target parallel for gains the expected implicit map clause. This makes it the cheapest layer for catching analysis drift that is too semantic for frontend output diffs but too early to involve lowering.

The previous post in this series covered the broad OpenMP_tests frontend corpus: parseOmp in ast_only mode, hundreds of C and C++ cases, a mixed OpenMP/OpenACC slice, and a separate Fortran slice, all aimed at one question:
can the frontend build and unparse OpenMP ASTs correctly across a large corpus?
That still leaves a gap.
Some OpenMP regressions are too semantic for an AST-only text diff to catch cleanly, but they are still much earlier than lowering.
A schedule clause can be parsed and unparsed correctly while still carrying the wrong defaulted meaning in the AST.
A target parallel for can look structurally fine while still missing a synthesized implicit map clause that later stages rely on.
Those are not parser failures. They are not basic AST-construction failures either. They are semantic-analysis failures.
REX’s answer to that gap is the focused driver checkOmpAnalyzing.C.
This post is about that layer alone: what it runs, what it checks, and why it deserves to sit between the broad frontend corpus and the later lowering tests.
Figure 1. checkOmpAnalyzing is narrower than the AST-only corpus and earlier than lowering. It exists to check semantic-analysis invariants directly on the OpenMP AST.
Why This Layer Exists Separately
The distinction from the previous post is crucial.
The broad OpenMP_tests corpus asks:
- did the frontend survive the input?
- did it build an OpenMP AST?
- did the unparser emit the expected directive-bearing lines?
That is a strong layer, but it still treats the AST mostly as something that eventually becomes output text.
checkOmpAnalyzing asks a different question:
- after the OpenMP semantic-analysis pass has run, does the AST now carry the specific semantic facts that later stages depend on?
That means it can catch failures such as:
- a default loop schedule not being represented explicitly enough,
- a dynamic schedule failing to acquire the right default chunk size,
- a Fortran do loop not receiving the same default-schedule treatment as the C for case,
- or a target offload construct failing to synthesize an implicit map entry for a referenced array.
Those failures are too semantic for a lightweight frontend output diff and too early to drag in lowering.
That is exactly why this layer exists.
What Actually Runs
The CMake registration is intentionally small and targeted.
First, the harness defines the analysis-mode flags: -rose:openmp:analyzing together with -rose:skipfinalCompileStep.
Then it registers four focused tests, one per --rex-check mode: the default schedule for a C for loop, the default schedule for a Fortran do loop, default chunk synthesis for a dynamic schedule, and implicit map synthesis for a target construct.
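The post does not reproduce the CMake text itself. A registration of roughly this shape is plausible; the variable name, test names, and argument order below are illustrative guesses, not REX’s actual CMakeLists:

```cmake
# Illustrative sketch only: names and structure are assumptions.
set(OMP_ANALYZING_FLAGS -rose:openmp:analyzing -rose:skipfinalCompileStep)

add_test(NAME schedule-default-for
         COMMAND checkOmpAnalyzing --rex-check=schedule-default-for
                 ${OMP_ANALYZING_FLAGS} ompfor-default.c)
```

The point of the sketch is the shape: one add_test per semantic property, all sharing the same analysis-mode flags.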
That list already tells you what kind of layer this is.
This is not a wide corpus. It is a focused semantic regression harness. Each case exists because a specific semantic-analysis property matters to later passes.
The driver itself is also thin, but in a very different way from parseOmp.
It accepts a custom flag, --rex-check, which selects the semantic property to assert.
It strips that flag from the argument list before calling the standard ROSE frontend.
Then it dispatches to one specific AST query routine for the selected check.
The important thing is what does not happen here:
- no rose_* text diff,
- no lowering,
- no runtime code generation,
- no execution.
This layer passes or fails based entirely on semantic assertions over the AST.
Figure 2. The analyzer harness is intentionally direct. Run semantic analysis, query the AST, assert one narrow property, and return a test result without involving lowering or runtime artifacts.
Why -rose:openmp:analyzing Matters
The flag choice here is the whole point of the layer.
-rose:openmp:ast_only from the previous post stops after building the OpenMP AST and then validates how that AST unparses.
-rose:openmp:analyzing goes further. It runs the OpenMP semantic-analysis stage that enriches or normalizes the AST in ways later passes depend on.
That means the AST being queried here is no longer just “the parser and constructor’s first draft.” It is the AST after analysis has had a chance to apply OpenMP-specific defaults and semantic interpretations.
That is exactly why checks like default schedule handling belong here instead of in the parser or AST-only layers.
The additional -rose:skipfinalCompileStep is also important. This layer does not care whether the backend would compile generated output. It cares only about whether the analyzed AST holds the right semantic facts. Skipping the final compile step keeps the feedback loop short and makes failures easier to attribute.
The Core Design: Direct AST Assertions
The strongest thing about checkOmpAnalyzing is that it does not translate its question into text if text is not the right comparison surface.
Instead, it queries the AST directly.
For example, the shared helper for schedule-default checks starts by finding all loop nodes of the relevant kind in the AST. Then, for each loop, it collects the attached schedule clauses and asserts on their fields.
This is the right way to test semantic-analysis behavior.
If the question is “did analysis attach exactly one schedule clause with the right defaulted meaning?”, then the correct test surface is the AST node and its fields, not the unparsed string.
That is the key design principle of this layer:
ask the AST what it means, not what it happens to print.
Check 1: Default Schedule For C for
The first focused regression is schedule-default-for, driven by the input file ompfor-default.c.
Notice what is not present: there is no explicit schedule(...) clause in the source.
The analyzer check expects the semantic-analysis pass to make the default schedule behavior explicit in the AST as one SgOmpScheduleClause with:
- kind static,
- modifier1 unspecified,
- and no synthesized chunk size.
That expectation is encoded directly in the driver’s assertions over the clause fields.
This is an excellent example of why the layer exists.
An AST-only output diff might still look acceptable even if the schedule semantics inside the AST were slightly wrong. But lowering and later analyses would care very much. So the semantic-analysis layer checks the AST fields directly.
Check 2: Default Schedule For Fortran do
The second schedule-default check is the Fortran analog, driven by fortran/ompdo-default.f.
The test reuses the same helper but switches the queried node kind from V_SgOmpForStatement to V_SgOmpDoStatement.
That detail matters.
It shows that the semantic-analysis layer is not only checking “OpenMP in general.” It is checking that the language-unified OpenMP pipeline applies equivalent semantic defaults across the different AST forms used for different base languages.
This is exactly the kind of regression a compiler can accumulate quietly if it only tests one language deeply and the other superficially.
Check 3: Dynamic Schedule Chunk Synthesis
The third focused regression is schedule-dynamic-for, driven by ompfor4.c.
Here the source does specify a schedule kind, but it does not specify a chunk size.
The analyzer check expects semantic analysis to synthesize the default chunk size 1 for dynamic or guided schedules.
It also insists that the schedule modifiers remain unspecified.
This is a perfect semantic-analysis test case because it exercises a default that is meaningful to later stages but not explicit in the source text.
If the analyzer forgets this defaulting step, the bug may not show up clearly in a frontend output diff, but it will absolutely matter to code generation and execution.
Check 4: Implicit Target Map Synthesis
The fourth focused regression is different in character. It is not about schedule defaults; it is about implicit mapping semantics for target offloading.
The input analyzing_target_implicit_map.c is intentionally simple.
There is no explicit map(...) clause in the source. The analyzer layer expects that semantic analysis will synthesize the implicit mapping information needed for the referenced array a.
The check works by finding SgOmpTargetParallelForStatement nodes and then inspecting their SgOmpMapClause contents.
Then it walks the mapped expressions and verifies that a is present.
Again, this is precisely the kind of property that is too semantic for a text-only frontend diff and too early to wait for lowering to reveal.
Why This Layer Is Narrow On Purpose
Compared with the hundreds of cases in the broad AST-only corpus, checkOmpAnalyzing looks tiny.
That is not a weakness. It is the whole design.
This layer is not trying to be a corpus. It is trying to be a set of high-value semantic assertions over analysis outcomes that are subtle enough to need direct AST inspection.
That is why the suite has only a few checks:
- each one is concrete,
- each one is cheap to run,
- and each one maps directly onto a semantic-analysis rule that later stages rely on.
In other words, this is not breadth testing. It is semantic spot-checking at exactly the right layer.
Figure 3. The analyzer layer stays narrow on purpose. Each --rex-check mode corresponds to one explicit semantic-analysis property over the OpenMP AST.
What This Layer Catches That The AST-Only Corpus Does Not
This boundary is worth stating directly.
The AST-only corpus asks:
- did the frontend build the AST?
- did the unparser emit the expected directive-bearing lines?
The analyzer layer asks:
- did semantic analysis enrich that AST with the correct defaults and implicit semantic facts?
That difference matters because some bugs are invisible or ambiguous at the text level:
- the AST may print a reasonable loop construct while still lacking the correct synthesized schedule information,
- a target construct may still unparse without the AST carrying the implicit map clause later passes need,
- a Fortran do construct may still appear valid while differing subtly from the C for case in its analyzed state.
Those are exactly the failures checkOmpAnalyzing is designed to catch.
What This Layer Catches Earlier Than Lowering
It is equally important to understand why this layer exists before lowering.
Once lowering starts, semantic-analysis mistakes become harder to see in isolation because they are mixed with:
- outlining,
- runtime argument construction,
- helper emission,
- and all the structural rewriting that offloading introduces.
If a synthesized implicit map is missing, lowering may eventually fail or generate the wrong runtime code. But by then you are debugging too late.
checkOmpAnalyzing catches that missing map at the moment it matters most:
- after analysis should have created it,
- before lowering can hide the root cause behind secondary symptoms.
That is the exact advantage of a layered compiler test stack.
Why This Layer Fits REX’s Architecture So Well
This suite works well in REX because the OpenMP pipeline is already staged:
- preserved directive text,
- OpenMPIR,
- SgOmp* AST nodes,
- semantic analysis,
- lowering,
- runtime glue,
- execution.
checkOmpAnalyzing is simply the test layer that matches one of those stages honestly.
It does not pretend AST construction and semantic analysis are the same thing. It does not wait for lowering to reveal earlier mistakes. It does not collapse semantic-analysis validation into a text-diff workflow that was designed for a different layer.
That is good compiler engineering.
The Real Value Of checkOmpAnalyzing
The deepest value of this layer is not that it has four tests. It is that it validates the compiler’s semantic-analysis stage in the way that stage deserves to be validated:
- direct AST inspection,
- narrow semantic questions,
- cheap execution,
- and failures that point to one stage instead of three.
That is what makes the suite valuable.
Parser tests are too early. AST-only output diffs are too textual. Lowering tests are later and noisier.
checkOmpAnalyzing occupies the exact middle ground for analysis-driven regressions:
small, sharp, and tied directly to semantic meaning inside the AST.
That is why this layer deserves its own post, and it is why the broader OpenMP test stack in REX makes more sense once you see this focused checkpoint sitting in the middle of it.