
Fix missing private clauses in 3D viscous GPU loops #1225

Draft
sbryngelson wants to merge 8 commits into MFlowCode:master from sbryngelson:fix/viscous-3d-gpu-private

Conversation


@sbryngelson sbryngelson commented Feb 21, 2026

User description

Summary

  • The 3D shear stress and bulk stress GPU_PARALLEL_LOOP directives were missing rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their private clauses
  • The corresponding 2D loops already included these variables
  • Without privatization, these variables can cause race conditions when running on GPU
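The directive change itself is small; a minimal sketch of the corrected 3D shear-stress macro follows (the "before" private list is as described in the reviews below in this thread; variable order is illustrative and may differ from the merged code):

```fortran
! Before: only array temporaries were privatized; the mixture scalars
! rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum were shared across threads
$:GPU_PARALLEL_LOOP(collapse=3, private='[alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

! After: loop indices and per-thread scalars are listed explicitly
$:GPU_PARALLEL_LOOP(collapse=3, private='[i, j, k, l, q, rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
```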

Test plan

  • Verify 3D viscous test cases produce correct results on GPU
  • No golden file changes expected (fixes GPU race condition, CPU results unchanged)

🤖 Generated with Claude Code


CodeAnt-AI Description

Fix GPU data races in 3D viscous loops by privatizing loop indices and temporaries

What Changed

  • The 3D shear- and bulk-stress GPU parallel regions now privately scope the loop indices (i, j, k, l, q) and per-thread viscous temporaries (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum and related temporaries), preventing those variables from being shared across GPU threads
  • Privatization was added consistently across all affected 3D viscous loops so GPU runs do not suffer from cross-thread interference
  • CPU execution and numerical results remain unchanged

Impact

✅ Correct 3D viscous results on GPU
✅ Fewer GPU race-condition artifacts in viscous simulations
✅ No change to CPU outputs

💡 Usage Guide

Checking Your Pull Request

Every pull request is automatically reviewed for security issues, infrastructure misconfigurations, and common code problems, helping ensure your changes are solid before they land.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

@codeant-ai codeant-ai bot added the size:XS This PR changes 0-9 lines, ignoring generated files label Feb 21, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.



Copilot AI left a comment


Pull request overview

Fixes a GPU data-race hazard in the 3D viscous stress kernels by aligning their GPU_PARALLEL_LOOP privatization with the already-correct 2D implementations.

Changes:

  • Add missing mixture variables (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) to the private clauses for the 3D shear-stress and bulk-stress GPU parallel loops.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

321-321: LGTM — consider adding q to the private clause to be fully explicit

The added variables (i, j, k, l, rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) correctly mirror the 2D shear-stress loop at line 104 and fix the GPU race condition.

The integer q (declared at line 83) is used as the induction variable of the inner do q = 1, Re_size(i) seq sub-loop (lines 392–396). While OpenACC treats subroutine-local scalars as implicitly private in a parallel region, the project guideline calls for all loop-local variables to be listed explicitly. The same gap exists in the 2D counterpart at line 104.

💡 Suggested addition
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

(The same addition should be applied to the 2D shear/bulk loops at lines 104 and 214, and to the 3D bulk loop at line 430.)

Based on learnings: "Ensure private(...) declarations on all loop-local variables in GPU-accelerated code to prevent unintended data sharing."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulation/m_viscous.fpp` at line 321, Add the induction variable q to
the private clause of the GPU_PARALLEL_LOOP directives to explicitly mark it as
loop-local: update the existing GPU_PARALLEL_LOOP(collapse=3, private='[...]')
that governs the 3D shear-stress loop (and the analogous GPU_PARALLEL_LOOP
directives used for the 2D shear and bulk loops and the 3D bulk loop) so that
'q' is included among the private variables alongside
i,j,k,l,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum,alpha_visc,alpha_rho_visc,Re_visc,tau_Re;
this ensures the inner "do q = 1, Re_size(i)" seq sub-loop uses a private
induction variable and satisfies the project guideline for explicit loop-local
private declarations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/simulation/m_viscous.fpp`:
- Line 430: The OpenACC/OpenMP GPU_PARALLEL_LOOP directive missing the
inner-loop induction variable causes incorrect privatization; update the
directive that currently lists private='[i,j,k,l,rho_visc, gamma_visc,
pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]' to
also include the inner seq-loop induction variable q so it matches the 2D
bulk-stress loop’s private clause and the inner seq loop that uses q (the inner
loop around lines 501–505 referenced in the review).

---

Nitpick comments:
In `@src/simulation/m_viscous.fpp`:
- Line 321: Add the induction variable q to the private clause of the
GPU_PARALLEL_LOOP directives to explicitly mark it as loop-local: update the
existing GPU_PARALLEL_LOOP(collapse=3, private='[...]') that governs the 3D
shear-stress loop (and the analogous GPU_PARALLEL_LOOP directives used for the
2D shear and bulk loops and the 3D bulk loop) so that 'q' is included among the
private variables alongside
i,j,k,l,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum,alpha_visc,alpha_rho_visc,Re_visc,tau_Re;
this ensures the inner "do q = 1, Re_size(i)" seq sub-loop uses a private
induction variable and satisfies the project guideline for explicit loop-local
private declarations.


codecov bot commented Feb 22, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 44.03%. Comparing base (ab5082e) to head (1daf95b).
⚠️ Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
src/common/m_phase_change.fpp 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1225      +/-   ##
==========================================
- Coverage   44.04%   44.03%   -0.01%     
==========================================
  Files          70       70              
  Lines       20499    20498       -1     
  Branches     1993     1994       +1     
==========================================
- Hits         9028     9026       -2     
  Misses      10330    10330              
- Partials     1141     1142       +1     

☔ View full report in Codecov by Sentry.

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 22, 2026

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

105-105: Minor: trailing space before comma in alpha_visc_sum ,alpha_visc.

Lines 105 and 215 have alpha_visc_sum ,alpha_visc (space before the comma), which is inconsistent with the standard alpha_visc_sum, alpha_visc formatting used in the corresponding 3D blocks at lines 322 and 431.

✏️ Proposed fix (both occurrences)
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum ,alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

Also applies to: 215-215

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulation/m_viscous.fpp` at line 105, Fix the minor formatting typo in
the GPU_PARALLEL_LOOP private list: remove the stray space before the comma so
the variables read "alpha_visc_sum, alpha_visc" (consistent with the 3D blocks);
update both occurrences where "alpha_visc_sum ,alpha_visc" appears (lines around
the GPU_PARALLEL_LOOP uses) so the private clause lists "alpha_visc_sum,
alpha_visc" exactly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/simulation/m_viscous.fpp`:
- Line 105: Fix the minor formatting typo in the GPU_PARALLEL_LOOP private list:
remove the stray space before the comma so the variables read "alpha_visc_sum,
alpha_visc" (consistent with the 3D blocks); update both occurrences where
"alpha_visc_sum ,alpha_visc" appears (lines around the GPU_PARALLEL_LOOP uses)
so the private clause lists "alpha_visc_sum, alpha_visc" exactly.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e892dc and 164cda3.

📒 Files selected for processing (1)
  • src/simulation/m_viscous.fpp

@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 164cda3 to 9b3ee97 Compare February 23, 2026 14:48
@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 23, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 23, 2026
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch 2 times, most recently from 6ba8b18 to dd51d68 Compare February 24, 2026 16:03
@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 24, 2026
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 4893bc3 to 69b0e93 Compare February 24, 2026 16:49
danieljvickers
danieljvickers previously approved these changes Feb 24, 2026
@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 24, 2026

@wilfonba wilfonba left a comment


Looks good to me

@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from coderabbitai bot Feb 26, 2026

coderabbitai bot commented Feb 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7658b43 and dca6c2f.

📒 Files selected for processing (1)
  • src/simulation/m_viscous.fpp

📝 Walkthrough

Walkthrough

The changes modify src/simulation/m_viscous.fpp to expand private variable lists within GPU_PARALLEL_LOOP blocks, adding loop variables such as q and i/j/k/l to improve variable scoping in parallel regions. Additionally, several inline comments are corrected for spelling and grammar, including fixes like "diriviatives" to "derivatives," "spacial" to "spatial," and "equaiont" to "equation." The total modifications amount to six lines added and six removed, with no alterations to exported or public entity declarations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Whiskers twitch with giddy delight!
Variables scoped, GPU loops burning bright,
"Derivatives" spelled right, no more "spacial" mix,
Comments all tidied—a grammatical fix!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Description check ❓ Inconclusive: the description includes a detailed explanation of the issue and fix, but lacks explicit completion of required template sections (type-of-change checkbox, testing confirmation, and GPU checklist items). Resolution: complete the required template sections by marking the type of change (Bug fix), confirming the testing method for GPU results, and checking the GPU-specific testing checklist items.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the title accurately summarizes the main change, adding missing private clauses to 3D viscous GPU loops to fix data race conditions.
  • Docstring Coverage ✅ Passed: no functions found in the changed files to evaluate; docstring coverage check skipped.


@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 26, 2026
sbryngelson and others added 5 commits February 28, 2026 14:44
The 3D shear stress and bulk stress GPU parallel loops were missing
rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their
private clauses. The corresponding 2D loops already had these
variables listed. Without privatization, these variables could cause
race conditions on GPU.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add i,j,k,l to the private list for the 3D shear_stress and
bulk_stress GPU parallel loops, matching the pattern already used
by the analogous 2D loops at lines 105 and 215.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The sequential loop iterator q (used in `do q = 1, Re_size(i)`) was not
privatized in any of the four GPU parallel regions. Without explicit
privatization, q is shared across GPU threads on OpenACC and AMD OpenMP
backends, causing a data race in Reynolds number computation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from dca6c2f to 45b31d9 Compare February 28, 2026 19:44
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 28, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 28, 2026
Replace matmul() with explicit 2x2 matrix-vector multiply to avoid
an intermittent CCE 19.0.0 internal compiler error (signal 13 segfault
in optcg) when matmul is used inside a subroutine marked with both
!$acc routine seq and !DIR$ INLINEALWAYS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 1, 2026

Claude Code Review

Head SHA: a2c4d1d

Files changed: 2

  • src/simulation/m_viscous.fpp
  • src/common/m_phase_change.fpp

Summary

  • Adds missing rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, and q to the private clauses of the 3D shear/bulk stress GPU_PARALLEL_LOOP directives in m_viscous.fpp, fixing genuine GPU race conditions in 3D viscous simulations.
  • Adds q to the private clauses of the corresponding 2D loops (lines 102, 212) for consistency and correctness.
  • Replaces a matmul intrinsic call with explicit 2x2 element-wise computation in m_phase_change.fpp.
  • Fixes three comment typos: diriviatives → derivatives, equaiont → equation, spacial → spatial.

Findings

[MINOR] m_phase_change.fpp:580-581 — Undocumented change, needs explanation

The replacement of:

DeltamP = -1.0_wp*(matmul(InvJac, R2D))

with explicit element-wise 2x2 matrix-vector multiplication is mathematically correct but is not mentioned in the PR description. This change likely fixes a GPU-device-code incompatibility (some compilers/backends do not support matmul inside GPU_ROUTINE or GPU kernel contexts), but the motivation should be documented. Since m_phase_change.fpp is in src/common/, this change affects all three executables. Please confirm this is intentional and add a note to the PR description.

[CONFIRM] 3D loop fix is correct and matches 2D

The fix at lines 319 and 428 of m_viscous.fpp aligns the 3D private clause with what the 2D loops already had. The collapse=3 over (l, k, j) is correct for the loop structure. The added variables are genuine per-thread temporaries (computed from per-cell alpha arrays) that must be private to avoid data races on GPU.

[TRIVIAL] Trailing space removed from private clause

At lines 102 and 212, the old clause had alpha_visc_sum ,alpha_visc (space before comma). The fix also cleans this up — no correctness issue.


Overall

The core race-condition fix is correct and well-motivated. The only item worth addressing before merge is documenting the matmul → explicit replacement in m_phase_change.fpp.

The do while condition checked abs(FT) before FT was ever assigned.
Fortran .or. is not short-circuit, so abs(FT) was always evaluated
on the first iteration even though (ns == 0) guaranteed loop entry.

Restructure to do...exit: compute FT first, then check convergence.
Logically identical, no uninitialized reads, no intrinsics, safe on
all GPU device compilers (exit maps to a plain conditional branch).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
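The restructuring described in this commit can be sketched as follows (FT, dFdT, TSat, ns, and ptgalpha_eps are the names used elsewhere in this thread; the Newton body is elided):

```fortran
! Before: Fortran .or. does not short-circuit, so abs(FT) is evaluated
! on the first iteration even though FT has not yet been assigned
do while ((abs(FT) > ptgalpha_eps) .or. (ns == 0))
    ns = ns + 1
    ! ... compute FT and dFdT, then update TSat ...
end do

! After: compute the residual first, then test convergence
do
    ns = ns + 1
    ! ... compute FT and dFdT, then update TSat ...
    if (abs(FT) <= ptgalpha_eps) exit
end do
```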

github-actions bot commented Mar 1, 2026

Claude Code Review

Head SHA: 1daf95b8873ce2ac1652a7ef8ecc857c75508ba5

Files changed: 2

  • src/simulation/m_viscous.fpp (+6/-6)
  • src/common/m_phase_change.fpp (+4/-2)

Summary

  • Adds missing loop indices (i,j,k,l,q) and viscous temporaries (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) to the private clause of 3D GPU parallel loops in m_viscous.fpp — a real GPU race-condition fix.
  • Adds q to the existing private clause of the 2D loops and removes a stray space (cosmetic).
  • Replaces matmul with explicit element-wise 2×2 arithmetic in m_phase_change.fpp.
  • Restructures a do while loop to do … exit in m_phase_change.fpp.
  • Three comment typo fixes (diriviatives → derivatives, spacial → spatial, equaiont → equation).

Findings

[Medium] m_phase_change.fpp changes are undescribed and in src/common/

src/common/m_phase_change.fpp is shared by all three executables (pre_process, simulation, post_process). The PR description covers only m_viscous.fpp. The two m_phase_change changes need rationale and require testing all three targets per project rules.

[Low] matmul → explicit arithmetic needs rationale (src/common/m_phase_change.fpp ~L582)

The arithmetic is correct for a 2×2 system. However, the motivation is unexplained. If this is because matmul is not supported in GPU device code (e.g., inside an OpenACC/OpenMP routine), that should be documented with a comment. If it is a standalone correctness fix, the PR description should say so.

[Low] do whiledo … exit restructuring (src/common/m_phase_change.fpp ~L717)

The semantics are equivalent (the ns == 0 guard in the original only ensured one mandatory iteration, which the new do … exit also guarantees). The change is acceptable but unexplained in the PR body and should be noted.


Improvement Opportunities

  • The 3D private-clause fix is the right approach, but it would be worth a follow-up audit of all other GPU_PARALLEL_LOOP sites in m_viscous.fpp to confirm no other loops are missing loop-index or scalar-temporary privatization.
  • A 3D GPU viscous regression test (e.g., 3D channel flow or Couette flow with Re > 0) should be explicitly called out in the test plan to prevent future regressions of this class of bug.

@sbryngelson sbryngelson marked this pull request as draft March 3, 2026 02:44
@sbryngelson sbryngelson marked this pull request as ready for review March 11, 2026 19:13
@qodo-code-review

Review Summary by Qodo

Fix GPU data races in 3D viscous loops and phase change module

🐞 Bug fix


Walkthroughs

Description
• Fix GPU data races in 3D viscous loops by privatizing loop indices and temporaries
• Add missing q iterator to private clauses in all four GPU parallel regions
• Remove errant space before comma in private clause lists
• Fix typos in comments and add convergence check in phase change module
Diagram
flowchart LR
  A["3D Viscous GPU Loops"] -->|"Add i,j,k,l,q to private"| B["Prevent Race Conditions"]
  C["Phase Change Module"] -->|"Add convergence check"| D["Improve Stability"]
  E["Comment Typos"] -->|"Fix spelling"| F["Code Quality"]
  B --> G["Correct GPU Results"]
  D --> G


File Changes

1. src/common/m_phase_change.fpp 🐞 Bug fix +1/-0

Add convergence check in saturation temperature loop

• Add convergence check if (abs(FT) <= ptgalpha_eps) exit to Newton-Raphson loop for saturation
 temperature calculation
• Prevents infinite loops or excessive iterations when convergence criterion is met

src/common/m_phase_change.fpp


2. src/simulation/m_viscous.fpp 🐞 Bug fix +6/-6

Privatize loop indices and temporaries in GPU viscous loops

• Add missing loop iterator q to private clauses in four GPU parallel regions (lines 106, 216,
 323, 432)
• Remove errant space before comma in private clause lists for consistency
• Add missing loop indices i,j,k,l and viscous temporaries to 3D GPU loops at lines 323 and 432
• Fix typos in comments: "diriviatives" → "derivatives", "equaiont" → "equation", "spacial" →
 "spatial"

src/simulation/m_viscous.fpp




qodo-code-review bot commented Mar 11, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)



Remediation recommended

1. Unvalidated TSat at exit 🐞 Bug ✓ Correctness
Description
In s_TSat, the new exit condition uses FT computed before updating TSat, but executes after
updating TSat, so the loop can exit without ever checking the residual at the returned TSat
value. This also makes the new exit effectively redundant with the `do while ((abs(FT) >
ptgalpha_eps) .or. (ns == 0))` guard.
Code

src/common/m_phase_change.fpp[757]

+                if (abs(FT) <= ptgalpha_eps) exit
Evidence
The loop condition already terminates when abs(FT) <= ptgalpha_eps, and FT is computed prior to
the TSat update. The newly-added exit is placed after the TSat update but still checks the old
FT, meaning convergence is not evaluated at the returned TSat.

src/common/m_phase_change.fpp[736-758]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The Newton iteration in `s_TSat` updates `TSat` and then exits based on `FT` computed for the *previous* `TSat`. This means the returned `TSat` is not validated against `ptgalpha_eps` at the point of exit, and the added `exit` is redundant with the `do while` guard.

### Issue Context
- `FT` is computed using the current `TSat`.
- `TSat` is then updated using `FT/dFdT`.
- The new `exit` checks `abs(FT)` after the update, but `FT` still corresponds to the pre-update state.

### Fix Focus Areas
- src/common/m_phase_change.fpp[736-758]





Claude Code Review

Head SHA: ebbdeb3

Files changed: 2

  • src/simulation/m_viscous.fpp (+6/-6)
  • src/common/m_phase_change.fpp (+1/0)

Summary

  • Fixes GPU race conditions in 3D viscous shear/bulk stress loops by adding missing i,j,k,l,q,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum to private clauses (lines ~319, ~428 in m_viscous.fpp)
  • Also adds missing q to the existing private clauses in the 2D viscous loops (lines ~102, ~212)
  • Adds a convergence check to the Newton iteration in m_phase_change.fpp
  • Fixes comment typos (diriviatives → derivatives, spacial → spatial, equaiont → equation)

Findings

src/simulation/m_viscous.fpp — lines 319, 428 (3D loops): The core fix is correct. The 3D num_dims > 2 loops previously declared only [alpha_visc, alpha_rho_visc, Re_visc, tau_Re] as private, leaving rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum as shared, causing data races on GPU. The new private list now matches the 2D equivalents. Fix looks correct.

src/simulation/m_viscous.fpp — lines 102, 212 (2D loops): Adding q to the existing private clause is correct if q is used as a loop iterator in the body of those loops. Consistent with the iterator-explicit-privatization pattern used throughout the file.

src/common/m_phase_change.fpp — line 757 (Newton iteration early exit):

This adds a convergence check after the Newton update TSat = TSat - Om*FT/dFdT, which is the correct placement. Without this, the loop ran to its maximum iteration count regardless of convergence, wasting work and potentially overshooting.

However, this change is undocumented in the PR description — the summary mentions only viscous GPU fixes. Since m_phase_change.fpp is in src/common/ (shared by all three executables), this change has wider blast radius and affects phase-change simulation results (saturation temperature convergence behavior changes). It should be documented and potentially tested separately. Please confirm:

  1. Is ptgalpha_eps the intended tolerance variable (verify it is in-scope and initialized correctly at the point of this check)?
  2. Were phase-change test cases verified after this change?

No other issues found. The viscous GPU private-clause fixes are correct and well-targeted.


Claude Code Review

Head SHA: ebbdeb3

Files changed: 2

  • src/simulation/m_viscous.fpp
  • src/common/m_phase_change.fpp

Summary

  • Fixes genuine GPU race conditions in 3D viscous (shear + bulk stress) loops by adding missing variables to private clauses
  • Also fixes 2D loops which were missing q from their private clauses
  • Fixes three comment typos (diriviatives → derivatives, spacial → spatial, equaiont → equation)
  • Contains one unrelated change in m_phase_change.fpp not described in the PR

Findings

[CORRECT] src/simulation/m_viscous.fpp lines ~319, ~428 (3D loops)

The 3D shear and bulk stress loops had severely incomplete private clauses — only [alpha_visc, alpha_rho_visc, Re_visc, tau_Re]. This omitted loop indices i,j,k,l,q and per-thread temporaries rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum. These are genuine race conditions on GPU. The fix is correct and consistent with the 2D loops.

[CORRECT] src/simulation/m_viscous.fpp lines ~102, ~212 (2D loops)

The 2D loops already had most variables but were missing q. The fix correctly adds it. Also fixes a spurious space: alpha_visc_sum ,alpha_visc → alpha_visc_sum, alpha_visc.

[CONCERN] src/common/m_phase_change.fpp line ~757

This adds an early-exit convergence check to what appears to be a Newton iteration loop. The change is logically sound (exit when function value is below tolerance), but it is:

  1. Completely unrelated to the GPU private-clause fix described in the PR
  2. Not mentioned anywhere in the PR description or test plan
  3. In src/common/ which affects all three executables

This should either be moved to a separate PR with its own description and test coverage, or explicitly justified in this PR. As-is it's untraceable from the PR description and harder to bisect if it causes issues.


Minor

The PR description says only rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum were missing from 3D loops, but the fix also adds i,j,k,l,q (loop indices). The description is incomplete — loop indices being private is equally important for correctness. Not a code issue, just documentation accuracy.

@sbryngelson sbryngelson marked this pull request as draft March 11, 2026 20:50

Labels

size:XS This PR changes 0-9 lines, ignoring generated files

Development

4 participants