
feat: infer gh CLI permissions for activation job pre-steps #32849

Merged
pelikhan merged 16 commits into main from copilot/fix-compiler-permissions-issue
May 17, 2026

Contributor

Copilot AI commented May 17, 2026

The compiler omitted pull-requests: read from the activation job even when a jobs.activation.pre-step called gh pr diff — causing Vale (and similar tools) to silently lint nothing because gh pr diff failed without the permission.

buildActivationPermissions computed activation job permissions based solely on workflow-level features (reactions, label events, etc.) and never inspected run scripts in user-defined pre-steps.

Changes

  • data/gh_cli_permissions.json — New data file mapping gh subcommand groups (pr, issue, workflow, run, release) and REST API path patterns to required GitHub Actions permissions, following the same pattern as github_toolsets_permissions.json.

  • gh_cli_permissions.go — Loads the JSON at init and exposes:

    • inferPermissionsFromShellScripts([]string) — scans shell scripts for gh invocations via regex, returns minimum required permissions
    • extractRunScriptsFromJobPreSteps(map[string]any, string) — extracts run content from jobs.<name>.pre-steps
  • compiler_activation_job_builder.go: buildActivationPermissions now scans jobs.activation.pre-steps and merges inferred permissions into the activation job's permission map.

  • gh_cli_permissions_test.go — Unit tests for all scanner functions plus an integration test reproducing the exact reported scenario.
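The scan-and-map idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the hard-coded groupPermissions table stands in for data/gh_cli_permissions.json, and both inferReadPermissions and the regular expression are hypothetical names and patterns chosen for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// groupPermissions is a hypothetical stand-in for the mapping the PR
// loads from data/gh_cli_permissions.json (gh group -> Actions scope).
var groupPermissions = map[string]string{
	"pr":       "pull-requests",
	"issue":    "issues",
	"workflow": "actions",
	"run":      "actions",
	"release":  "contents",
}

// ghSubcommandRE matches "gh <group> <action>" at a command position.
// Illustrative pattern only; the PR's regex may differ.
var ghSubcommandRE = regexp.MustCompile(`(?m)(?:^|[\s|;&(])gh\s+([a-z][a-z-]*)\s+([a-z][a-z-]*)\b`)

// inferReadPermissions returns scope -> "read" for every known gh group
// invoked in the given scripts. Unknown groups are ignored.
func inferReadPermissions(scripts []string) map[string]string {
	perms := map[string]string{}
	for _, script := range scripts {
		for _, m := range ghSubcommandRE.FindAllStringSubmatch(script, -1) {
			if scope, ok := groupPermissions[m[1]]; ok {
				perms[scope] = "read"
			}
		}
	}
	return perms
}

func main() {
	script := "gh pr diff \"$PR_NUMBER\" --name-only | awk '/\\.md$/'"
	perms := inferReadPermissions([]string{script})

	// Sort scopes so the output order is stable.
	scopes := make([]string, 0, len(perms))
	for scope := range perms {
		scopes = append(scopes, scope)
	}
	sort.Strings(scopes)
	for _, scope := range scopes {
		fmt.Printf("%s: %s\n", scope, perms[scope])
	}
}
```

Because this sketch keys only on the group, any action under a known group falls back to that group's read scope.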

Example

# This pre-step now causes the compiler to add pull-requests: read
# to the activation job automatically:
jobs:
  activation:
    pre-steps:
      - name: Get changed markdown files
        run: |
          gh pr diff "$PR_NUMBER" --name-only \
            | awk '/\.md$/' \
            > /tmp/gh-aw/docs-review-data/changed-md.txt

Mapping coverage: gh pr * → pull-requests, gh issue * → issues, gh workflow * / gh run * → actions, gh release * → contents, gh api /repos/.../pulls → pull-requests, etc.
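For the gh api case, one way to map a REST path to a scope is an ordered list of path patterns. The sketch below is an assumption about the general shape, not the contents of the JSON file: the rule list, the apiPathRule type, and scopeForAPIPath are hypothetical, and only the pulls → pull-requests pairing is taken from the description above.

```go
package main

import (
	"fmt"
	"regexp"
)

// apiPathRule pairs a REST path pattern with the scope it requires.
// Hypothetical type; the PR stores its patterns in JSON.
type apiPathRule struct {
	pattern *regexp.Regexp
	scope   string
}

// apiPathRules accepts both "/repos/..." and "repos/..." forms.
// The specific rules are illustrative assumptions.
var apiPathRules = []apiPathRule{
	{regexp.MustCompile(`^/?repos/[^/]+/[^/]+/pulls(/|$)`), "pull-requests"},
	{regexp.MustCompile(`^/?repos/[^/]+/[^/]+/issues(/|$)`), "issues"},
	{regexp.MustCompile(`^/?repos/[^/]+/[^/]+/releases(/|$)`), "contents"},
}

// scopeForAPIPath returns the scope of the first rule matching path.
func scopeForAPIPath(path string) (string, bool) {
	for _, rule := range apiPathRules {
		if rule.pattern.MatchString(path) {
			return rule.scope, true
		}
	}
	return "", false
}

func main() {
	for _, p := range []string{"/repos/o/r/pulls/1/files", "repos/o/r/issues", "/user"} {
		scope, ok := scopeForAPIPath(p)
		fmt.Println(p, scope, ok)
	}
}
```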


Caution

agentic threat detected
Threat detection flagged this output in warn mode. Manual review is REQUIRED before any follow-up automation.

Details

Potential security threats were detected in the agent output.

Review the workflow run logs for details.

Generated by 👨‍🍳 PR Sous Chef ·

Copilot AI and others added 2 commits May 17, 2026 17:04
Create a gh CLI command-to-permissions mapping (data/gh_cli_permissions.json
and gh_cli_permissions.go) and integrate it into buildActivationPermissions so
that activation job pre-steps calling `gh pr diff`, `gh issue view`, etc.
automatically get the required GitHub Actions permissions (pull-requests: read,
issues: read, actions: read, etc.) without manual declarations.

Fixes: compiler omitting pull-requests: read from the activation job even when
a pre-step calls `gh pr diff` (as reported for the gh-aw-docs-review workflow).

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Add word boundary \b to ghSubcommandRE to avoid false positives on partial action words
- Add early len(ctx.data.Jobs) > 0 guard in buildActivationPermissions before extracting pre-step scripts
- Clarify TestActivationJobPermissionsNoPreStepChanges: remove pull-requests: read from frontmatter and add comment explaining the activation job computes permissions independently from the main job

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix compiler omitting pull-requests: read in activation job" to "feat: infer gh CLI permissions for activation job pre-steps" May 17, 2026
Copilot AI requested a review from pelikhan May 17, 2026 17:08
@pelikhan
Collaborator

@copilot run "gh help" to scan additional commands available, fill up the mapping.

Attempts to use write APIs should be flagged as errors since permissions: writes are not allowed

Copilot AI and others added 2 commits May 17, 2026 17:19
…activation pre-steps

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 17, 2026

@copilot run "gh help" to scan additional commands available, fill up the mapping.

Attempts to use write APIs should be flagged as errors since permissions: writes are not allowed

Done in commits "feat: expand gh CLI permissions mapping" and "fix: add docs reference to write-command error message".

Expanded mapping: ran gh help for all subcommand groups and classified every subcommand as read or write. Added groups: cache (→ actions), repo (→ contents), label (→ issues). Also reclassified the comment subcommand in both the pr and issue groups from read to write. API path patterns were consolidated to handle both /repos/... and repos/... forms.

Write-command detection: detectWriteCommandsInShellScripts() returns the list of write gh commands found in a set of scripts. buildActivationPermissions now fails compilation with a clear error when any activation pre-step calls a write command:

activation pre-step uses write gh command(s) [gh pr comment]; write operations are not
permitted in activation job pre-steps because the activation job runs with read-only
permissions. Move write operations to the agent job steps or use safe-outputs.
See: https://github.github.com/gh-aw/reference/safe-outputs/
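A rough sketch of how such a detector and the resulting compile error could fit together is shown below. The writeActions table is a hypothetical subset (comment, create, and close appear in this thread as write commands; merge is an assumption), and detectWriteCommands and checkActivationScripts are illustrative names rather than the PR's functions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// writeActions is a hypothetical subset of the read/write classification.
var writeActions = map[string]map[string]bool{
	"pr":    {"comment": true, "create": true, "merge": true, "close": true},
	"issue": {"comment": true, "create": true, "close": true},
}

// ghCmdRE captures the gh group and action. Illustrative pattern only.
var ghCmdRE = regexp.MustCompile(`(?m)(?:^|[\s|;&(])gh\s+([a-z][a-z-]*)\s+([a-z][a-z-]*)\b`)

// detectWriteCommands returns each distinct write command found,
// in order of first appearance.
func detectWriteCommands(scripts []string) []string {
	var found []string
	seen := map[string]bool{}
	for _, script := range scripts {
		for _, m := range ghCmdRE.FindAllStringSubmatch(script, -1) {
			group, action := m[1], m[2]
			if !writeActions[group][action] {
				continue
			}
			cmd := "gh " + group + " " + action
			if !seen[cmd] {
				seen[cmd] = true
				found = append(found, cmd)
			}
		}
	}
	return found
}

// checkActivationScripts turns any detected write command into an error.
func checkActivationScripts(scripts []string) error {
	if cmds := detectWriteCommands(scripts); len(cmds) > 0 {
		return fmt.Errorf("activation pre-step uses write gh command(s) [%s]; "+
			"write operations are not permitted in activation job pre-steps",
			strings.Join(cmds, ", "))
	}
	return nil
}

func main() {
	fmt.Println(checkActivationScripts([]string{`gh pr comment 1 --body "hi"`}))
}
```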

Groups intentionally excluded from the mapping (no meaningful GITHUB_TOKEN scope mapping): secret, variable, project, codespace, gist, search, org, attestation, gpg-key, ssh-key.

@pelikhan
Collaborator

@copilot map every group to app permissions as well since the user can setup github apps too

@pelikhan
Collaborator

@copilot apply detector to agent job and activation job

Copilot AI and others added 3 commits May 17, 2026 17:48
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…uild context

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 17, 2026

@copilot map every group to app permissions as well since the user can setup github apps too

Done across two commits ("feat: map gh CLI groups to GitHub App permissions" and "refactor: dynamic regex from JSON keys, cache pre-step inference in build context").

Mapping — Every group in gh_cli_permissions.json now has app_read_permissions and app_write_permissions fields. Notable additions beyond the existing Actions scopes:

  • workflow write → workflows (GitHub App-only)
  • repo write → administration (GitHub App-only)
  • codespace (new group) → codespaces (GitHub App-only, no GITHUB_TOKEN equivalent)

The API path patterns also gained app_permissions; new patterns added for /repos/.../environments, /repos/.../hooks, /orgs/.../members, /orgs/.../teams, and /orgs/.../hooks.

Runtime use: inferPermissionsFromShellScripts now returns both GitHub Actions and GitHub App-only scopes in the same map. In addActivationFeedbackAndValidationSteps, any App-only scopes inferred from pre-steps are forwarded to buildActivationAppTokenMintStep so the minted App token includes those scopes (e.g., permission-codespaces: read when a pre-step calls gh codespace list).
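To illustrate the forwarding step, here is a small sketch that picks the App-only scopes out of a combined inferred map and renders them as permission-<scope> inputs. The appOnlyScopes set reuses the three scopes named above; appTokenInputs is a hypothetical helper, and how the PR actually passes scopes to the mint step is not shown in this thread.

```go
package main

import (
	"fmt"
	"sort"
)

// appOnlyScopes lists the scopes described above as GitHub App-only.
var appOnlyScopes = map[string]bool{
	"workflows":      true,
	"administration": true,
	"codespaces":     true,
}

// appTokenInputs returns sorted "permission-<scope>: <level>" lines for
// the App-only scopes found in the inferred permission map.
func appTokenInputs(inferred map[string]string) []string {
	var inputs []string
	for scope, level := range inferred {
		if appOnlyScopes[scope] {
			inputs = append(inputs, "permission-"+scope+": "+level)
		}
	}
	sort.Strings(inputs)
	return inputs
}

func main() {
	inferred := map[string]string{"pull-requests": "read", "codespaces": "read"}
	for _, line := range appTokenInputs(inferred) {
		fmt.Println(line)
	}
}
```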

Maintenance — The subcommand regex is now compiled dynamically from the JSON group keys, so adding a new group to the JSON is sufficient — no regex update required.
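Compiling the subcommand regex from the JSON keys might look like the sketch below. The JSON shape (a "scope" field per group), the group type, and buildSubcommandRE are assumptions made for the example; only the idea of deriving the alternation from the group keys comes from the comment above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// mappingJSON is a hypothetical, trimmed-down mapping file.
const mappingJSON = `{
  "pr":        {"scope": "pull-requests"},
  "issue":     {"scope": "issues"},
  "codespace": {"scope": "codespaces"}
}`

type group struct {
	Scope string `json:"scope"`
}

// buildSubcommandRE builds "gh (<key1>|<key2>|...) <action>" from the
// group keys, so a new JSON group needs no regex change.
func buildSubcommandRE(raw string) (*regexp.Regexp, error) {
	var groups map[string]group
	if err := json.Unmarshal([]byte(raw), &groups); err != nil {
		return nil, fmt.Errorf("parse mapping: %w", err)
	}
	if len(groups) == 0 {
		return nil, errors.New("mapping has no groups")
	}
	keys := make([]string, 0, len(groups))
	for key := range groups {
		keys = append(keys, regexp.QuoteMeta(key)) // escape regex metacharacters
	}
	sort.Strings(keys) // deterministic pattern regardless of map order
	return regexp.Compile(`\bgh\s+(` + strings.Join(keys, "|") + `)\s+([a-z][a-z-]*)\b`)
}

func main() {
	re, err := buildSubcommandRE(mappingJSON)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.FindStringSubmatch("gh codespace list"))
}
```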

@pelikhan
Collaborator

@copilot this applies to pre-steps, steps, post-steps, pre-agent-steps...

Copilot AI and others added 2 commits May 17, 2026 18:19
…steps

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…s check, fix test SHA

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 17, 2026

@copilot apply detector to agent job and activation job

Done. The write-command detector and read-permission inference now apply to both the activation job (already in place) and the agent job pre-steps.

Changes in compiler_main_job.go (buildMainJob):

  • Collects run: scripts from both data.PreSteps (top-level pre-steps:) and data.Jobs["agent"]["pre-steps"]
  • Write commands → compile error (same message pattern as activation job)
  • Read commands → inferred permissions merged into the agent job's permissions block
  • Skipped only when the user explicitly set permissions: {} (uses the same exact-string check as tools.go)

Supporting additions: extractRunScriptsFromPreStepsYAML (parses the data.PreSteps YAML string) in gh_cli_permissions.go and mergeInferredIntoPermissionsYAML in permissions_operations.go.
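The merge step can be pictured as combining two scope maps without ever lowering a level the author already declared. The sketch below works on maps rather than on a YAML string, and the rule that a higher level always wins is an assumption; mergeInferred is a hypothetical name, not mergeInferredIntoPermissionsYAML itself.

```go
package main

import "fmt"

// levelRank orders permission levels from weakest to strongest.
var levelRank = map[string]int{"none": 0, "read": 1, "write": 2}

// mergeInferred copies existing and adds inferred scopes, keeping the
// stronger level when both maps mention the same scope.
func mergeInferred(existing, inferred map[string]string) map[string]string {
	merged := make(map[string]string, len(existing)+len(inferred))
	for scope, level := range existing {
		merged[scope] = level
	}
	for scope, level := range inferred {
		if cur, ok := merged[scope]; !ok || levelRank[level] > levelRank[cur] {
			merged[scope] = level
		}
	}
	return merged
}

func main() {
	existing := map[string]string{"contents": "write", "issues": "read"}
	inferred := map[string]string{"contents": "read", "pull-requests": "read"}
	fmt.Println(mergeInferred(existing, inferred))
}
```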

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 17, 2026

@copilot this applies to pre-steps, steps, post-steps, pre-agent-steps...

Done in commit "feat: extend gh CLI scan to all step section types (steps, post-steps, pre-agent-steps)".

What changed:

  • gh_cli_permissions.go — Added generic extractRunScriptsFromSectionYAML(sectionYAML, sectionName string) and extractRunScriptsFromJobSection(jobs, jobName, sectionName string) helpers. The old extractRunScriptsFromPreStepsYAML / extractRunScriptsFromJobPreSteps wrappers are preserved for backward compatibility.

  • Agent job (compiler_main_job.go) — Now scans all four top-level step fields (PreSteps, CustomSteps, PreAgentSteps, PostSteps) plus the corresponding jobs.agent.* sections (pre-steps, steps, pre-agent-steps, post-steps). Write detection and permission inference apply to all of them.

  • Activation job (compiler_activation_job_builder.go) — Now scans jobs.activation.pre-steps, jobs.activation.steps, and jobs.activation.post-steps. pre-agent-steps is intentionally excluded since it is an agent-job-only concept.

15 new tests cover extractRunScriptsFromSectionYAML and extractRunScriptsFromJobSection for all section types, plus integration tests verifying inference and write-detection for steps, post-steps, and pre-agent-steps in both jobs.
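A generic section extractor of the kind described above could be sketched as follows, assuming the jobs frontmatter has already been decoded into nested map[string]any and []any values. The function name runScriptsFromSection is hypothetical and the real helpers may handle more shapes.

```go
package main

import "fmt"

// runScriptsFromSection returns the non-empty "run" strings from
// jobs.<jobName>.<sectionName>. Missing or mistyped nodes yield nil.
func runScriptsFromSection(jobs map[string]any, jobName, sectionName string) []string {
	job, ok := jobs[jobName].(map[string]any)
	if !ok {
		return nil
	}
	steps, ok := job[sectionName].([]any)
	if !ok {
		return nil
	}
	var scripts []string
	for _, raw := range steps {
		step, ok := raw.(map[string]any)
		if !ok {
			continue // skip malformed step entries
		}
		if run, ok := step["run"].(string); ok && run != "" {
			scripts = append(scripts, run)
		}
	}
	return scripts
}

func main() {
	jobs := map[string]any{
		"activation": map[string]any{
			"pre-steps": []any{
				map[string]any{"name": "Get changed markdown files", "run": "gh pr diff \"$PR_NUMBER\" --name-only"},
				map[string]any{"uses": "actions/checkout@v4"}, // no run script
			},
		},
	}
	fmt.Println(runScriptsFromSection(jobs, "activation", "pre-steps"))
}
```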

@pelikhan pelikhan marked this pull request as ready for review May 17, 2026 19:07
Copilot AI review requested due to automatic review settings May 17, 2026 19:07
Contributor

Copilot AI left a comment


Pull request overview

Adds inference of GitHub CLI permissions from workflow step scripts so generated jobs receive read permissions needed by gh commands.

Changes:

  • Adds a JSON mapping and Go scanner for gh subcommands/API paths to permission scopes.
  • Merges inferred permissions into activation and agent job permission handling.
  • Adds unit and integration tests for permission inference and write-command validation.
Summary per file

| File | Description |
| --- | --- |
| pkg/workflow/permissions_operations.go | Adds helper for merging inferred permissions into existing YAML. |
| pkg/workflow/gh_cli_permissions.go | Adds embedded permission data loading, script scanning, and step run-script extraction helpers. |
| pkg/workflow/gh_cli_permissions_test.go | Adds tests for inference, extraction, merge behavior, and compile scenarios. |
| pkg/workflow/data/gh_cli_permissions.json | Defines gh subcommand/API path to permission mappings. |
| pkg/workflow/compiler_main_job.go | Infers and validates gh usage for agent job step sections. |
| pkg/workflow/compiler_activation_job.go | Propagates errors from activation permission building. |
| pkg/workflow/compiler_activation_job_builder.go | Caches activation scripts and merges inferred permissions into activation job permissions/app token scopes. |

Copilot's findings


  • Files reviewed: 7/7 changed files
  • Comments generated: 3

Comment on lines +87 to +91
// Cache scripts from all step sections and inferred permissions once to avoid redundant
// extraction and inference calls in buildActivationPermissions and
// addActivationFeedbackAndValidationSteps.
activationJobName := string(constants.ActivationJobName)
for _, section := range []string{"pre-steps", "steps", "post-steps"} {
Comment thread pkg/workflow/compiler_main_job.go Outdated
Comment on lines +350 to +351
for _, section := range []string{"pre-steps", "steps", "pre-agent-steps", "post-steps"} {
agentAllScripts = append(agentAllScripts, extractRunScriptsFromJobSection(data.Jobs, agentJobName, section)...)
Comment thread pkg/workflow/gh_cli_permissions.go Outdated
Comment on lines +158 to +160
// ghAPIRE matches `gh api <path>` invocations.
// Capture group: (1) API path (up to the first whitespace, pipe, or quote).
var ghAPIRE = regexp.MustCompile(`(?m)(?:^|[\s|;])gh\s+api\s+([^\s|;&"'\\]+)`)
@github-actions
Contributor

@copilot review all comments and address any unresolved review feedback.
Then post a short blocker summary if anything still needs maintainer input.

Generated by 👨‍🍳 PR Sous Chef ·

@github-actions
Contributor

Commit pushed: 56d1068

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · ● 7.2M

@github-actions
Contributor

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (1,683 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

AI has analyzed the PR diff and generated a draft ADR to help you get started:

📄 Draft ADR: docs/adr/32849-infer-gh-cli-permissions-from-step-scripts.md

What to do next

  1. Review the draft ADR committed to your branch — it was generated from the PR diff and captures the decision to introduce a JSON-driven gh CLI permission inference layer scanning activation and agent job step scripts, with a compile-time error path for write commands.
  2. Complete the missing sections — verify the Context, Decision rationale, and Alternatives match your reasoning; add anything the inference missed.
  3. Commit the finalized ADR to docs/adr/ on your branch (a draft already exists at the path above).
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-32849: Infer gh CLI Permissions from Step Scripts in Activation and Agent Jobs

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

Why an ADR for this PR?

This PR introduces three durable architectural commitments worth recording:

  • A new data-driven mapping surface (pkg/workflow/data/gh_cli_permissions.json) that future contributors will need to keep in sync with the upstream gh CLI.
  • A compile-time prohibition on write gh commands in activation and agent jobs, redirecting authors to safe-outputs — a policy that should be documented as a deliberate design choice, not just buried in a Go error message.
  • A regex-based scanner with documented blind spots (xargs, dynamic command construction). Future contributors evaluating "should we replace this with a real shell parser?" need to know the original trade-off was deliberate.
📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 32849-...md for PR #32849).

🔒 This PR cannot merge until an ADR is linked in the PR body.

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · ● 7.2M ·

Contributor

@github-actions github-actions Bot left a comment


Skills-Based Review 🧠

Applied /tdd (new feature with substantial test coverage) and consulted /zoom-out to understand how the new gh_cli_permissions subsystem slots into the compiler pipeline.


Key Themes

1. Broken URL in both error messages: github.github.com is not a real domain. Both activation-job and agent-job compile errors point to a 404. (Inline comment on compiler_activation_job_builder.go.)

2. Dead-code appearance in inferPermissionsFromShellScripts — The write-command fallthrough branches (writeCommands / appWriteCommands) are never reachable in practice because all callers error-out on write commands first via detectWriteCommandsInShellScripts. Either remove the branches or add a test that exercises them directly to document the intent. (Inline comment on gh_cli_permissions.go.)

3. Missing test for unknown-subcommand group-level fallback — When gh pr some-future-cmd is used and the action is not in any known list, the code silently falls back to group-level read permissions. This important behaviour has no test pinning it. (Inline comment on test file.)

4. Fragile permissions: {} string comparison — The exact-string match data.Permissions != "permissions: {}" is a hidden contract between the YAML serialiser and this check. A constant or predicate would make the intent explicit and resilient to serialiser changes. (Inline comment on compiler_main_job.go.)

5. Test naming style — 15+ TestFunc_Case functions instead of table-driven t.Run subtests. Not a blocker, but inconsistent with the project's testing conventions. (Inline comment on test file.)


Positive Highlights

  • Data-driven architecture: Externalising the gh subcommand → permission mapping into gh_cli_permissions.json means adding new subcommands requires zero Go changes — a genuinely deep module design.
  • Caching in context: Storing activationAllScripts and activationInferredPerms on activationJobBuildContext avoids redundant YAML parsing and regex scanning across multiple call sites.
  • Compile-time write-command detection: Surfacing gh pr create / gh issue close in pre-steps as a compile error rather than a silent runtime failure is great DX.
  • Integration tests: TestActivationJobPermissionsWithGhPrDiffPreStep reproduces the exact user-reported scenario end-to-end — this is exactly the right style of regression test.
  • init() panic on bad JSON: Failing fast at startup rather than silently producing wrong permissions is the right trade-off for an embedded config file.

Verdict

The URL typo and the permissions: {} fragility are the two items I'd fix before merge; the others are lower priority. Overall the implementation is solid and the design is extensible.
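One way to make the permissions: {} check less dependent on the serialiser's exact output is a small named predicate. The sketch below only normalises whitespace and is an assumption about what "resilient" could mean here; isExplicitlyEmptyPermissions is a hypothetical name, and a YAML-aware check would be stricter.

```go
package main

import (
	"fmt"
	"strings"
)

// isExplicitlyEmptyPermissions reports whether the serialised block is
// `permissions: {}` once all whitespace is removed.
func isExplicitlyEmptyPermissions(permissionsYAML string) bool {
	compact := strings.Join(strings.Fields(permissionsYAML), "")
	return compact == "permissions:{}"
}

func main() {
	fmt.Println(isExplicitlyEmptyPermissions("permissions: {}"))
	fmt.Println(isExplicitlyEmptyPermissions("permissions:\n  contents: read"))
}
```

Naming the predicate keeps the intent in one place, so call sites in different files do not each repeat the string literal.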

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 9.8M

@github-actions
Contributor

🧪 Test quality analysis by Test Quality Sentinel · ● 14.3M ·

Contributor

@github-actions github-actions Bot left a comment


✅ Test Quality Sentinel: 69/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 63 new tests verify behavioral contracts with no mock library usage and a valid build tag. Minor suggestions: consolidate the 20+ single-assertion TestInferPermissionsFromShellScripts_* tests into a table-driven test, and add assertion messages to ~16 bare assertions.

@pelikhan
Collaborator

@copilot review all comments

@github-actions
Contributor

@copilot review all comments
Please address the unresolved review feedback on this PR.

Generated by 👨‍🍳 PR Sous Chef ·

@github-actions
Contributor

@copilot please link the ADR in the PR body and summarize any remaining blockers.

Generated by 👨‍🍳 PR Sous Chef ·

Copilot AI and others added 2 commits May 17, 2026 19:50
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…erage

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 17, 2026

@copilot review all comments

Addressed all three review comments in commits "fix: restrict gh CLI scanning to actually-injected step sections" and "fix: handle escape sequences in splitShellTokens and improve test coverage":

Review comment 1 (compiler_activation_job_builder.go:87-91) — The activation job now only scans jobs.activation.pre-steps. The loop over ["pre-steps", "steps", "post-steps"] was replaced with a single extractRunScriptsFromJobSection(..., "pre-steps") call, matching what applyBuiltinJobPreSteps actually injects. The two tests that expected jobs.activation.steps to trigger errors/inference were replaced by TestActivationJobStepsNotScanned, which verifies the write command in jobs.activation.steps neither causes a compile error nor appears in the generated lock file.

Review comment 2 (compiler_main_job.go:350-351) — The agent job's jobs.agent.* scan was similarly narrowed: the loop over ["pre-steps", "steps", "pre-agent-steps", "post-steps"] was replaced with a single pre-steps lookup. Top-level frontmatter fields (data.PreSteps, data.CustomSteps, data.PreAgentSteps, data.PostSteps) continue to be fully scanned since they are all applied to the agent job.

Review comment 3 (gh_cli_permissions.go:158-160): ghAPIRE was replaced with extractGHAPIEndpoints + parseGHAPIEndpoint + splitShellTokens. The new approach tokenizes the arguments after gh api, skips flags and their value arguments (using ghAPIValueFlags), and returns the first non-flag token with surrounding quotes stripped. Escape sequences (\", \') are handled correctly. New tests cover gh api -H '...' /endpoint, gh api --method GET /endpoint, and gh api "/endpoint".
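The tokenize-then-skip-flags approach can be sketched like this. The three function names match the ones mentioned above, but the bodies are an independent, simplified guess: valueFlags is an illustrative subset standing in for ghAPIValueFlags, and the tokenizer covers only quotes, backslash escapes, and whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// valueFlags lists gh api flags that consume the following token.
// Illustrative subset standing in for the PR's ghAPIValueFlags.
var valueFlags = map[string]bool{
	"-H": true, "--header": true, "-X": true, "--method": true,
	"-f": true, "--raw-field": true, "-F": true, "--field": true,
	"-q": true, "--jq": true, "--input": true,
}

// splitShellTokens splits on unquoted whitespace, honouring single quotes,
// double quotes, and backslash escapes. Much simpler than a real shell lexer.
func splitShellTokens(s string) []string {
	var tokens []string
	var cur strings.Builder
	var quote rune // active quote character, or 0 outside quotes
	inToken, escaped := false, false
	for _, r := range s {
		switch {
		case escaped:
			escaped = false
			if r != '\n' { // backslash-newline is a line continuation
				cur.WriteRune(r)
				inToken = true
			}
		case r == '\\' && quote != '\'':
			escaped = true
		case quote != 0:
			if r == quote {
				quote = 0
			} else {
				cur.WriteRune(r)
			}
		case r == '\'' || r == '"':
			quote, inToken = r, true
		case r == ' ' || r == '\t' || r == '\n':
			if inToken {
				tokens = append(tokens, cur.String())
				cur.Reset()
				inToken = false
			}
		default:
			cur.WriteRune(r)
			inToken = true
		}
	}
	if inToken {
		tokens = append(tokens, cur.String())
	}
	return tokens
}

// parseGHAPIEndpoint returns the first non-flag argument after gh api.
func parseGHAPIEndpoint(args []string) (string, bool) {
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if arg == "|" || arg == ";" || arg == "&&" || arg == "||" {
			return "", false // command ended before an endpoint appeared
		}
		if valueFlags[arg] {
			i++ // skip the flag's value as well
			continue
		}
		if strings.HasPrefix(arg, "-") {
			continue // boolean flag or --flag=value form
		}
		return arg, true
	}
	return "", false
}

// extractGHAPIEndpoints finds every gh api call and returns its endpoint.
func extractGHAPIEndpoints(script string) []string {
	tokens := splitShellTokens(script)
	var endpoints []string
	for i := 0; i+1 < len(tokens); i++ {
		if tokens[i] == "gh" && tokens[i+1] == "api" {
			if ep, ok := parseGHAPIEndpoint(tokens[i+2:]); ok {
				endpoints = append(endpoints, ep)
			}
		}
	}
	return endpoints
}

func main() {
	fmt.Println(extractGHAPIEndpoints(`gh api -H 'Accept: application/json' "/repos/o/r/pulls"`))
}
```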

@pelikhan
Collaborator

@copilot merge main and recompile

@github-actions
Contributor

Caution

agentic threat detected
Threat detection flagged this output in warn mode. Manual review is REQUIRED before any follow-up automation.

Details

Potential security threats were detected in the agent output.

Review the workflow run logs for details.

@copilot sync with the base branch and rerun checks.
Please address the remaining blockers so this PR is easier to review.

Generated by 👨‍🍳 PR Sous Chef ·

Contributor Author

Copilot AI commented May 17, 2026

@copilot merge main and recompile

Done. Merged main (1bd3ac70) into the branch and recompiled all 229 workflows — no lock file changes were needed (commit 0b874d60).

@pelikhan pelikhan merged commit d2866a6 into main May 17, 2026
@pelikhan pelikhan deleted the copilot/fix-compiler-permissions-issue branch May 17, 2026 20:40
Copilot stopped work on behalf of pelikhan due to an error May 17, 2026 20:40

Development

Successfully merging this pull request may close these issues.

Compiler omits pull-requests: read from activation job despite Vale pre-step using gh pr diff

3 participants