The pr-review-poster was flagging `gtest/gtest.h file not found` on any
PR that added or modified a test file, because clang-tidy-diff-18.py
ran against files that weren't in the compilation database. PR #27004
and PR #26233 both hit this. The root cause is that test TUs only
enter compile_commands.json when BUILD_TESTING is ON, which the
historical clang-tidy build does not enable.
This PR fixes both halves of the problem:
1. Add a second make target `px4_sitl_default-clang-test` that configures
a separate build dir with -DCMAKE_TESTING=ON. Test TUs land in its
compile_commands.json with resolved gtest/fuzztest include paths.
2. Add an umbrella `clang-ci` target that depends on both
`px4_sitl_default-clang` and `px4_sitl_default-clang-test` so the PR
job prepares both build dirs with one make invocation.
3. On PR events the workflow uses `make clang-ci`, installs
libclang-rt-18-dev (needed so fuzztest's FUZZTEST_FUZZING_MODE flags
do not fail the abseil try_compile with a misleading "pthreads not
found" error), and routes the clang-tidy-diff producer at the
test-enabled build dir.
4. Push-to-main is left entirely alone: same single build dir, same
`make px4_sitl_default-clang`, same `make clang-tidy`. Test files
are not in that DB so run-clang-tidy.py keeps ignoring them exactly
as before. This preserves green main while ~189 pre-existing
clang-tidy issues in test files remain untouched; fixing those is
out of scope for this change.
5. Replace the fragile `:!*/test/*` pathspec filter (which missed flat
`*Test.cpp` files in module roots) with
`Tools/ci/clang-tidy-diff-filter.py`, which reads the compilation
database and drops any changed source file that is not a TU.
Headers always pass through. Production code that happens to use
test-like names (src/systemcmds/actuator_test, src/drivers/test_ppm,
etc.) stays analyzed because those are real px4_add_module targets.
Verified in the ghcr.io/px4/px4-dev:v1.17.0-rc2 container and on the
real CI runner:
- cmake configure with CMAKE_TESTING=ON succeeds after installing
libclang-rt-18-dev (Found Threads: TRUE)
- compile_commands.json grows from 1333 to 1521 TUs
- Modifying HysteresisTest.cpp with a new `const char *p = NULL`
correctly flags hicpp-use-nullptr and
clang-diagnostic-unused-variable on the new line, while pre-existing
issues on other lines of the same file stay suppressed by
clang-tidy-diff-18.py's line filter ("Suppressed ... 1 due to line
filter")
- No gtest/gtest.h false positives
- Push-to-main path unchanged, still green
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Replace make quick_check with two explicit build targets:
px4_sitl_default (validates native SITL toolchain) and
px4_fmu-v5_default (validates NuttX cross-compile toolchain).
quick_check built four targets: px4_sitl_test, px4_fmu-v5_default,
tests, and check_format. The tests and check_format targets are
redundant with checks.yml which already runs them on 8cpu RunsOn
with ccache.
The purpose of this workflow is to validate that PX4 builds from a
fresh ubuntu.sh install on both Ubuntu 22.04 and 24.04, not to run
tests or check formatting. Two targeted builds are sufficient.
px4_fmu-v5_default is kept as the hardware target (same as
quick_check) since it builds with the arm-none-eabi-gcc version
that ubuntu.sh installs on both 22.04 and 24.04.
Expected duration drop from 16-17 min to 6-8 min per matrix entry.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The v1.17.0-rc2 container's clang 18 + cmake 3.28 combination fails
abseil's cmake try_compile tests for C++17 and pthreads. This breaks
the fuzztest build which depends on abseil. Verified locally:
- px4io/px4-dev:v1.16.0-rc2 + apt install clang: cmake configure passes
- ghcr.io/px4/px4-dev:v1.17.0-rc2 (clang 18 pre-installed): cmake
configure fails with "ABSL_INTERNAL_AT_LEAST_CXX17 - Failed" and
"Could NOT find Threads"
- apt install clang on v1.17.0-rc2 is a no-op (already installed)
Revert to the old container image which has a working clang+cmake
combination. The apt install clang step (already in the workflow)
installs clang on the old container which doesn't ship it by default.
Remove the explicit fetch-depth: 0 added in the previous fix attempt
since the original workflow used the default depth (1) and it worked.
Fixes#27060
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Branch protection rules block the GITHUB_TOKEN from dismissing reviews
(HTTP 403), so every push added another undismissable REQUEST_CHANGES
review. PR #27004 accumulated 12 identical blocking reviews.
Switch to COMMENT-only reviews. Findings still show inline on the diff
but don't create blocking reviews that require manual maintainer
dismissal. The CI check status (pass/fail) gates merging, not the
review state.
Also enable CMAKE_TESTING=ON in the clang-tidy build so test files get
proper include paths in compile_commands.json. Without this,
clang-tidy-diff runs on test files from the PR diff but can't resolve
gtest headers, producing false positives.
Fixes#27004
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Migrate the fuzzing workflow from GitHub-hosted ubuntu-latest to
RunsOn 4cpu with s3-cache. Bump the container from the stale
px4io/px4-dev:v1.16.0-rc2 to ghcr.io/px4/px4-dev:v1.17.0-rc2.
Wire setup-ccache / save-ccache with cache-key-prefix ccache-sitl
and max-size 300M, sharing the SITL build cache with checks:tests.
Both build px4_sitl_test/px4_sitl_default so the ccache contents
overlap significantly.
Drop the manual apt install clang step since the v1.17.0-rc2
container already ships clang. Replace the git config --global
safe.directory workaround with --system to match the repo convention.
Add runs-on/action@v2 for the S3 cache proxy. Add fetch-depth: 1
since the fuzzer doesn't need git history.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Bump every GitHub Action in the repository to its latest major
version, addressing the upcoming Node.js 20 deprecation. Several
of the old versions (checkout v4, cache v4, setup-node v4,
labeler v5) use the Node 20 runtime which GitHub is deprecating.
The new versions use Node 22.
- actions/checkout v4/v5 to v6
- actions/upload-artifact v4 to v7
- actions/download-artifact v4 to v8
- actions/cache, cache/restore, cache/save v4 to v5
- actions/setup-node v4 to v6
- actions/setup-python v5 to v6
- actions/github-script v7/v8 to v9
- actions/labeler v5 to v6
- peter-evans/find-comment v3 to v4
- dorny/paths-filter v3 to v4
- codecov/codecov-action v4 to v6
- docker/setup-buildx-action v3 to v4
- docker/build-push-action v6 to v7
- tj-actions/changed-files v46 to v47
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Consolidate mavros_mission_tests.yml and mavros_offboard_tests.yml into a
single mavros_tests.yml with a matrix strategy. Switch from docker-in-docker
with px4-dev-ros-melodic to a native container using px4-dev-ros-noetic,
enabling ccache and composite actions (setup-ccache, build-gazebo-sitl,
save-ccache). Migrate all five MAVROS Python test files from Python 2 to
Python 3 (remove six/xrange, from __future__ imports, replace px4tools
with pyulog for estimator analysis). Bump git-auto-commit-action from v4
to v7 in ekf_update_change_indicator.yml.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The stale workflow was hitting its 250 operations-per-run cap every
daily run, causing the "No more operations left! Exiting..." warning
and leaving a growing backlog of stale-labeled items that were never
being closed. GitHub API headroom is plentiful (250 ops uses ~1.6% of
the 15k/hour bucket), so raising to 1500 drains the backlog without
any rate-limit risk.
Also adds workflow_dispatch so maintainers can trigger the workflow
from the Actions tab or via gh workflow run stale.yml.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Port the checks.yml and python_checks.yml improvements from the CI
orchestrator branch (mrpollo/ci_orchestration, PR #26257) without
doing the full T1/T2 split.
checks.yml:
- Drop 5 matrix entries the orchestrator removed:
tests_coverage, px4_fmu-v2_default stack_check,
NO_NINJA_BUILD=1 px4_fmu-v5_default,
NO_NINJA_BUILD=1 px4_sitl_default, px4_sitl_allyes.
- Remove the codecov/codecov-action@v1 step (deprecated, only ran
for the dropped tests_coverage entry).
- Wire the setup-ccache / save-ccache composite actions around
make tests (cache-key-prefix ccache-sitl, max-size 300M) so
repeat runs reuse the SITL build tree. Matches the orchestrator
basic-tests job 1:1.
python_checks.yml:
- Replace the apt-get install python3 + pip install
--break-system-packages + hardcoded $HOME/.local/bin paths with
actions/setup-python@v5 pinned to 3.10 and plain pip install.
- Linters now run from PATH instead of $HOME/.local/bin.
Stacks on top of mrpollo/ci-checkout-hygiene (#27032) which shipped
fail-fast: true, fetch-depth: 1, and the safe.directory step
extraction.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Port checkout hygiene from the CI orchestrator branch
(mrpollo/ci_orchestration) to current workflows without merging the
orchestrator itself.
- checks.yml: enable fail-fast (99% success rate observed, cancel on
first failure saves runner time), switch to fetch-depth 1, extract
safe.directory to its own step
- itcm_check.yml: fetch-depth 1, drop submodules: recursive (the
Makefile bootstraps submodules as a prerequisite of board targets)
- sitl_tests.yml, ros_integration_tests.yml, mavros_mission_tests.yml,
mavros_offboard_tests.yml, python_checks.yml: fetch-depth 1
Each change matches the corresponding job in ci-orchestrator.yml on
mrpollo/ci_orchestration 1:1. Workflows that legitimately need history
(clang-tidy, flash_analysis, failsafe_sim, ros_translation_node,
ekf_*_change_indicator, build_all_targets) are left alone.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Delete the nuttx_env_config workflow. It validated the
PX4_EXTRA_NUTTX_CONFIG env var handling in
platforms/nuttx/NuttX/CMakeLists.txt by building px4_fmu-v5_default
with CONFIG_NSH_LOGIN_PASSWORD injected at configure time.
The CI orchestrator rewrite (mrpollo/ci_orchestration, PR #26257) drops
this workflow entirely. The cmake feature itself remains; only the CI
gate is removed.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Upgrade the RunsOn runner for sitl_tests and ros_integration_tests
from 4cpu-linux-x64 / ubuntu22-full-x64 to 8cpu-linux-x64 /
ubuntu24-full-x64 with extras=s3-cache.
Matches the runner_medium spec used by the sitl-tests and
ros-integration-tests jobs in the CI orchestrator branch
(mrpollo/ci_orchestration). Both jobs are compile-heavy and benefit
from the 2x core count. The ubuntu24 image and s3-cache extras align
with the house style already used by clang-tidy, dev_container,
docs_deploy, docs-orchestrator, and build_deb_package.
No other changes (speed factor unchanged, container images unchanged).
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(pr-review-poster): add line-anchored review poster and migrate clang-tidy
Adds a generic PR review-comment poster as a sibling of the issue-comment
poster from #27021. Replaces platisd/clang-tidy-pr-comments@v1 in the
Static Analysis workflow with an in-tree, fork-friendly producer + poster
pair so fork PRs get inline clang-tidy annotations on the Files changed
tab without trusting a third-party action with a write token.
Architecture mirrors pr-comment-poster: a producer (clang-tidy.yml) runs
inside the px4-dev container and writes a `pr-review` artifact containing
manifest.json and a baked comments.json. A separate workflow_run-triggered
poster runs on ubuntu-latest with the base-repo write token, validates the
artifact, dismisses any stale matching review, and posts a fresh review
on the target PR. The poster never checks out PR code and only ever reads
two opaque JSON files from the artifact.
Stale-review dismissal is restricted to reviews authored by
github-actions[bot] AND whose body contains the producer's marker. A fork
cannot impersonate the bot login or inject the marker into a human
reviewer's body, so the poster can never dismiss a human review. APPROVE
events are explicitly forbidden so a bot cannot approve a pull request.
To avoid duplicating ~120 lines of HTTP plumbing between the two posters,
the GitHub REST helpers (single-request, pagination, error handling) are
extracted into Tools/ci/_github_helpers.py with a small GitHubClient
class. The existing pr-comment-poster.py is refactored to use it; net
change is roughly -80 lines on that script. The shared module is
sparse-checked-out alongside each poster script and is stdlib only.
The clang-tidy producer reuses MIT-licensed translation logic from
platisd/clang-tidy-pr-comments (generate_review_comments,
reorder_diagnostics, get_diff_line_ranges_per_file and helpers) under a
preserved attribution header. The HTTP layer is rewritten on top of
_github_helpers so the producer does not pull in `requests`. Conversation
resolution (the GraphQL path) is intentionally dropped for v1.
clang-tidy.yml now produces the pr-review artifact in the same job as
the build, so the cross-runner compile_commands.json hand-off and
workspace-path rewriting are no longer needed and the
post_clang_tidy_comments job is removed.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(workflows): bump action versions to clear Node 20 deprecation
GitHub has deprecated the Node 20 runtime for Actions as of
September 16, 2026. Bump the pinned action versions in the three poster
workflows to the latest majors, all of which run on Node 24:
actions/checkout v4 -> v6
actions/github-script v7 -> v8
actions/upload-artifact v4 -> v7
No behavior changes on our side: upload-artifact v5/v6/v7 only added an
optional direct-file-upload mode we do not use, and checkout v5/v6 are
runtime-only bumps. The security-invariant comment headers in both
poster workflows are updated to reference the new version so they stay
accurate.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(pr-posters): skip job when producer was not a pull_request event
Both poster workflows previously ran on every workflow_run completion of
their listed producers and then silently no-oped inside the script when
the triggering producer run was a push-to-main (or any other non-PR
event). That made the UI ambiguous: the job was always green, never
showed the reason it did nothing, and looked like a failure whenever
someone clicked in looking for the comment that was never there.
Gate the job at the workflow level on
github.event.workflow_run.event == 'pull_request'. Non-PR producer runs
now surface as a clean "Skipped" entry in the run list, which is
self-explanatory and needs no in-script summary plumbing.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
---------
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Adds documentation for the SITL containers and .deb packages introduced in #26495. The containers are now live on Docker Hub: [`px4io/px4-sitl:latest`](https://hub.docker.com/r/px4io/px4-sitl) and [`px4io/px4-sitl-gazebo:latest`](https://hub.docker.com/r/px4io/px4-sitl-gazebo).
The main addition is a [Try PX4 Simulation](https://docs.px4.io/main/en/dev_setup/try_px4) page that leads with a single `docker run` command and gets someone flying in under a minute. It lives in Getting Started, right after Recommended Hardware/Setup, so it's one of the first things new users see.
The existing `.deb` package reference has been moved from `packaging/px4_sitl_deb.md` to `simulation/px4_sitl.md` and expanded to cover both containers and `.deb` packages on one page. Sections are ordered by how people use them: what's available, install, configure, connect QGC/MAVSDK, connect ROS 2.
Other changes:
- README now has a "Try PX4" section with the docker one-liner above "Build from Source"
- Landing page (`index.md`) reworked to lead with "Try PX4" before "For Developers"
- Toolchain page (`dev_env.md`) gets a tip redirecting simulation-only users to pre-built packages
- `getting_started.md` and `SUMMARY.md` updated with links to the new pages
- Simulation index tip updated to mention containers alongside `.deb` packages
The SIH container image is published as `px4io/px4-sitl` (renamed from `px4io/px4-sitl-sih`) so the default lightweight option carries the simplest name. The Gazebo image remains `px4io/px4-sitl-gazebo`.
Also upgrades all GitHub Actions in the SITL workflow to Node.js 24 compatible versions (`actions/checkout@v6`, `actions/cache@v5`, `actions/upload-artifact@v7`, `actions/download-artifact@v8`, `docker/setup-buildx-action@v4`, `docker/build-push-action@v7`) to fix the Node.js 20 deprecation warning ahead of the June 2026 deadline.
---------
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Co-authored-by: Hamish Willee <hamishwillee@gmail.com>
Adds a stand-alone workflow that posts or updates sticky PR comments on
behalf of any analysis workflow, including those triggered by fork PRs.
The poster runs on `workflow_run` in the base repo context, which is the
standard GitHub-sanctioned way to get a write token on events that
originate from untrusted forks without ever checking out fork code.
All validation, GitHub API interaction, and upsert logic lives in
Tools/ci/pr-comment-poster.py (Python 3 stdlib only, two subcommands:
`validate` and `post`). The workflow file itself is a thin orchestrator:
sparse-checkout the script, download the pr-comment artifact via
github-script, unzip, then invoke the script twice. No inline jq, no
inline bash validation, no shell-interpolated marker strings. The
sparse-checkout ensures only Tools/ci/pr-comment-poster.py lands in the
workspace, never the rest of the repo.
Artifact contract: a producer uploads an artifact named exactly
`pr-comment` containing `manifest.json` (with `pr_number`, `marker`, and
optional `mode`) and `body.md`. The script validates the manifest
(positive integer pr_number, printable-ASCII marker bounded 1..200
chars, UTF-8 body under 60000 bytes, mode in an allowlist), finds any
existing comment containing the marker via the comments REST API, and
either edits it in place or creates a new one.
The workflow file header documents six security invariants that any
future change MUST preserve, most importantly: NEVER check out PR code,
NEVER execute anything from the artifact, and treat all artifact
contents as opaque data.
Why a generic poster and not `pull_request_target`: `pull_request_target`
is the tool people reach for first and the one that most often turns
into a supply-chain vulnerability, because it hands a write token to a
workflow that is then tempted to check out the PR head. `workflow_run`
gives the same write token without any check-out temptation, because
the only input is a pre-produced artifact treated as opaque data.
Producer migrations
===================
flash_analysis.yml:
- Drop the fork gate on the `post_pr_comment` job.
- Drop the obsolete TODO pointing at issue #24408 (the fork-comment
workflow does not error anymore; it just no-ops).
- Keep the existing "comment only if threshold crossed or previous
comment exists" behaviour verbatim. peter-evans/find-comment@v3
stays as a read-only probe (forks can read issue comments just fine);
its body-includes is updated to search for the new marker
`<!-- pr-comment-poster:flash-analysis -->` instead of the old
"FLASH Analysis" heading substring.
- Replace the peter-evans/create-or-update-comment@v4 step with two
new steps that write pr-comment/manifest.json and pr-comment/body.md
and then upload them as artifact pr-comment. The body markdown is
byte-for-byte identical to the previous heredoc, with the marker
prepended as the first line so subsequent runs can find it.
- The threshold-or-existing-comment gate is preserved on both new
steps. When the gate does not fire no artifact is uploaded and the
poster no-ops.
docs-orchestrator.yml (link-check job):
- Drop the fork gate on the sticky-comment step.
- Replace marocchino/sticky-pull-request-comment@v2 with two new steps
that copy logs/filtered-link-check-results.md into pr-comment/body.md,
write a pr-comment/manifest.json with the marker
`<!-- pr-comment-poster:docs-link-check -->`, and upload the directory
as artifact pr-comment.
- The prepare step checks `test -s` on the results file and emits a
prepared step output; the upload step is gated on that output. In
practice the existing link-check step always writes a placeholder
("No broken links found in changed files.") into the file when empty,
so the guard is defensive but not load-bearing today.
- Tighten the link-check job's permissions from `pull-requests: write`
down to `contents: read`; writing PR comments now happens in the
poster workflow.
The poster's workflows allowlist is seeded with the two active
producers: "FLASH usage analysis" and "Docs - Orchestrator".
clang-tidy (workflow name "Static Analysis") is not in the list because
platisd/clang-tidy-pr-comments posts line-level review comments, a
different REST API from issue comments that the poster script does not
handle. Extending the poster to cover review comments is a follow-up.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Switch the Static Analysis workflow to two modes:
- Push to main: run the full "make clang-tidy" target as before.
- Pull request: build the clang compile database with
"make px4_sitl_default-clang", then call Tools/ci/run-clang-tidy-pr.py
(already in-tree) to compute the translation units actually affected
by the PR diff and run clang-tidy only on that subset. PRs that touch
no C++ files exit silently; the large majority of PRs will skip the
slow full analysis entirely.
Replace the inline ccache restore/config/save steps with the composite
actions from .github/actions/setup-ccache and .github/actions/save-ccache,
which use content-hash cache keys (prefix-ref-sha with ref and base_ref
fallbacks), compression, and compiler_check=content. Same 120M cap.
Add a second job, post_clang_tidy_comments, that runs on a GitHub-hosted
runner when the analysis job reports has_findings=true. It downloads the
compile_commands.json artifact produced by the analysis job, rewrites
the AWS RunsOn workspace prefix (/__w/PX4-Autopilot/PX4-Autopilot) to the
GitHub-hosted runner workspace so clang-tidy can chdir into the build
directory, runs clang-tidy-diff-18 to export fixes, and posts inline
review annotations via platisd/clang-tidy-pr-comments@v1.
Annotations are set to request changes (request_changes: true), so a PR
with new clang-tidy findings will be blocked until they are addressed or
waived. suggestions_per_comment is capped at 10. Annotations are gated
to same-repo PRs only; forks skip the annotation job because GITHUB_TOKEN
has no write access there.
The post_clang_tidy_comments job uses if: always() && ... so it runs
whether the analysis job succeeded or failed (findings still need to be
surfaced when the analysis exits non-zero).
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Standardize on the GitHub Container Registry copy of px4-dev:v1.17.0-rc2
across workflows still pulling the old dockerhub v1.16.0-rc1 image, and
move the workflows that were already on v1.17.0-beta1 from docker.io to
ghcr.io so the whole repo pulls from one registry at the same version.
Also modernize the "git ownership workaround" in the touched workflows
that still used `git config --global --add safe.directory "$GITHUB_WORKSPACE"`
to the `--system --add safe.directory '*'` form already in use by
clang-tidy, flash_analysis, failsafe_sim, itcm_check, and docs-orchestrator.
Updated workflows:
- checks.yml
- clang-tidy.yml (was on v1.17.0-beta1, now on rc2)
- docs-orchestrator.yml (was on v1.17.0-beta1, two jobs)
- ekf_functional_change_indicator.yml
- ekf_update_change_indicator.yml
- failsafe_sim.yml
- flash_analysis.yml
- itcm_check.yml
- nuttx_env_config.yml
Deliberately out of scope for this PR and deferred to focused follow-ups:
- fetch-depth: 0 to 1 (firmware builds and flash_analysis base-ref
checkout need git history)
- PX4_SBOM_DISABLE removal in checks.yml (behavioral change)
- fail-fast: false to true (behavioral change)
- codecov-action upgrade
No other workflows touched. compile_ubuntu.yml, ros_integration_tests.yml,
sitl_tests.yml, mavros_*_tests.yml, fuzzing.yml, build_deb_package.yml,
dev_container.yml all use different image families or serve different
purposes and are not part of this sweep.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add four reusable building blocks that upcoming CI optimization PRs will
consume. No existing workflow is modified; these files are dormant until
referenced.
- .github/actions/setup-ccache: restore ~/.ccache with content-hash keys,
write ccache.conf with compression and content-based compiler check
- .github/actions/save-ccache: print stats and save the cache under the
primary key produced by setup-ccache
- .github/actions/build-gazebo-sitl: build px4_sitl_default plus the
Gazebo Classic plugins with ccache stats between stages
- Tools/ci/run-clang-tidy-pr.py: compute the translation units affected
by a PR diff and invoke Tools/run-clang-tidy.py on that subset only,
exiting silently when no C++ files changed
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The previous logic used GITHUB_HEAD_REF, which on a pull request is
the source (PR author's) branch name. For backport PRs (e.g.
mrpollo/backport-26781-1.17), no matching branch exists in
px4-ros2-interface-lib, so the script fell back to main and the
build broke from uORB message divergence.
Switch to GITHUB_BASE_REF, which on a PR is the branch the code is
being merged into (main or release/X.Y), and fall back to
GITHUB_REF_NAME for direct pushes. This always resolves to a real
branch in px4-ros2-interface-lib.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add 'if: startsWith(runner.name, "runs-on--")' to the mirror swap step
in both workflows so fork users can see at a glance that the step only
fires on runs-on runners and is a no-op on standard GitHub-hosted
runners. The script keeps its internal RUNS_ON_AWS_REGION check as
defense in depth for callers outside these workflows.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The mirror swap was duplicated across two workflows. Move it into
Tools/ci/use_aws_apt_mirror.sh and call the script from each workflow
after checkout but before any heavy apt work like Tools/setup/ubuntu.sh.
The script no-ops outside runs-on (RUNS_ON_AWS_REGION unset), so it is
safe to call from forks, self-hosted runners, or local container runs
without changing behavior there. The region is read from the runs-on
environment instead of being hardcoded, so future region changes only
need updating where the runner is provisioned.
The bootstrap 'apt install git' step keeps the default mirror because
git is one package and is unlikely to hit the dep11 desync issue that
broke ubuntu.sh.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The compile_ubuntu workflow's apt operations talk directly to
archive.ubuntu.com, which round-robins across community mirrors that
occasionally serve out-of-sync index files mid-sync and break apt update
for everyone until the upstream catches up.
Apply the same mirror swap as build_deb_package.yml: rewrite the
container's apt sources to point at us-west-2.ec2.archive.ubuntu.com
before any apt operation runs, so both the inline 'apt update' and the
later Tools/setup/ubuntu.sh call benefit from the regional mirror.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The SIH image is the canonical PX4 SITL container, so drop the redundant
-sih suffix and publish it as px4io/px4-sitl. Gazebo continues to publish
as px4io/px4-sitl-gazebo.
Decouples the published image name from the matrix.image identifier by
introducing a matrix.repo field, so renames like this don't require
touching the matrix logic.
This is a breaking change for anyone pulling px4io/px4-sitl-sih directly;
the old tags remain available but no new ones will be published under
that name.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The default archive.ubuntu.com round-robin can serve out-of-sync index
files mid-sync, which makes apt-get update fail with 'File has unexpected
size' errors and breaks the deb build job for everyone until the upstream
mirror catches up.
Rewrite the container's apt sources to point at us-west-2.ec2.archive.
ubuntu.com instead. The EC2 archive mirrors are Canonical-operated,
region-local to the runs-on instances, and sync aggressively, eliminating
the round-robin lottery as a CI failure mode.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
The current workflow_dispatch path builds whatever HEAD of the dispatch ref
is, labels the resulting image with px4_version, and publishes. That's
fine for rebuilding current state but it cannot rebuild the exact commit
a release tag points to, because the dispatch loads the workflow file
from one ref and implicitly checks out the same ref for the build.
This matters for release recovery. When the v1.17.0-rc2 tag push failed
to publish containers back on 2026-03-13 (the v1 GHA cache protocol
removal in RunsOn v2.12.0), the tag was not re-pushed, so the only way
to publish rc2 containers now is via workflow_dispatch. Without this
change, a dispatch against release/1.17 builds release/1.17 HEAD and
labels it v1.17.0-rc2, which produces a container whose contents do not
match the rc2 tag's actual code. That is not a faithful recovery.
Add a build_ref input that controls only the checkout ref, defaulting
to empty which falls back to github.ref (preserving current behavior
for both push events and dispatches that omit the input). With this,
a release recovery looks like:
gh workflow run dev_container.yml --repo PX4/PX4-Autopilot \
--ref release/1.17 \
-f px4_version=v1.17.0-rc2 \
-f build_ref=v1.17.0-rc2 \
-f deploy_to_registry=true
The workflow loads from release/1.17 HEAD (which has the cache fix
from 39b0568 and the hardening from d74db56a), but the build uses
Tools/setup/Dockerfile from the rc2 tag. The published image has
rc2 contents under the rc2 label, as if the original tag push had
worked.
All three actions/checkout steps (setup, build, deploy) take the same
ref expression so every job sees a consistent workspace. Non-dispatch
events (push, PR) evaluate github.event.inputs.build_ref to empty and
fall back to github.ref exactly as before.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Drops --upload from the ROS integration test runner so CI runs no
longer publish ULogs to the public logs.px4.io server on every run.
Failure debugging is unaffected: the existing Upload failed logs step
already captures logs as GitHub Actions artifacts on failure.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Three related fixes to prevent a repeat of the v1.17.0-rc2 incident, where a
post-push GHA cache-export 404 failed the arm64 build after both registry
pushes had already succeeded, fail-fast cancelled amd64, and the deploy job
was skipped, leaving the registries with only a partial arm64 publish and no
multi-arch manifest.
- Mark cache export as non-fatal via ignore-error=true on cache-to. A
successful registry push should never be undone by a cache-layer flake.
This alone would have let rc2 publish correctly.
- Decouple the deploy job from the build job's exit code. Change its if:
gate to !cancelled() + setup success only, and promote the existing
"Verify Images Exist Before Creating Manifest" step from a warning into
a hard precondition. Deploy now runs whenever both per-arch tags actually
exist in the registries, which is its real precondition, and fails loudly
if a tag is missing.
- Bump every action to the current major (runs-on/action v2,
actions/checkout v5, docker/login-action v4, docker/setup-buildx-action v4,
docker/build-push-action v7, docker/metadata-action v6). This gets the
workflow off Node 20 before GitHub's June 2 2026 forced runtime switch
and keeps runs-on/action on the same major as the runs-on platform.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add :latest tag alongside version tags for per-arch images and
multi-arch manifests on both Docker Hub and GHCR.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add cmake/cpack infrastructure for building .deb packages from
px4_sitl_sih and px4_sitl_default targets. Includes install rules,
package scripts, Gazebo wrapper, and CI workflow.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Three issues caused the monthly audit to report already-resolved submodules:
1. The audit workflow grepped for "NOASSERTION" anywhere in the output,
matching the Detected column even when the Final column had a valid
override (e.g. libtomcrypt detected as NOASSERTION but overridden to
Unlicense). Changed to grep for "<-- UNRESOLVED" marker instead.
2. Submodules with an explicit NOASSERTION override in license-overrides.yaml
(like libfc-sensor-api, which is proprietary) were still counted as
failures. Now treated as "acknowledged" since someone intentionally
added the override entry.
3. Added missing BSD-3-Clause override for sitl_gazebo-classic (PX4 org
project with no LICENSE file in repo).
Fixes#26932
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(claude): add review-pr skill for domain-aware PR reviews
Add a Claude Code skill that reviews pull requests with checks
tailored to the domains touched (estimation, control, drivers,
simulation, system, CI/build, messages, board additions).
Built from analysis of 800+ PR reviews across 8 PX4 maintainers.
Includes merge strategy recommendation, interactive dialog for
submitting reviews, and human-sounding PR comment formatting.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(copilot): add domain-scoped review instructions for GitHub Copilot
Add .github/instructions/ files that give GitHub Copilot PR reviews
the same domain-aware context as the Claude Code review-pr skill.
Each file is scoped via applyTo to the relevant source paths:
core review, estimation, control, drivers/CAN, simulation, system,
CI/build, messages/protocol, and board additions.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* fix(claude): address Copilot review feedback
- Fix step reference in review-pr skill (step 8 -> step 9)
- Capitalize CMake consistently in skill and Copilot instructions
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
---------
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add a scalable .deb packaging framework for VOXL2, built on the
existing cmake/package.cmake CPack infrastructure. The framework
handles multi-processor boards by having the POSIX (_default) build
own the .deb and pull in the companion SLPI build's artifacts.
Board-specific files:
- cmake/package.cmake: CPack variable overrides (name, deps, version)
- cmake/install.cmake: install() rules for all .deb contents
- debian/postinst: px4-* symlinks, DSP signature, directory setup
- debian/prerm: service stop, symlink cleanup
- debian/voxl-px4.service: systemd unit (after sscrpcd)
Infrastructure changes:
- cmake/package.cmake: hook for board-specific CPack overrides
- platforms/posix/CMakeLists.txt: hook for board install.cmake
- Makefile: %_deb pattern rule (build _default, then cpack -G DEB)
- CI: auto-discover _deb targets, collect .deb artifacts, upload
to GitHub Releases
Future boards: add cmake/package.cmake + cmake/install.cmake and
CI discovers it automatically. No new file formats or tools needed.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(ros): use matching branch for px4-ros2-interface-lib
When running on release branches, the ROS integration tests now
check if a matching branch exists in px4-ros2-interface-lib and
clone it instead of always using main. This prevents build failures
caused by uORB message divergence between main and release branches.
Fixes https://github.com/Auterion/px4-ros2-interface-lib/issues/184
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(ros): dispatch release branch creation to px4-ros2-interface-lib
Add a standalone workflow triggered by the create event that fires a
repository_dispatch to Auterion/px4-ros2-interface-lib when a
release/X.Y branch is created. Also supports manual workflow_dispatch.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
* ci(ros): add empty permissions block to dispatch workflow
Fixes code scanning alert about missing GITHUB_TOKEN permissions.
This workflow only uses a PAT secret, not GITHUB_TOKEN, so no
permissions are needed.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
---------
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
RunsOn v2.12.0 (March 6, 2026) removed v1 cache toolkit support,
causing the buildx GHA cache proxy to return 404 for v1 endpoints.
This has broken container builds on main since March 12.
Removing the explicit version=1 parameter lets buildkit auto-detect
the v2 protocol, which is the only version now supported by both
GitHub (since April 2025) and RunsOn.
First build after this change will have a cold cache.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Add CI enforcement of conventional commit format for PR titles and
commit messages. Includes three Python scripts under Tools/ci/:
- conventional_commits.py: shared parsing/validation library
- check_pr_title.py: validates PR title format, suggests fixes
- check_commit_messages.py: checks commits for blocking errors
(fixup/squash/WIP leftovers) and advisory warnings (review-response,
formatter-only commits)
The workflow (.github/workflows/commit_checks.yml) posts concise
GitHub PR comments with actionable suggestions and auto-removes them
once issues are resolved.
Also updates CONTRIBUTING.md and docs with the conventional commits
convention.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Persistent flaky failures (timeouts, erratic transitions) make these
tests unreliable in CI. Commented out from the workflow matrix so they
can be re-enabled once the test infrastructure is stabilized. The test
definitions in sitl.json are preserved for local use.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Remove the step that uploaded every version tag to the stable/ S3
directory, which caused QGC users selecting "stable" to receive
pre-release firmware (#26340). The stable/ and beta/ directories
are now controlled exclusively by their respective branch pushes,
while version tags only upload to their versioned archive directory
(e.g., v1.16.1/). Pre-release tags are also correctly marked on
GitHub Releases.
Co-authored-by: Julian Oes <julian@oes.ch>
Fixes#26340
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Rename workflow to "Static Analysis" with job name "Clang-Tidy" for
clearer GitHub Checks UI. Use Title Case action-verb step names.
Switch from runs-on/cache to actions/cache since the runs-on Magic
Cache sidecar transparently handles S3 backing.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
github.ref_name resolves to '26367/merge' for pull_request events,
causing cache misses. Use github.head_ref (PR source branch) with
fallback to github.ref_name for push events.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Use separate cache/restore and cache/save steps with if: always()
on the save step, matching the build_all_targets pattern.
Signed-off-by: Ramon Roche <mrpollo@gmail.com>
- Switch from addnab/docker-run-action to native container directive
- Use runs-on 16-core runner with S3 cache (extras=s3-cache)
- Add ccache setup matching build_all_targets pattern
- Run clang-tidy with -j16 to leverage all cores
Signed-off-by: Ramon Roche <mrpollo@gmail.com>