Add LAVA artifact upload via axum HTTP server (rework of #17) #116
Conversation
- Add axum = "0.8" dependency; enable full tokio features
- New src/upload.rs: UploadServer, UploadStore, JobArtifacts (RAII-based,
auto-deregisters on drop, no global static, no periodic cleanup)
- Add runner.ARTIFACT_UPLOAD_URL template variable via TransformVariables.runner
- Add Artifact{path, data} to LavaUploadableFileType (path only, no id prefix)
- Run struct gains upload_server and artifacts fields
- transform() takes upload_url param; submit command creates JobArtifacts
- get_uploadable_files() includes artifacts from submit jobs
- upload_artifact axum handler uses State<Arc<Mutex<UploadStore>>>
- main() starts server only if LAVA_ARTIFACT_UPLOAD_BASE_URL is set
- Single env var replaces separate host/port; uses TcpListener::bind + axum::serve
Agent-Logs-Url: https://github.com/collabora/lava-gitlab-runner/sessions/831d8c7f-5dc6-47d2-82c1-7e79eed0c6bf
Co-authored-by: sjoerdsimons <22603932+sjoerdsimons@users.noreply.github.com>
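The RAII deregistration summarized in the change list above can be sketched as follows. This is a minimal model, not the actual implementation: the names `UploadStore`, `JobArtifacts`, and `add_new_job` are taken from the summary, but the field types here are simplified stand-ins.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

// Simplified stand-in for the real store: key -> uploaded (path, data) pairs.
#[derive(Default)]
struct UploadStore {
    jobs: BTreeMap<String, Vec<(String, Vec<u8>)>>,
}

// RAII handle held by the job's Run; owning it keeps the upload key alive.
struct JobArtifacts {
    key: String,
    store: Arc<Mutex<UploadStore>>,
}

// Dropping the handle removes the job key, so no global static or periodic
// cleanup task is needed.
impl Drop for JobArtifacts {
    fn drop(&mut self) {
        self.store.lock().unwrap().jobs.remove(&self.key);
    }
}

fn add_new_job(store: &Arc<Mutex<UploadStore>>, key: &str) -> JobArtifacts {
    store.lock().unwrap().jobs.insert(key.to_owned(), Vec::new());
    JobArtifacts {
        key: key.to_owned(),
        store: Arc::clone(store),
    }
}
```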
@copilot please fix CI failures
Pull request overview
Adds an optional embedded HTTP upload endpoint so LAVA “submit” jobs can POST back additional artifact files, which the runner then includes in the GitLab artifact archive.
Changes:
- Introduce `UploadServer`/`UploadStore`/`JobArtifacts` (RAII) to register per-job upload keys and collect uploaded bytes.
- Extend template variables with `runner.ARTIFACT_UPLOAD_URL` and include uploaded artifacts in `get_uploadable_files()`.
- Start an axum server when `LAVA_ARTIFACT_UPLOAD_BASE_URL` is set; bind address controlled via `LAVA_ARTIFACT_UPLOAD_LISTEN_ADDR`.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `Cargo.toml` | Enables full Tokio features and adds the axum dependency for the HTTP server. |
| `Cargo.lock` | Locks the new dependency graph introduced by the axum/tokio feature changes. |
| `src/upload.rs` | Implements per-job upload registration + in-memory artifact collection via a shared store. |
| `src/main.rs` | Wires the upload URL into templating, collects uploaded artifacts, and spawns the axum server. |
```rust
/// Called by the axum handler to store an uploaded file.
pub fn upload_file(&mut self, key: &str, path: &str, data: Bytes) {
    if let Some(ja) = self.jobs.get(key) {
        ja.lock().unwrap().upload_artifact(path, data);
    } else {
```
`upload_file()` accepts an arbitrary `path` from the HTTP route and stores it verbatim. This allows `..`, absolute paths, path separators, and name collisions (e.g., overwriting `*_log.yaml`) inside the artifact archive. Validate/sanitize `path` to a single safe filename (and/or a constrained relative path) before storing.
@copilot apply changes based on this feedback -- artifact overwrites are fine, but paths should be sanitized or even enforced to be normalized
Fixed in a9c59c3. Added `sanitize_artifact_path()`, which iterates `std::path::Path::components()` and rejects any `..`, absolute/prefix/root components, and empty results. Only `Normal` components are kept (redundant `.` components are skipped), then joined with `/`.
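A self-contained sketch of the sanitization just described, under the assumption that the real `sanitize_artifact_path()` in `src/upload.rs` follows the same component-by-component rules (the actual signature may differ):

```rust
use std::path::{Component, Path};

/// Keep only Normal components, skip `.`, and reject `..`,
/// absolute/root/prefix components, and empty results.
fn sanitize_artifact_path(path: &str) -> Option<String> {
    let mut parts = Vec::new();
    for component in Path::new(path).components() {
        match component {
            Component::Normal(part) => parts.push(part.to_str()?.to_owned()),
            // Redundant `.` components are simply dropped.
            Component::CurDir => {}
            // `..`, leading `/`, and Windows prefixes are all rejected.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("/"))
    }
}
```

Rejecting (rather than stripping) `..` keeps the behavior predictable: a client that sends a traversal attempt gets an error instead of a silently relocated file.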
src/main.rs (outdated)

```rust
async fn upload_artifact(
    State(store): State<Arc<Mutex<UploadStore>>>,
    Path((key, path)): Path<(String, String)>,
    body: Bytes,
) {
```
Uploads are fully buffered into memory (`body: Bytes`) and then retained in a per-job map with no size/count limits. A malicious or misconfigured job can OOM the runner by POSTing large/many artifacts. Add request body limits (e.g., axum's `DefaultBodyLimit` / a tower limit layer) and enforce a per-job/per-runner quota (or stream artifacts to disk).
@copilot apply changes based on this feedback. This should use a cache for bigger files, but also enforce an upload limit per job. IIRC the gitlab-runner crate already does that for artifact downloads.
Fixed in a9c59c3. Files >= 1 MB are spilled to a `NamedTempFile` (via tempfile); smaller files stay as `Bytes` in memory. `JobArtifactsInner` tracks `total_bytes` (accounting for overwrites by subtracting the old size before adding the new) and rejects uploads that would exceed 1 GB per job. A `DefaultBodyLimit::max(ARTIFACT_JOB_LIMIT)` is also applied to the axum router to enforce the limit at the HTTP layer.
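The overwrite-aware accounting described here can be modeled in isolation. This is a rough sketch — `JobArtifactsInner` and `ARTIFACT_JOB_LIMIT` are named as in the PR, but the actual struct also stores the file contents and the memory/disk spill logic, which is elided here:

```rust
use std::collections::HashMap;

const ARTIFACT_JOB_LIMIT: usize = 1024 * 1024 * 1024; // 1 GB per job

#[derive(Default)]
struct JobArtifactsInner {
    files: HashMap<String, usize>, // path -> stored size (data elided)
    total_bytes: usize,
}

impl JobArtifactsInner {
    /// Returns false when the upload would push the job past the limit.
    /// Overwrites subtract the old file's size first, so re-uploading the
    /// same path cannot inflate the running total.
    fn try_store(&mut self, path: &str, len: usize) -> bool {
        let old = self.files.get(path).copied().unwrap_or(0);
        if self.total_bytes - old + len > ARTIFACT_JOB_LIMIT {
            return false;
        }
        self.total_bytes = self.total_bytes - old + len;
        self.files.insert(path.to_owned(), len);
        true
    }
}
```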
```rust
async fn upload_artifact(
    State(store): State<Arc<Mutex<UploadStore>>>,
    Path((key, path)): Path<(String, String)>,
    body: Bytes,
) {
    store.lock().unwrap().upload_file(&key, &path, body);
}
```
The upload handler always returns 200 OK, and `UploadStore::upload_file` silently ignores unknown job keys. This can make artifact loss hard to detect/debug. Consider returning an explicit status (e.g., 404 for an unknown key, 400 for an invalid path) and logging the key/path together for traceability.
Fixed in a9c59c3. `upload_artifact` now returns `impl IntoResponse` with: 200 OK on success, 400 Bad Request for an invalid/unsafe path, 404 Not Found for an unknown job key, 413 Payload Too Large when the per-job limit is exceeded, and 500 Internal Server Error on I/O errors.
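The outcome-to-status mapping can be stated as a small table. The sketch below is a dependency-free model — the real handler returns axum's `StatusCode`, and the `UploadOutcome` enum here is an assumed name for illustration:

```rust
// Possible results of handling an artifact upload, per the fix above.
enum UploadOutcome {
    Stored,      // artifact accepted and recorded for the job
    InvalidPath, // sanitize_artifact_path() rejected the path
    UnknownJob,  // no registered JobArtifacts for this key
    OverLimit,   // per-job byte quota would be exceeded
    IoError,     // spilling to the temp file failed
}

fn status_for(outcome: &UploadOutcome) -> u16 {
    match outcome {
        UploadOutcome::Stored => 200,      // OK
        UploadOutcome::InvalidPath => 400, // Bad Request
        UploadOutcome::UnknownJob => 404,  // Not Found
        UploadOutcome::OverLimit => 413,   // Payload Too Large
        UploadOutcome::IoError => 500,     // Internal Server Error
    }
}
```

Distinguishing 404 (unknown key) from 400 (bad path) means a LAVA job can tell a stale upload URL apart from a malformed filename, which addresses the traceability concern in the review comment.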
Agent-Logs-Url: https://github.com/collabora/lava-gitlab-runner/sessions/c13029b6-3fd3-44cc-ac99-b9bf964efe56
Co-authored-by: sjoerdsimons <22603932+sjoerdsimons@users.noreply.github.com>

…HTTP status codes

Agent-Logs-Url: https://github.com/collabora/lava-gitlab-runner/sessions/439b5174-50be-4263-8ccd-e88b204f9251
Co-authored-by: sjoerdsimons <22603932+sjoerdsimons@users.noreply.github.com>
Reworks PR #17 against current `main` (clap, LazyLock, async_trait, etc.) and addresses all 7 review comments. LAVA jobs can POST artifact files to an ephemeral per-job URL; the runner collects them and includes them in the GitLab artifact archive.

Design

- `UploadServer` is created in `main()` and threaded into `Run` via closure capture
- `JobArtifacts` is an RAII handle; dropping it removes the job key from `UploadStore` automatically
- No `Arc<Mutex<JobArtifacts>>` scatter — the HTTP handler sends uploads through `UploadStore`, which holds the inner state map; `Run` owns `JobArtifacts` directly
- Single `LAVA_ARTIFACT_UPLOAD_BASE_URL` (full URL); server only starts if set. Optional `LAVA_ARTIFACT_UPLOAD_LISTEN_ADDR` (default `0.0.0.0:0`)
- No `{id}_artifacts/` prefix (artifacts only come from `submit` jobs)
- `sanitize_artifact_path()`: `..`, absolute paths, and empty paths are rejected; redundant `.` components are stripped
- Large files are spilled to a `NamedTempFile` to avoid large in-memory allocations; smaller files remain as `Bytes`
- `DefaultBodyLimit` is applied at the axum router level
- The handler returns `400 Bad Request` for invalid paths, `404 Not Found` for unknown job keys, `413 Payload Too Large` when the per-job limit is exceeded, and `500 Internal Server Error` on I/O failures

Changes
`Cargo.toml`

- `axum = "0.8"`, enable `tokio = { features = ["full"] }`

`src/upload.rs` (new)

- `UploadServer` — owns `base_url` and `Arc<Mutex<UploadStore>>`; `add_new_job()` returns an owned `JobArtifacts`
- `UploadStore` — shared with the axum handler via `State`; routes uploaded bytes to the right job by key; validates and sanitizes the path before storing
- `JobArtifacts` — RAII; `Drop` removes the key from the store
- `ArtifactFile` — `Arc`-wrapped enum: `Memory(Bytes)` for small files, `Disk(NamedTempFile)` for large files; `Clone` is cheap (ref-count only)
- `sanitize_artifact_path()` — rejects unsafe path components; returns a normalized `/`-joined relative path
- `ARTIFACT_JOB_LIMIT` (1 GB) — enforced in `JobArtifactsInner`; overwrites subtract the old file size before checking

`src/main.rs`

- `TransformVariables` gains a `runner` map → exposes `{{ runner.ARTIFACT_UPLOAD_URL }}` in LAVA job YAML templates
- `LavaUploadableFileType::Artifact { path, file }` variant added (stores `ArtifactFile`); `get_path()` returns the bare path; `get_data()` opens a reader without an extra heap copy
- `LavaUploadableFile.store` is now `Option<Arc<AvailableArtifactStore>>` (`None` for artifact files)
- `Run` gains `upload_server: Option<Arc<Mutex<UploadServer>>>` and `artifacts: Option<JobArtifacts>`
- `transform()` takes `upload_url: &str`; the `command()` submit branch registers a job slot and stores the handle
- `get_uploadable_files()` appends artifact entries collected from `JobArtifacts`
- `upload_artifact` axum handler receives `State<Arc<Mutex<UploadStore>>>` — no global; returns appropriate HTTP status codes
- `main()` spawns the upload server only when the env var is present; uses `tokio::net::TcpListener::bind` + `axum::serve`; applies `DefaultBodyLimit`

Original prompt
This is a reworked version of PR #17 ("Allow LAVA jobs to pass artifacts back to the runner") that:

- targets the current `main` branch (which has diverged significantly — it now uses `clap` instead of `structopt`, `LazyLock` instead of `lazy_static`, `async_trait`, newer dependency versions, etc.)

Background
The feature adds a lightweight axum HTTP server to `lava-gitlab-runner` so that LAVA jobs can POST artifact files back to the runner, which then includes them in the GitLab artifact archive. The LAVA job learns the upload URL from the `runner.ARTIFACT_UPLOAD_URL` Handlebars template variable.

Files to change
`Cargo.toml`

Add the following dependencies (compatible with the current version pinning style in `main`):

Do not add `local-ip-address` (see review comment below).

`src/upload.rs` (new file)

Implement the upload server with the following design changes vs PR #17:
Review comment addressed: "why is a periodic cleanup required? or rather can't the registration be dropped whenever the relevant jobs get dropped? Also I'm not a big fan of scattering `Arc<Mutex<T>>` around"

Design:

- `UploadServer` holds `base_url: Option<Url>` (a full URL, not host+port)
- `UploadServer::add_new_job()` returns a `JobArtifacts` directly (owned, not `Arc<Mutex<...>>`)
- `JobArtifacts` is the registration object held by `Run`. When `Run` drops it, the LAVA job key is automatically gone — no separate cleanup needed.
- Use a `tokio::sync::Mutex<BTreeMap<String, tokio::sync::mpsc::Sender<(String, Bytes)>>>` inside `UploadServer` so the HTTP handler can send files to the right `JobArtifacts` via a channel. This avoids `Arc<Mutex<JobArtifacts>>` scatter.
- No `DashMap` or `Arc<Mutex<BTreeMap<String, Weak<...>>>>` — keep it simple but clean.
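The channel-based routing proposed in these design notes can be sketched as follows. To keep the example self-contained, `std::sync::mpsc` stands in for `tokio::sync::mpsc` and `Vec<u8>` for `Bytes`; the struct and method names are taken from the notes, not from the final code:

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{channel, Receiver, Sender};

type Upload = (String, Vec<u8>); // (path, data) — Bytes in the real design

// The server keeps only a Sender per job key; each JobArtifacts owns the
// Receiver, so no Arc<Mutex<JobArtifacts>> is needed anywhere.
struct UploadServer {
    jobs: BTreeMap<String, Sender<Upload>>,
}

struct JobArtifacts {
    rx: Receiver<Upload>,
}

impl UploadServer {
    fn new() -> Self {
        UploadServer { jobs: BTreeMap::new() }
    }

    fn add_new_job(&mut self, key: &str) -> JobArtifacts {
        let (tx, rx) = channel();
        self.jobs.insert(key.to_owned(), tx);
        JobArtifacts { rx }
    }

    /// Route an upload to the job's channel; send fails once the
    /// JobArtifacts (and its Receiver) has been dropped.
    fn upload_file(&mut self, key: &str, path: &str, data: Vec<u8>) -> bool {
        match self.jobs.get(key) {
            Some(tx) => tx.send((path.to_owned(), data)).is_ok(),
            None => false,
        }
    }
}
```

A failed send doubles as liveness detection: the handler learns the job is gone without the server polling or running a cleanup pass.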
Arc<Mutex<T>>: