Skip to content

SOLR-16458: Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS#4171

Open
epugh wants to merge 10 commits intoapache:mainfrom
epugh:copilot/migrate-node-health-api
Open

SOLR-16458: Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS#4171
epugh wants to merge 10 commits intoapache:mainfrom
epugh:copilot/migrate-node-health-api

Conversation

@epugh
Copy link
Contributor

@epugh epugh commented Feb 28, 2026

Migrates NodeHealthAPI — the last node-level V2 API still using Solr's homegrown @EndPoint annotation — to standard JAX-RS, following the same pattern as NodeLogging, GetPublicKey, etc.

Design

The logic stays in HealthCheckHandler (minimising diff surface). NodeHealthAPI is a thin JAX-RS wrapper (~60 lines) that delegates entirely to it.

Key changes

  • solr/api — New NodeHealthApi interface (@Path, @GET, @Operation) and NodeHealthResponse model (status, message, num_cores_unhealthy)
  • NodeHealthAPI — Replaces @EndPoint with JAX-RS; injects CoreContainer, delegates to HealthCheckHandler
  • HealthCheckHandler — Logic unchanged; adds public NodeHealthResponse checkNodeHealth(Boolean, Integer) as the shared entry point for both v1 (handleRequestBody) and v2 (NodeHealthAPI); switches to getJerseyResources() / empty getApis()
  • V2NodeAPIMappingTest — Removes the now-obsolete @EndPoint/ApiBag routing test for health
  • NodeHealthAPITest — New Mockito unit tests for the API class
  • NodeHealthAPITest2 — New mock-free integration tests: cloud-mode via real MiniSolrCloudCluster, legacy mode via embedded CoreContainer built from NodeConfig
  • implicit-requesthandlers.adoc — Health section now links to both HealthCheckHandler (v1) and NodeHealthAPI (v2) javadocs

Copilot AI and others added 4 commits February 22, 2026 12:52
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ef guide

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
@epugh
Copy link
Contributor Author

epugh commented Feb 28, 2026

Some outstanding questions: 1) there is a more mock and less mock versions of the same test. Which do we prefer? 2) Does this seem like a reasonable pattern for the conversion?

@epugh
Copy link
Contributor Author

epugh commented Feb 28, 2026

I wish I didnt' have TWO ways of writing tests, one for cloud and one for standalone... sigh.

@JsonProperty public String message;

@JsonProperty("num_cores_unhealthy")
public Long numCoresUnhealthy;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long isn't warranted; lets just do Integer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this did start changing one of the existing HealthCheckHandler methods... Rerunning tests now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long is important - it allows the value to be null and omitted from responses entirely. That's not just a theoretical distinction either - the code in HealthCheckHandler sets this conditionally today!

      if (unhealthyCores > 0) {
        rsp.add(STATUS, FAILURE);
        rsp.add("num_cores_unhealthy", unhealthyCores);
        rsp.setException(
            new SolrException(
                SolrException.ErrorCode.SERVICE_UNAVAILABLE,
                unhealthyCores
                    + " out of "
                    + coreContainer.getNumAllCores()
                    + " replicas are currently initializing or recovering"));
        return;
      }

If @dsmiley has a good reason why it's not warranted in this case, then ignore me. But in general I've been using Long, Integer, Float, etc. in these Response objects any time that it's conceivable that a response might omit the value entirely, which seems true in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think I should back ouit the change to LOng... One thing I don't like is that the call to findUnhealthyCores now has a return Math.toIntExact which seems messy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think I should back ouit the change to LOng

Kindof, yes, that's why I mentioned it haha 😛. You're welcome to disagree of course, or maybe David will offer a reason why he thought it was unnecessary. But that's my 2c.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only point was to use Integer and not Long. I wasn't arguing for a primitive or anything else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the use of return Math.toIntExact? Makes me start to think that whatever else is returning longs should be converted to ints too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stream.count() returns a long. It doesn't mean that this particular stream would ever actually need a long.

@epugh
Copy link
Contributor Author

epugh commented Mar 1, 2026

tests all pass!

@epugh epugh changed the title Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS; add ref guide link Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS Mar 2, 2026
@epugh epugh removed the no-changelog label Mar 3, 2026
@gerlowskija
Copy link
Contributor

@epugh - this should probably be attached to one of the v2 JIRA tickets or another. Maybe https://issues.apache.org/jira/browse/SOLR-16458?

@epugh
Copy link
Contributor Author

epugh commented Mar 3, 2026

@epugh - this should probably be attached to one of the v2 JIRA tickets or another. Maybe https://issues.apache.org/jira/browse/SOLR-16458?

Sure... I suppose I could be crosslinking all of these to various JIRAs?

@epugh epugh changed the title Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS SOLR-16458: Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS Mar 3, 2026
@@ -0,0 +1,7 @@
title: Migrate NodeHealthAPI to JAX-RS. NodeHealthAPI now has OpenAPI and SolrJ support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] I'd try to focus this changelog more on what matters to the user. Specifically - what is the new SolrJ class for and where can a user find it?

i.e.

SolrJ now offers a SolrRequest class allowing users to perform single-node healthchecks: NodeApi.CheckNodeHealth

Something like that...

@@ -0,0 +1,7 @@
title: Migrate NodeHealthAPI to JAX-RS. NodeHealthAPI now has OpenAPI and SolrJ support.
type: changed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] From a user's perspective - this is "added". There's a new SolrJ capability they didn't have before!

- name: Eric Pugh
links:
- name: PR#4171
url: https://github.com/apache/solr/pull/4171
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] Link to JIRA

@Operation(
summary = "Determine the health of a Solr node.",
tags = {"node"})
NodeHealthResponse checkNodeHealth(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] By default, the name of the generated SolrRequest class is: <tagVal>Api.<MethodName>.

We can override that if desired by specifying an operationId parameter in the @Operation annotation, in which case the pattern becomes: <tagVal>Api.<OperationId>

Do you like NodeApi.CheckNodeHealth as a classname, or would NodeApi.Healthcheck or something similar be better?

/** Response body for the '/api/node/health' endpoint. */
public class NodeHealthResponse extends SolrJerseyResponse {

@JsonProperty public String status;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Does status have certain expected values? If so, those can be documented using a @Schema annotation, or the whole field can be made an 'enum' which is probably even better.

* mocks.
*
* <p>Cloud-mode tests use a real {@link org.apache.solr.cloud.MiniSolrCloudCluster} and get a
* {@link CoreContainer} directly from a {@link JettySolrRunner}. Legacy (standalone) mode tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Do we have other tests that do standalone testing within a SolrCloudTestCase?

It feels weird conceptually. And in practical terms SolrCloudTestCase does some work that makes it much slower on a per-test basis than our other base classes. Doing standalone testing in a SolrCloudTestCase is going to end up paying that runtime cost for no reason.

public void testCloudMode_RequireHealthyCoresReturnOkWhenAllCoresHealthy() {
CoreContainer coreContainer = cluster.getJettySolrRunner(0).getCoreContainer();

// requireHealthyCores=true should succeed on a node with no unhealthy cores
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1] You haven't actually created any cores!!

Can you create a collection or something that'll cause this test to actually exercise the per-core logic currently in HealthcheckHandler?

import org.junit.Test;

/**
* Integration tests for {@link NodeHealthAPI} that use real Solr instances instead of Mockito
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] If these are intended to be integration tests, why aren't they using the HTTP APIs the way a user would?

This would be an excellent place IMO to make use of the generated SolrRequest class. Doing so would kill two birds with one stone: making the tests more of a true integration test, and getting some validation on the SolrJ code as well.

}
}

// ---- Legacy (standalone, non-ZooKeeper) mode tests ----
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Are we calling this "Legacy" now? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG... remove this comment

v2: `api/node/health` |{solr-javadocs}/core/org/apache/solr/handler/admin/HealthCheckHandler.html[HealthCheckHandler] |
v2: `api/node/health` |v1: {solr-javadocs}/core/org/apache/solr/handler/admin/HealthCheckHandler.html[HealthCheckHandler]

v2: {solr-javadocs}/core/org/apache/solr/handler/admin/api/NodeHealthAPI.html[NodeHealthAPI] |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Is it the implementation Javadocs we want to point people to here, or would the solr/api interface docs be more helpful?

Or more broadly - is there much value even in pointing to either Javadoc on the v2 side? HealthcheckHandler has a nice good blurb, but neither NodeHealthAPI nor NodeHealthApi have much of anything that's worth pointing a user at IMO...

@dsmiley
Copy link
Contributor

dsmiley commented Mar 3, 2026

Just want to give kudos to Jason's thoroughly amazing code review. Really shows how important it is that we have the right reviewers on the right issues. Hopefully after this issue is done, similar issues can follow the same lessons/advise. Assuming LLM is doing the bulk of the work, it can be pointed specifically at this PR to learn the key aspects to successfully do similar transitions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:api documentation Improvements or additions to documentation tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants