Creating a Java-Based GenAI App Using Docker

When it comes to Generative AI (GenAI), many developers instinctively reach for Python. If you already work in Java, however, there is no need to switch languages: the Java ecosystem offers a suite of tools and libraries that make GenAI applications accessible and efficient to build.

In this article, we'll build a GenAI application in Java, walking step by step through how Retrieval-Augmented Generation (RAG) improves model responses using Spring AI and Docker tooling. Spring AI integrates with numerous model providers, covering both chat and embedding models as well as vector databases. For this demonstration, we'll use the OpenAI and Qdrant modules from the Spring AI project, leveraging their built-in support for seamless integration. With Docker Model Runner, we can run AI models locally behind an OpenAI-compatible API, a local alternative to cloud-hosted models. We'll automate testing with Testcontainers and Spring AI's evaluation utilities to verify that responses from Large Language Models (LLMs) are contextually accurate, and use Grafana for observability to confirm the app behaves as intended.

Getting Started

To begin constructing a sample application, visit Spring Initializr and select the following dependencies: Web, OpenAI, Qdrant Vector Database, and Testcontainers. The application will feature two endpoints: "/chat," which directly interacts with the model, and "/rag," which provides the model with additional context from documents stored in a vector database.
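Depending on the Spring AI version, the generated pom.xml will contain starters along these lines. Artifact names have changed across Spring AI releases, so treat these coordinates as an assumption based on the Spring AI 1.0 naming scheme and check your generated file:

```xml
<!-- Assumed Spring AI 1.0 starter names; verify against your generated pom.xml -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-qdrant</artifactId>
</dependency>
```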

Configuring Docker Model Runner

Enable Docker Model Runner in your Docker Desktop or Docker Engine, as detailed in the official documentation. Next, pull the following models using Docker commands:

  1. ai/llama3.1 – a chat model
  2. ai/mxbai-embed-large – an embedding model
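With Model Runner enabled, both models can be pulled with the docker model CLI:

```shell
# Pull the chat model and the embedding model from Docker Hub's "ai" namespace
docker model pull ai/llama3.1
docker model pull ai/mxbai-embed-large
```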

These models are hosted on Docker Hub under the "ai" namespace. Choosing a specific tag lets you pick a different quantization, but the default is generally a suitable starting point.

Building the GenAI App

Let's create a ChatController under src/main/java/com/example, which will serve as our entry point for interacting with the chat model:

```java
@RestController
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatModel chatModel) {
        this.chatClient = ChatClient.builder(chatModel).build();
    }

    @GetMapping("/chat")
    public String generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return this.chatClient.prompt().user(message).call().content();
    }
}
```

• ChatClient is a fluent interface for interacting with the model; which model to use is injected via configuration properties.
• If no message query parameter is provided, the prompt defaults to asking for a joke.

Configure the application to point to Docker Model Runner and employ the "ai/llama3.1" model by adding the following properties to src/test/resources/application.properties:

```plaintext
spring.ai.openai.base-url=http://localhost:12434/engines
spring.ai.openai.api-key=test
spring.ai.openai.chat.options.model=ai/llama3.1
```

The spring.ai.openai.api-key property is required by the framework, but any value works because Docker Model Runner does not check it.

Launch the application by running ./mvnw spring-boot:test-run or ./gradlew bootTestRun, and ask it about Testcontainers:

```shell
http :8080/chat message=="What's testcontainers?"
```

Below is the response from the LLM (ai/llama3.1):

"Testcontainers is a fantastic and increasingly popular library for local testing with containers. It provides a way to run real, fully functional containerized services directly within your tests, leading to more realistic and reliable test results."

Observing Mistakes and Hallucinations

The LLM's response contains inaccuracies, such as references to non-existent classes or incorrect URLs. This highlights the need to provide models with curated context to improve response accuracy.

Enhancing Response Accuracy with RAG

We can enhance the model's response by providing it with curated context. Let's create a RagController to retrieve documents from a vector search database:

```java
@RestController
public class RagController {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagController(ChatModel chatModel, VectorStore vectorStore) {
        this.chatClient = ChatClient.builder(chatModel).build();
        this.vectorStore = vectorStore;
    }

    @GetMapping("/rag")
    public String generate(@RequestParam(value = "message", defaultValue = "What's Testcontainers?") String message) {
        return callResponseSpec(this.chatClient, this.vectorStore, message).content();
    }

    static ChatClient.CallResponseSpec callResponseSpec(ChatClient chatClient, VectorStore vectorStore,
            String question) {
        QuestionAnswerAdvisor questionAnswerAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
                .searchRequest(SearchRequest.builder().topK(1).build())
                .build();
        return chatClient.prompt().advisors(questionAnswerAdvisor).user(question).call();
    }
}
```
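The QuestionAnswerAdvisor performs a similarity search against the vector store and prepends the retrieved documents to the prompt. Here topK(1) fetches only the single closest document; as a sketch, the search can be broadened and gated by a similarity threshold (the values below are illustrative assumptions, not from the original example):

```java
// Illustrative tuning only: fetch up to 5 documents, keeping those whose
// similarity score is at least 0.7 (both numbers are assumptions).
QuestionAnswerAdvisor advisor = QuestionAnswerAdvisor.builder(vectorStore)
        .searchRequest(SearchRequest.builder()
                .topK(5)
                .similarityThreshold(0.7)
                .build())
        .build();
```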

Ingesting Documents into the Vector Database

To provide the model with context, we need to load documents into the vector database. Create an IngestionConfiguration class in src/test/java/com/example:

```java
@TestConfiguration(proxyBeanMethods = false)
public class IngestionConfiguration {

    @Value("classpath:/docs/testcontainers.txt")
    private Resource testcontainersDoc;

    @Bean
    ApplicationRunner init(VectorStore vectorStore) {
        return args -> {
            var javaTextReader = new TextReader(this.testcontainersDoc);
            javaTextReader.getCustomMetadata().put("language", "java");

            var tokenTextSplitter = new TokenTextSplitter();
            var testcontainersDocuments = tokenTextSplitter.apply(javaTextReader.get());

            vectorStore.add(testcontainersDocuments);
        };
    }
}
```

The file testcontainers.txt in the src/test/resources/docs directory should contain relevant information about Testcontainers. For practical applications, a broader document collection is recommended.
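Scaling the same pipeline to many files is mostly a loop. A minimal sketch, assuming additional .txt resources under src/test/resources/docs (the wildcard injection and file layout are assumptions, not part of the original example):

```java
// Hypothetical variant: ingest every .txt file under classpath:/docs with the
// same TextReader + TokenTextSplitter pipeline used above.
@Value("classpath*:/docs/*.txt")
private Resource[] docs;

@Bean
ApplicationRunner init(VectorStore vectorStore) {
    return args -> {
        var splitter = new TokenTextSplitter();
        for (Resource doc : docs) {
            var reader = new TextReader(doc);
            reader.getCustomMetadata().put("language", "java");
            vectorStore.add(splitter.apply(reader.get()));
        }
    };
}
```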

Add properties to src/test/resources/application.properties:

```plaintext
spring.ai.openai.embedding.options.model=ai/mxbai-embed-large
spring.ai.vectorstore.qdrant.initialize-schema=true
spring.ai.vectorstore.qdrant.collection-name=test
```

The ai/mxbai-embed-large model is used to create embeddings of the documents, which are then stored in the vector search database (Qdrant in this case). Spring AI will initialize the Qdrant schema and use the specified collection name.
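To make the flow concrete: the auto-configured EmbeddingModel maps text to a vector of floats, and Qdrant indexes those vectors for similarity search. A minimal sketch, assuming Spring AI's EmbeddingModel API (where embed returns a float[]):

```java
// Embed a query string; documents were embedded the same way at ingestion
// time, and retrieval compares these vectors by similarity.
float[] vector = embeddingModel.embed("What is Testcontainers?");
System.out.println("Embedding dimensions: " + vector.length);
```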

Update the TestDemoApplication Java class to include IngestionConfiguration.class:

```java
public class TestDemoApplication {

    public static void main(String[] args) {
        SpringApplication.from(DemoApplication::main)
            .with(TestcontainersConfiguration.class, IngestionConfiguration.class)
            .run(args);
    }
}
```

Restart the application and query about Testcontainers again:

```shell
http :8080/rag message=="What's testcontainers?"
```

This time, the response will be more accurate, drawing on the provided documentation.

Integration Testing

Testing is an integral part of software development. Using Testcontainers and Spring AI utilities, we can automate testing of GenAI applications, creating integration tests that ensure the LLM provides contextually accurate answers based on the document data.

```java
@SpringBootTest(classes = { TestcontainersConfiguration.class, IngestionConfiguration.class },
        webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class RagControllerTest {

    @LocalServerPort
    private int port;

    @Autowired
    private VectorStore vectorStore;

    @Autowired
    private ChatClient.Builder chatClientBuilder;

    @Test
    void verifyTestcontainersAnswer() {
        var question = "Tell me about Testcontainers";
        var answer = retrieveAnswer(question);

        assertFactCheck(question, answer);
    }

    private String retrieveAnswer(String question) {
        RestClient restClient = RestClient.builder().baseUrl("http://localhost:%d".formatted(this.port)).build();
        return restClient.get().uri("/rag?message={question}", question).retrieve().body(String.class);
    }

    private void assertFactCheck(String question, String answer) {
        FactCheckingEvaluator factCheckingEvaluator = new FactCheckingEvaluator(this.chatClientBuilder);
        EvaluationResponse evaluate = factCheckingEvaluator.evaluate(new EvaluationRequest(docs(question), answer));
        assertThat(evaluate.isPass()).isTrue();
    }

    private List<Document> docs(String question) {
        var response = RagController
                .callResponseSpec(this.chatClientBuilder.build(), this.vectorStore, question)
                .chatResponse();
        return response.getMetadata().get(QuestionAnswerAdvisor.RETRIEVED_DOCUMENTS);
    }
}
```

Automating tests ensures consistency and minimizes errors that can occur with manual testing.
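The suite runs like any other Spring Boot integration test, with Testcontainers starting the required services automatically:

```shell
./mvnw test
# or
./gradlew test
```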

Observability with Grafana LGTM Stack

Observability is crucial for understanding application behavior in development and production. By introducing metrics and tracing, we can monitor the application's performance and ensure it meets design expectations.

Add the following dependencies to pom.xml:

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-otlp</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>grafana</artifactId>
    <scope>test</scope>
</dependency>
```

Create a `GrafanaContainerConfiguration` under `src/test/java/com/example`:

```java
@TestConfiguration(proxyBeanMethods = false)
public class GrafanaContainerConfiguration {

    @Bean
    @ServiceConnection
    LgtmStackContainer lgtmContainer() {
        return new LgtmStackContainer("grafana/otel-lgtm:0.11.4");
    }
}
```

Grafana provides a comprehensive Docker image that includes Prometheus, Tempo, and OpenTelemetry Collector services, allowing us to monitor performance metrics and traces effectively.

Add properties to `src/test/resources/application.properties` to sample 100% of requests:

```plaintext
spring.application.name=demo
management.tracing.sampling.probability=1
```

Update the `TestDemoApplication` class to include `GrafanaContainerConfiguration.class`:

```java
public class TestDemoApplication {

    public static void main(String[] args) {
        SpringApplication.from(DemoApplication::main)
            .with(TestcontainersConfiguration.class, IngestionConfiguration.class, GrafanaContainerConfiguration.class)
            .run(args);
    }
}
```

Run the application and perform a request:

```shell
http :8080/rag message=="What's testcontainers?"
```

Check the logs for Grafana dashboard access details, then explore the metrics and traces for insights into application performance.

Conclusion

The combination of Docker and Spring AI offers a robust and efficient platform for developing GenAI applications. Docker simplifies the initialization of service dependencies, including Docker Model Runner, which provides an OpenAI-compatible API for local model execution. Testcontainers facilitates rapid integration testing by offering lightweight containers for services and dependencies. Together, Docker and Spring AI enable developers to build sophisticated AI-driven applications efficiently, from development through to production.

Learn More

To delve deeper into this topic, consider exploring resources on Docker Model Runner, Spring AI, and Testcontainers. These tools provide a strong foundation for creating advanced AI applications within the Java ecosystem.
