When working with large language models (LLMs), one of the key challenges is maintaining context across multiple interactions.
Out of the box, LLMs treat every prompt as a stateless request – meaning they don’t “remember” previous messages.
That’s where chat memory comes in.
Spring AI makes it easy to build conversational applications by adding memory support to your chat interactions.
1. What is Chat Memory?
Chat memory is the mechanism that allows your application to keep track of past interactions in a conversation.
Instead of sending isolated prompts to the model, you provide it with a conversation history, enabling the AI to respond in a way that feels more natural and contextual.
Example:
- You: Hello! My name is Tom, nice to meet you.
- AI: Hi Tom! How are you today?
- You: I’m doing well. How about you?
- AI: I’m glad to hear that! I’m doing great too. So Tom, tell me, how can I help you?
Without memory, the model wouldn’t be able to remember the user’s name or carry on a human-like conversation.
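The difference can be sketched in plain Java without any Spring AI types. The toy `reply` function below is a hypothetical stand-in for a model: it can only use the text it receives in the current call, so it knows the user's name only when the earlier message is part of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java illustration; ToyModel is a hypothetical stand-in, not a Spring AI class.
final class ToyModel {

    private static final Pattern NAME = Pattern.compile("My name is ([A-Za-z]+)");

    // Answers using only the messages passed in, the way a stateless model does.
    static String reply(List<String> messages) {
        for (String message : messages) {
            Matcher matcher = NAME.matcher(message);
            if (matcher.find()) {
                return "Nice to hear from you, " + matcher.group(1) + "!";
            }
        }
        return "Sorry, I don't know your name.";
    }

    public static void main(String[] args) {
        String first = "Hello! My name is Tom, nice to meet you.";
        String second = "Do you remember my name?";

        // Without memory: only the latest message is sent.
        System.out.println(reply(List.of(second)));
        // prints: Sorry, I don't know your name.

        // With memory: the earlier message is sent along with the latest one.
        List<String> history = new ArrayList<>();
        history.add(first);
        history.add(second);
        System.out.println(reply(history));
        // prints: Nice to hear from you, Tom!
    }
}
```

Nothing about the "model" changes between the two calls; only the input does. That is all chat memory adds.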
2. Using Chat Memory in a Chat Controller
Here’s a simple example of enabling chat memory in a REST API:
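import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.web.bind.annotation.*;
// Written against the Spring AI 1.0 API, where ChatMemory stores messages per conversation ID
@RestController
@RequestMapping("/chat")
public class ChatController {
    // One shared conversation keeps the demo simple; real applications need one ID per user or session
    private static final String CONVERSATION_ID = "default";
    private final ChatModel chatModel;
    // Defaults to an in-memory store with a window of 20 messages
    private final ChatMemory chatMemory = MessageWindowChatMemory.builder().build();
    public ChatController(ChatModel chatModel) {
        this.chatModel = chatModel;
    }
    @PostMapping
    public String chat(@RequestBody String userMessage) {
        // Store the user's message in memory
        chatMemory.add(CONVERSATION_ID, new UserMessage(userMessage));
        // Generate the AI's response from the full conversation history
        ChatResponse response = chatModel.call(new Prompt(chatMemory.get(CONVERSATION_ID)));
        String responseText = response.getResult().getOutput().getText();
        // Store the AI's response in memory
        chatMemory.add(CONVERSATION_ID, new AssistantMessage(responseText));
        return responseText;
    }
}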
Here’s what happens:
- When a user message is sent, we put it into the chat memory.
- We call the model with the full conversation history.
- After getting the response, we add it into the chat memory as an assistant message.
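One caveat: the controller above keeps a single history, so every caller shares the same conversation. A common approach is to key the history by a conversation ID. The sketch below walks through the same three steps in plain Java; `ConversationStore`, `handle`, and the stand-in `model` function are hypothetical names used for illustration, not Spring AI APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Plain-Java sketch; ConversationStore is a hypothetical class, not part of Spring AI.
final class ConversationStore {

    private final Map<String, List<String>> histories = new ConcurrentHashMap<>();

    // Runs one chat turn: record the user message, call the model with the
    // full history of this conversation, then record the model's answer.
    String handle(String conversationId, String userMessage,
                  Function<List<String>, String> model) {
        List<String> history =
                histories.computeIfAbsent(conversationId, id -> new ArrayList<>());
        synchronized (history) {
            history.add("user: " + userMessage);
            String answer = model.apply(List.copyOf(history));
            history.add("assistant: " + answer);
            return answer;
        }
    }

    // Returns a snapshot of one conversation, or an empty list if it is unknown.
    List<String> history(String conversationId) {
        List<String> history = histories.get(conversationId);
        if (history == null) {
            return List.of();
        }
        synchronized (history) {
            return List.copyOf(history);
        }
    }

    public static void main(String[] args) {
        ConversationStore store = new ConversationStore();
        // Stand-in model: reports how many messages it was given.
        Function<List<String>, String> model =
                h -> "I can see " + h.size() + " message(s).";

        System.out.println(store.handle("alice", "Hi", model));           // I can see 1 message(s).
        System.out.println(store.handle("alice", "Still there?", model)); // I can see 3 message(s).
        System.out.println(store.handle("bob", "Hello", model));          // I can see 1 message(s).
    }
}
```

Bob's first turn sees one message even though Alice has already exchanged several, because each ID has its own history.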
3. Memory Types, Capacity, Limits, and Eviction
Spring AI supports different memory backends:
- In-memory (InMemoryChatMemoryRepository in Spring AI 1.0): Simple, held in application memory and lost on restart (good for demos).
- Database-backed memory: Store conversations persistently using Redis, PostgreSQL, etc.
- Custom memory: Implement the ChatMemory interface for your own strategy.
Example: Using JDBC Repository
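import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.chat.memory.repository.jdbc.JdbcChatMemoryRepository;
import org.springframework.jdbc.core.JdbcTemplate;
// Both objects are created through builders; dataSource is an existing javax.sql.DataSource
JdbcChatMemoryRepository repository = JdbcChatMemoryRepository.builder().jdbcTemplate(new JdbcTemplate(dataSource)).build();
ChatMemory dbMemory = MessageWindowChatMemory.builder().chatMemoryRepository(repository).maxMessages(10).build(); // keep last 10 messages per conversation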
Spring AI uses MessageWindowChatMemory by default, which maintains a sliding window of recent messages:
- Default size: 20 messages.
- Eviction: Once the limit is reached, the oldest user/assistant messages are dropped.
- System messages: Always preserved. If a new system message is added, older system messages are removed so only the most recent one remains.
Example: Limiting memory to 5 messages
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
ChatMemory limitedMemory = MessageWindowChatMemory.builder()
        .maxMessages(5) // keep only the last 5 messages
        .build();
In this case, the limit counts every stored message, not just user messages: as soon as a sixth message (user or assistant) is added, the oldest one is evicted.
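To make the eviction rules concrete, here is a simplified plain-Java model of a sliding window. It illustrates the behavior described in this section and is not Spring AI's actual implementation; `WindowSketch`, `Msg`, and `Role` are hypothetical names, and the sketch assumes the system message counts toward the limit.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a sliding-window memory; all names are hypothetical, not Spring AI types.
final class WindowSketch {

    enum Role { SYSTEM, USER, ASSISTANT }

    record Msg(Role role, String text) { }

    private final int maxMessages;
    private final List<Msg> messages = new ArrayList<>();

    WindowSketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    void add(Msg msg) {
        // A new system message replaces any earlier one.
        if (msg.role() == Role.SYSTEM) {
            messages.removeIf(m -> m.role() == Role.SYSTEM);
        }
        messages.add(msg);
        // Evict the oldest non-system messages until the window fits.
        int i = 0;
        while (messages.size() > maxMessages && i < messages.size()) {
            if (messages.get(i).role() == Role.SYSTEM) {
                i++; // system messages are never evicted
            } else {
                messages.remove(i);
            }
        }
    }

    List<Msg> get() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        WindowSketch memory = new WindowSketch(5);
        memory.add(new Msg(Role.SYSTEM, "You are a helpful assistant."));
        for (int n = 1; n <= 5; n++) {
            memory.add(new Msg(Role.USER, "message " + n));
        }
        // Six messages went into a window of five:
        // "message 1" has been evicted and the system message is still first.
        memory.get().forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```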
4. Messages
In Spring AI, messages are categorized into three roles:
- System messages – Define the behavior, rules, or personality of the AI assistant.
- User messages – Represent the input from the user.
- Assistant messages – Represent the AI’s responses.
A system message acts as an instruction to the model, guiding how it should respond. It controls style, constraints, or high-level rules.
Example: Using a System Message
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
// chatModel is an injected org.springframework.ai.chat.model.ChatModel
SystemMessage system = new SystemMessage("You are a helpful assistant that answers like a Senior Software Engineer.");
UserMessage user = new UserMessage("What is Spring AI?");
// call(Message...) sends both messages and returns the text of the reply
String response = chatModel.call(system, user);
System.out.println(response);
- The system message tells the model to answer like a Senior Software Engineer.
- The user message provides the actual question.
Behavior of System Messages:
- Always preserved: They are not evicted when memory is full.
- Most recent only: If a new system message is added, previous system messages are discarded.
Together with user messages and assistant messages, they form the structured conversation Spring AI sends to the model.
A conversation history usually looks like this:
// Message is org.springframework.ai.chat.messages.Message, the common type of all three roles
List<Message> conversation = List.of(
    new SystemMessage("You are a helpful assistant."),
    new UserMessage("What is Spring AI?"),
    new AssistantMessage("Spring AI is a project that makes it easier to use LLMs in Java."),
    new UserMessage("Can you explain chat memory?")
    // The assistant's next response will be appended here
);
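How the roles travel with each message can also be sketched without Spring AI types. The example below is a hypothetical plain-Java model (`RoleSketch`, `Msg`, and `render` are illustrative names) that tags each message with a role and renders the conversation as simple `role: text` lines. Real providers use their own wire formats, and the rendering here is only an assumption chosen for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical role-tagged message model; not Spring AI's Message hierarchy.
final class RoleSketch {

    enum Role { SYSTEM, USER, ASSISTANT }

    record Msg(Role role, String text) { }

    // Renders the conversation in order, one "role: text" line per message.
    static String render(List<Msg> conversation) {
        return conversation.stream()
                .map(m -> m.role().name().toLowerCase(Locale.ROOT) + ": " + m.text())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<Msg> conversation = new ArrayList<>(List.of(
                new Msg(Role.SYSTEM, "You are a helpful assistant."),
                new Msg(Role.USER, "Can you explain chat memory?")));
        // Once the model answers, the reply is appended with the assistant role.
        conversation.add(new Msg(Role.ASSISTANT, "Chat memory keeps earlier messages available."));
        System.out.println(render(conversation));
    }
}
```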
Conclusion
With chat memory, Spring AI takes a big step toward building real conversational applications.
Whether you’re creating a chatbot, customer support assistant, or AI-powered tutor, memory ensures conversations feel smooth and engaging.