Conversation Memory Store
Why?
Multi-turn AI applications — chatbots, copilots, customer support agents — need to remember conversation history. Without persistent memory, every interaction starts from scratch and the LLM loses context.
Storing conversation memory in Infinispan provides:
- Persistence across restarts — conversations survive application redeployments and crashes.
- Horizontal scalability — any application instance can access any user’s conversation history.
- TTL-based cleanup — old conversations are automatically evicted, no manual garbage collection needed.
- Cross-site replication — global applications can replicate conversation state across data centers.
- Sub-millisecond access — in-memory storage means loading conversation history adds negligible latency.
- Flexible storage — store conversations as JSON, Protobuf, or plain text with Infinispan’s encoding support.
Quarkus + LangChain4j
Add the Infinispan client dependency:
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-infinispan-client</artifactId>
</dependency>
Configure in application.properties:
quarkus.infinispan-client.hosts=localhost:11222
quarkus.infinispan-client.username=admin
quarkus.infinispan-client.password=password
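If the chat-memory cache does not already exist on the server, the Quarkus extension can create it on first access from a configuration you supply. The property below follows the Quarkus Infinispan client extension's cache-configuration support; verify the exact key against your Quarkus version:

```properties
quarkus.infinispan-client.cache.chat-memory.configuration={"distributed-cache":{"encoding":{"media-type":"application/x-protostream"},"expiration":{"lifespan":"86400000"}}}
```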
Implement a ChatMemoryStore backed by Infinispan:
@ApplicationScoped
public class InfinispanChatMemoryStore implements ChatMemoryStore {

    @Inject
    RemoteCacheManager cacheManager;

    RemoteCache<String, String> getCache() {
        return cacheManager.getCache("chat-memory");
    }

    @Override
    public List<ChatMessage> getMessages(Object memoryId) {
        String json = getCache().get(memoryId.toString());
        return json == null ? List.of()
                : ChatMessageDeserializer.messagesFromJson(json);
    }

    @Override
    public void updateMessages(Object memoryId, List<ChatMessage> messages) {
        // Each update rewrites the entry with a fresh 24-hour lifespan,
        // so active conversations never expire mid-session.
        getCache().put(memoryId.toString(),
                ChatMessageSerializer.messagesToJson(messages),
                24, TimeUnit.HOURS);
    }

    @Override
    public void deleteMessages(Object memoryId) {
        getCache().remove(memoryId.toString());
    }
}
The 24-hour TTL means inactive conversations are automatically cleaned up. Wire it into your AI service:
@RegisterAiService(chatMemoryProviderSupplier = RegisterAiService.BeanChatMemoryProviderSupplier.class)
public interface CustomerSupportAgent {

    @SystemMessage("You are a helpful customer support agent.")
    String chat(@MemoryId String sessionId, @UserMessage String message);
}
Spring AI
Add the Infinispan Spring Boot starter:
<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-spring-boot3-starter-remote</artifactId>
</dependency>
Configure in application.properties:
infinispan.remote.server-list=localhost:11222
infinispan.remote.auth-username=admin
infinispan.remote.auth-password=password
Implement a ChatMemory backed by Infinispan:
@Service
public class InfinispanChatMemory implements ChatMemory {

    private final RemoteCache<String, String> cache;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public InfinispanChatMemory(RemoteCacheManager cacheManager) {
        this.cache = cacheManager.getCache("chat-memory");
    }

    @Override
    public void add(String conversationId, List<Message> messages) {
        // Read-modify-write is not atomic; consider Infinispan's conditional
        // operations if multiple writers can update the same conversation.
        List<Message> existing = get(conversationId, Integer.MAX_VALUE);
        existing.addAll(messages);
        cache.put(conversationId, serialize(existing), 24, TimeUnit.HOURS);
    }

    @Override
    public List<Message> get(String conversationId, int lastN) {
        String json = cache.get(conversationId);
        if (json == null) return new ArrayList<>();
        List<Message> messages = deserialize(json);
        if (messages.size() <= lastN) return messages;
        return new ArrayList<>(messages.subList(messages.size() - lastN, messages.size()));
    }

    @Override
    public void clear(String conversationId) {
        cache.remove(conversationId);
    }

    // Illustrative helpers: store each message as a {type, text} pair.
    // (The text accessor is getText() in recent Spring AI versions, getContent() in older ones.)
    private String serialize(List<Message> messages) {
        try {
            return objectMapper.writeValueAsString(messages.stream()
                    .map(m -> Map.of("type", m.getMessageType().name(), "text", m.getText()))
                    .toList());
        } catch (JsonProcessingException e) {
            throw new UncheckedIOException(e);
        }
    }

    private List<Message> deserialize(String json) {
        try {
            List<Map<String, String>> entries = objectMapper.readValue(json, new TypeReference<>() {});
            return entries.stream().<Message>map(e -> switch (e.get("type")) {
                case "USER" -> new UserMessage(e.get("text"));
                case "ASSISTANT" -> new AssistantMessage(e.get("text"));
                case "SYSTEM" -> new SystemMessage(e.get("text"));
                default -> throw new IllegalArgumentException("Unexpected type: " + e.get("type"));
            }).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
Use it with a ChatClient:
@RestController
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder builder,
                          InfinispanChatMemory chatMemory) {
        this.chatClient = builder
                .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
                .build();
    }

    @PostMapping("/chat")
    public String chat(@RequestParam String sessionId,
                       @RequestParam String message) {
        return chatClient.prompt()
                .user(message)
                .advisors(a -> a.param("chat_memory_conversation_id", sessionId))
                .call()
                .content();
    }
}
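Assuming the application listens on the default port 8080, the endpoint above can be exercised with curl (the session id and message are illustrative):

```
curl -X POST "http://localhost:8080/chat?sessionId=user-session-123&message=Hi,%20I%20need%20help"
```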
LangChain4j (standalone Java)
// Connect to Infinispan with the Hot Rod client
RemoteCacheManager cacheManager = new RemoteCacheManager(
        new ConfigurationBuilder()
                .addServer().host("localhost").port(11222)
                .security().authentication()
                .username("admin").password("password")
                .build());

// A variant of the store shown earlier, constructed directly instead of injected
ChatMemoryStore memoryStore = new InfinispanChatMemoryStore(cacheManager);

ChatMemory chatMemory = MessageWindowChatMemory.builder()
        .id("user-session-123")
        .maxMessages(20)
        .chatMemoryStore(memoryStore)
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .chatMemory(chatMemory)
        .build();

assistant.chat("Hi, I need help with my order");
assistant.chat("The order number is 12345");
// The assistant remembers the previous message
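The maxMessages(20) setting bounds how much history is replayed to the model. The window semantics can be sketched in plain Java; WindowTrim and trimToWindow are illustrative names, not LangChain4j internals:

```java
import java.util.List;

// Illustrative only: a message-window memory trims history to the most
// recent maxMessages entries before persisting it to the store.
public class WindowTrim {
    static List<String> trimToWindow(List<String> messages, int maxMessages) {
        if (messages.size() <= maxMessages) {
            return messages;
        }
        // Keep the tail of the list: the newest maxMessages entries.
        return List.copyOf(messages.subList(messages.size() - maxMessages, messages.size()));
    }
}
```

Older messages are simply dropped from the window; they remain in Infinispan until the entry's TTL expires.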
LangChain Python
Use Infinispan via the RESP (Redis) protocol with LangChain’s Redis-based memory:
pip install langchain-community redis
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory

# Connect to Infinispan's RESP endpoint
message_history = RedisChatMessageHistory(
    session_id="user-session-123",
    url="redis://localhost:6379"
)

memory = ConversationBufferMemory(
    chat_memory=message_history,
    return_messages=True
)

# Use in a conversation chain
chain = ConversationChain(
    llm=llm,
    memory=memory
)

chain.predict(input="Hi, I need help with my order")
chain.predict(input="The order number is 12345")
# Full conversation history is maintained in Infinispan
Start Infinispan with RESP support:
docker run -p 11222:11222 -p 6379:6379 quay.io/infinispan/server -c infinispan-resp.xml
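Once messages flow, you can inspect what is stored through the same RESP endpoint; RedisChatMessageHistory writes under a message_store: key prefix by default:

```
redis-cli -h localhost -p 6379 KEYS 'message_store:*'
```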
Cache configuration for conversations
Create a cache optimized for conversation storage with TTL and memory limits:
{
  "distributed-cache": {
    "encoding": {
      "media-type": "application/x-protostream"
    },
    "expiration": {
      "lifespan": "86400000"
    },
    "memory": {
      "max-count": "100000"
    }
  }
}
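One way to create this cache, assuming the JSON above is saved as chat-memory.json, is the Infinispan REST API:

```
curl -u admin:password -X POST -H "Content-Type: application/json" \
  --data @chat-memory.json http://localhost:11222/rest/v2/caches/chat-memory
```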
This configuration:
- Distributes conversations across cluster nodes for scalability.
- Expires entries 24 hours after their last write (lifespan); because every update rewrites the entry, inactive conversations age out automatically. Use max-idle instead to expire strictly on read/write inactivity.
- Limits the cache to 100,000 sessions to control memory usage.
Requirements
- Infinispan Server 15.0 or later.
- For the RESP protocol: start Infinispan with the infinispan-resp.xml configuration.
- An LLM provider (OpenAI, Ollama, etc.).


