feat(mistralai): surface citation metadata from chat responses #37008

Open
JulienRabault wants to merge 1 commit into langchain-ai:master from JulienRabault:feat/mistral-citation-metadata-36427

Conversation

@JulienRabault JulienRabault commented Apr 25, 2026

Fixes #36427

Changes

  • Extract Mistral reference content chunks into AIMessage.response_metadata["citations"].
  • Preserve non-streaming AIMessage.content as a string when Mistral returns typed text / reference chunks.
  • Extract reference chunks from streaming delta.content into chunk response_metadata["citations"].
  • Keep streaming content unchanged; no content_blocks conversion or citation offset calculation is added.

Why

Mistral citation responses can return content as typed chunks, including reference chunks with reference_ids.

Before this change, ChatMistralAI kept only message content and dropped the citation metadata. RAG users could not map Mistral answers back to source documents through LangChain.

References:

Review notes

  • The raw Mistral reference chunks are stored in response_metadata["citations"].
  • This PR does not convert citations into v1 content_blocks annotations.
  • This PR does not compute citation offsets.
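
For reviewers who want to see how a RAG caller might consume the raw chunks, here is a hedged sketch. The function name `cited_sources` is hypothetical, and it assumes that each `reference_ids` entry is a zero-based index into the list of documents the caller sent to the model; that mapping is an assumption for illustration and should be checked against the Mistral API behaviour.

```python
from typing import Any


def cited_sources(
    response_metadata: dict[str, Any], documents: list[Any]
) -> list[Any]:
    """Return the documents referenced by raw reference chunks, in first-seen order.

    Assumption: reference_ids are zero-based indices into `documents`.
    """
    seen: set[int] = set()
    sources: list[Any] = []
    for chunk in response_metadata.get("citations", []):
        for ref_id in chunk.get("reference_ids", []):
            # Skip duplicates and ids that do not match a known document.
            if ref_id in seen or not 0 <= ref_id < len(documents):
                continue
            seen.add(ref_id)
            sources.append(documents[ref_id])
    return sources
```

Because no offsets are stored, this only tells the caller which documents were cited, not which span of the answer each one supports.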

AI assistance disclosure: AI tooling helped prepare this change; the submitted diff was reviewed and tested before submission.

@github-actions github-actions Bot added the labels feature, integration, mistralai, size: S (50-199 LOC) Apr 25, 2026

@JulienRabault JulienRabault force-pushed the feat/mistral-citation-metadata-36427 branch from f0f0bd7 to c779973 Compare April 25, 2026 23:01
Collaborator

@ccurme ccurme (ccurme) left a comment

The ideal format for citations is:

  • Keep the Mistral reference blocks in .content (untyped).
  • Add translation to and from the standard LangChain TextContentBlock in langchain_mistralai/_compat.py. This will translate the reference blocks into a text content block with an Annotation.

If it's possible to do this in a non-breaking way (e.g., if users need to opt into receiving references), this is preferred. It might be OK to make this change in a 1.2 release.

Otherwise, we can add the references to additional_kwargs (which is a more typical place for provider message data than response_metadata). The drawback is that we lose the position of each reference relative to the text blocks.
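
To make the sequencing point concrete, here is a rough sketch of the kind of translation described above. It is not the contents of `_compat.py`: the function name `to_standard_blocks` is hypothetical, plain dicts stand in for LangChain's typed blocks, and the `"citation"` annotation shape with an `extras` key is an assumption about the standard format. Each reference is attached to the text block that precedes it, and no start or end offsets are set.

```python
from typing import Any


def to_standard_blocks(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Attach each reference chunk to the text block that precedes it.

    Hypothetical sketch; block and annotation shapes are assumptions.
    """
    blocks: list[dict[str, Any]] = []
    for chunk in content:
        if chunk.get("type") == "text":
            blocks.append({"type": "text", "text": chunk.get("text", "")})
        elif chunk.get("type") == "reference":
            annotation = {
                "type": "citation",
                # Provider-native ids are kept as-is; no offsets are invented.
                "extras": {"reference_ids": list(chunk.get("reference_ids", []))},
            }
            if not blocks:
                # A reference before any text gets an empty text block to hang on.
                blocks.append({"type": "text", "text": ""})
            blocks[-1].setdefault("annotations", []).append(annotation)
    return blocks
```

With the references kept in `.content`, the order of text and reference chunks survives the translation; a flat list in `additional_kwargs` would not carry that ordering.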

@ccurme ccurme (ccurme) self-assigned this Apr 26, 2026
@JulienRabault
Author

Thanks, that makes sense.

I held off on converting the Mistral reference blocks into annotations because I don't see text offsets in the provider response, and I don't want to invent positions.

I'll update the PR to preserve the native reference blocks in .content and add the _compat.py translation to standard TextContentBlock / Annotation.

If keeping those blocks in .content is too breaking for this release, I can move the provider-native data to additional_kwargs instead. Do you want me to take that fallback path here?

@JulienRabault
Author

Any news?

Labels

external, feature, integration, mistralai, new-contributor, size: S (50-199 LOC)

Development

Successfully merging this pull request may close these issues.

ChatMistralAI: citation metadata from Mistral API response is silently dropped

2 participants