community[patch]: invoke callback prior to yielding token (llama.cpp) (langchain-ai#19392)

**Description:** Invoke callback prior to yielding token for llama.cpp
**Issue:** [Callback for on_llm_new_token should be invoked before the token is yielded by the model (langchain-ai#16913)](langchain-ai#16913)
**Dependencies:** None
sepiatone authored and Dave Bechberger committed Mar 29, 2024
1 parent 5a8cce9 commit 79b89b4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion libs/community/langchain_community/llms/llamacpp.py
@@ -344,11 +344,11 @@ def _stream(
                 text=part["choices"][0]["text"],
                 generation_info={"logprobs": logprobs},
             )
-            yield chunk
             if run_manager:
                 run_manager.on_llm_new_token(
                     token=chunk.text, verbose=self.verbose, log_probs=logprobs
                 )
+            yield chunk
 
     def get_num_tokens(self, text: str) -> int:
         tokenized_text = self.client.tokenize(text.encode("utf-8"))
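The one-line move guarantees that `on_llm_new_token` fires before the chunk reaches the generator's consumer, so callback handlers observe each token no later than downstream code does. A minimal sketch of that ordering (simplified and illustrative only; `stream` and the handler below are not the actual LangChain API):

```python
# Sketch of the callback-before-yield ordering this commit enforces
# in LlamaCpp._stream. Names here are hypothetical stand-ins.

events = []

def on_llm_new_token(token: str) -> None:
    # A handler (e.g. one forwarding tokens to a UI) records the token
    # before the generator's consumer sees it.
    events.append(("callback", token))

def stream(tokens):
    for token in tokens:
        on_llm_new_token(token)  # callback first ...
        yield token              # ... then yield the chunk

for token in stream(["Hello", " world"]):
    events.append(("yielded", token))

# Each ("callback", t) event precedes its ("yielded", t) counterpart.
print(events)
```

Had the yield come first, a consumer could act on a token (or break out of the loop) before any handler was notified of it, which is the bug the linked issue describes.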