experimental: LLMGraphTransformer add missing conditional adding restrictions to prompts for LLM that do not support function calling #22793

jordyantunes · 2024-06-12T01:25:32Z

Description: This PR modifies the prompt created by create_unstructured_prompt (which is called for LLMs that do not support function calling) by adding conditional checks, so that the restrictions on entity types and rel_types are added to the prompt only when they apply. With a sufficiently large input text, the current prompt may fail to produce results with some LLMs. I first saw this issue when I implemented a custom LLM class that did not support function calling and used Gemini 1.5 Pro, but I was able to reproduce it with OpenAI models.

To reproduce, load a sufficiently large text:

from langchain_openai import ChatOpenAI, OpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

with open("texto-longo.txt", "r") as file:
    full_text = file.read()
    partial_text = full_text[:4000]

documents = [Document(page_content=partial_text)] # cropped to fit GPT 3.5 context window

Then use the chat class (which supports function calling):

chat_openai = ChatOpenAI(model="gpt-3.5-turbo", model_kwargs={"seed": 42})
chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)

It works:

>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy of Man's Desiring", type='Music'), Node(id='Godel', type='Person'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='clever way of encoding the complicated expressions as numbers', type='Concept')]

But if you use the non-chat LLM class (which does not support function calling):

openai = OpenAI(
    model="gpt-3.5-turbo-instruct",
    max_tokens=1000,
)
gpt35_transformer = LLMGraphTransformer(llm=openai)
graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)

It uses the affected prompt and sometimes produces no result at all:

>>> print(graph_from_gpt35[0].nodes)
[]

After implementing the changes, I was able to use both classes more consistently:

>>> chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
>>> graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy Of Man'S Desiring", type='Music'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='Godel', type='Person')]
>>> gpt35_transformer = LLMGraphTransformer(llm=openai)
>>> graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_gpt35[0].nodes)
[Node(id='I', type='Pronoun'), Node(id="JESU, JOY OF MAN'S DESIRING", type='Song'), Node(id='larger memory', type='Memory'), Node(id='this nice tree structure', type='Structure'), Node(id='how you can do it all with the numbers', type='Process'), Node(id='JOHANN SEBASTIAN BACH', type='Composer'), Node(id='type of structure', type='Characteristic'), Node(id='that', type='Pronoun'), Node(id='we', type='Pronoun'), Node(id='worry', type='Verb')]

The results are still somewhat inconsistent because the GPT 3.5 model may produce incomplete JSON when it hits the token limit, but that could be solved (or mitigated) by checking for complete JSON when parsing the output.
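As a rough illustration of the completeness check mentioned above, here is a minimal sketch that keeps only the fully formed items of a possibly truncated JSON array. The function name parse_complete_items is hypothetical and is not part of LLMGraphTransformer; the sketch also assumes the model returns a top-level JSON array of objects, which may not hold for every LLM.

```python
import json
from typing import Any, List


def parse_complete_items(raw: str) -> List[Any]:
    """Return the fully formed top-level items of a possibly truncated JSON array.

    Hypothetical helper for illustration only; assumes the output is a JSON
    array of objects.
    """
    # Fast path: the output is already valid JSON.
    try:
        parsed = json.loads(raw)
        return parsed if isinstance(parsed, list) else [parsed]
    except json.JSONDecodeError:
        pass

    decoder = json.JSONDecoder()
    items: List[Any] = []
    start = raw.find("[")
    if start == -1:
        return items  # no array found, nothing to salvage

    pos = start + 1
    while pos < len(raw):
        # Skip whitespace and commas between items.
        while pos < len(raw) and raw[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(raw) or raw[pos] == "]":
            break
        try:
            # raw_decode returns the parsed value and the index where it ended.
            item, pos = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            break  # the remaining text is an incomplete item, so stop here
        items.append(item)
    return items
```

Calling parse_complete_items on output that was cut off in the middle of an object returns the objects that were completed before the cut, instead of failing on the whole response.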

baskaryan · 2024-06-14T02:58:27Z

cc @tomasonjo

tomasonjo · 2024-06-14T03:55:38Z

I don't really understand what was changed. New prompt seems identical to the previous one?

jordyantunes · 2024-06-15T17:33:56Z

The change is that the following sentences are now included only if the variables node_labels and rel_types aren't None.

# ENTITY TYPES:
Use the following relation types, don't use other relation that is not defined below:
{node_labels}

 # RELATION TYPES:
Below are a number of examples of text and their extracted entities and relationships:
{rel_types}

The current version includes these restrictions even if the user does not provide node_labels or rel_types. The resulting prompt tells the model not to use anything outside the listed types while listing none, which causes some LLMs to return nothing.
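To make the idea concrete, here is a minimal sketch of the conditional assembly described above. It is not the code from this PR: the function name build_instruction_text and the prompt wording are assumptions for illustration, and only the node_labels and rel_types names come from the discussion.

```python
from typing import List, Optional


def build_instruction_text(
    node_labels: Optional[List[str]] = None,
    rel_types: Optional[List[str]] = None,
) -> str:
    """Assemble prompt instructions, adding restrictions only when types are given.

    Hypothetical helper; the wording below is illustrative, not the real prompt.
    """
    parts = ["Extract entities and relations from the provided text."]

    # Restrict entity types only if the caller actually supplied some.
    if node_labels:
        parts.append(
            "# ENTITY TYPES:\n"
            "Use only the following entity types:\n" + ", ".join(node_labels)
        )

    # Likewise, restrict relation types only when they are provided.
    if rel_types:
        parts.append(
            "# RELATION TYPES:\n"
            "Use only the following relation types:\n" + ", ".join(rel_types)
        )

    return "\n\n".join(parts)
```

With no arguments the result contains no restriction section at all, so the model is never told to limit itself to an empty list.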

tomasonjo · 2024-06-15T18:59:05Z

Ok, looks great. Please fix the linting errors and we can merge it in.

tomasonjo · 2024-06-29T23:47:29Z

@jordyantunes ping

jordyantunes · 2024-07-01T12:01:25Z

I'm sorry for the delay. I'll fix the linting errors today.

tomasonjo · 2024-07-01T16:24:39Z

Thanks! Ping @ccurme

jordyantunes added 2 commits June 12, 2024 00:35

fixed prompt for non-function calling graph llm

c5b53e4

Fixed formatting

64fdf46

jordyantunes marked this pull request as ready for review June 12, 2024 02:06

dosubot bot added size:M 🔌: openai 🤖:improvement labels Jun 12, 2024

added new after examples

2efa31e

tomasonjo approved these changes Jun 15, 2024

jordyantunes added 2 commits July 1, 2024 12:25

fix linting errors

f65a7a0

Merge branch 'master' into fix/llmgraphunstructured

4abdb7b

ccurme enabled auto-merge (squash) July 1, 2024 17:23

ccurme approved these changes Jul 1, 2024

dosubot bot added the lgtm label Jul 1, 2024

ccurme merged commit a50eabb into langchain-ai:master Jul 1, 2024
29 checks passed
