Instructor's `response_model` doesn't typecheck #260

gkossakowski · 2023-12-07T22:27:20Z

Describe the bug
Instructor's patch of openai's completions.create doesn't typecheck when used with response_model.

To Reproduce
Use the sample code from the README and open a project in VSCode with pylance installed.

Expected behavior
The code type-checks.

Screenshots

The text was updated successfully, but these errors were encountered:

gkossakowski · 2023-12-07T22:29:07Z

The error is so basic that I wonder if instructor doesn't support typechecking. On the other hand, it does advertise IDE completion support thanks to being based on Pydantic.

jxnl · 2023-12-07T22:32:14Z

The error is so basic that I wonder if instructor doesn't support typechecking. On the other hand, it does advertise IDE completion support thanks to being based on Pydantic.

yeah usually I just do user: UserDetail =

or set the return type of a function def f() -> UserDetail and use the type of the function.

I should try harder to see if generics work better. Will work on this

gkossakowski · 2023-12-07T22:35:29Z

I don't think the culprit is with the inference of the type of user variable. The problem is with the type of create method not having response_model as the argument.

jxnl · 2023-12-07T22:40:26Z

ah got it, yeah i do a patch instead of subclassing the entire client, do you have any suggestions? I should probs spend some time on this when i have time.

gkossakowski · 2023-12-07T22:55:40Z

I'm not an expert on Python typing but I'd start fixing this issue:

You see that the inferred type from patch is a union of sync and async client. You don't want a union but you want the type to be dependent on streaming argument which determines which of the two variants in the union should be picked.

For expressing the extension of create method, I asked ChatGPT: https://chat.openai.com/share/6a69c8e7-29de-4b56-9ce2-211dc046fa5c

Based on my type-checking experience from my previous life, ChatGPT's idea check out at first look.

gkossakowski · 2023-12-07T22:56:39Z

By the way, are you not using type-checking yourself?

jxnl · 2023-12-08T05:58:29Z

i dont use the strictest mode, this is super helpful I'll see if someone can help out with this

gao-hongnan · 2023-12-09T05:48:56Z

i dont use the strictest mode, this is super helpful I'll see if someone can help out with this

Not an expert in typing theory as well (need to revise on concepts such as covariant, contravariant). But I do attempt to adopt slightly strict static type checks in python based projects.

I take reference of the strictness from OpenAI, and PyTorch and use mypy as the static type checker.

I can raise a PR to implement pre-commit hooks (or make/scripts) for local static type checking, and continuous-integration checks in the containerized environment in GitHub Actions (both local and github actions should refer to the same set of mypy configurations unless there's a strong reason to deviate). I think the mypy configuration can adopt a gradual implementation where one can start applying the type check on 1-2 core files, then slowly open up to the rest of the code base.

Sample mypy command (with config-file pointing to .mypy.ini):

$ mypy --no-pretty \
     --color-output \
     --strict \
     --python-version=3.9 \
     --config-file=.mypy.ini \
     instructor/patch.py

yields the following sample snippet of the result of using mypy on strict mode (command above):

instructor/patch.py:53: error: Unsupported left operand type for + ("None")  [operator]
instructor/patch.py:53: error: No overload variant of "__add__" of "list" matches argument type "str"  [operator]
instructor/patch.py:53: note: Possible overload variants:
instructor/patch.py:53: note:     def __add__(self, list[Union[ChatCompletionContentPartTextParam, ChatCompletionContentPartImageParam]], /) -> list[Union[ChatCompletionContentPartTextParam, ChatCompletionContentPartImageParam]]
instructor/patch.py:53: note:     def [_S] __add__(self, list[_S], /) -> list[Union[_S, ChatCompletionContentPartTextParam, ChatCompletionContentPartImageParam]]
instructor/patch.py:53: note: Left operand is of type "Union[str, None, list[Union[ChatCompletionContentPartTextParam, ChatCompletionContentPartImageParam]]]"

barapa · 2024-01-11T21:35:22Z

For the sync client, I have written a simple wrapper class around the patched client that allows the rest of my code to avoid type issues.

from typing import TypeVar, Generic, Type
import instructor

from openai import OpenAI

# Define a generic type for the response model
T = TypeVar("T")


class StructuredOpenAI(Generic[T]):
    """A wrapper class around instructor's patched OpenAI client.

    This simple wrapper allows us to avoid type errors when using Instructor's patched OpenAI client.

    """

    def __init__(self, openai_client: OpenAI):
        self._client = instructor.patch(openai_client)

    def create(
        self,
        response_model: Type[T],
        max_retries: int = 1,
        validation_context=None,
        *args,
        **kwargs,
    ) -> T:
        return self._client.chat.completions.create(  # type: ignore
            response_model=response_model,
            validation_context=validation_context,
            max_retries=max_retries,
            *args,
            **kwargs,
        )

gkossakowski · 2024-01-11T21:45:28Z

How do you use this class? Doesn't it make sense to patch the upstream, i.e. instructor itself with similar code structure?

jxnl · 2024-01-12T01:33:04Z

i don't hate it.

savarin · 2024-01-31T19:00:33Z

Hi everyone! I'm new and thought it would be interesting to contribute by adding types (I've only had one open source contribution and it's adding types to urllib3).

Reviewing the dependency graph I thought it would make sense to start with instructor.exceptions and instructor.function_calls.

The remaining errors I'm seeing relate to BaseModel having type Any and function_calls.OpenAISchema.openai_schema being both a property and a class method.

> mypy --python-version=3.9 --color-output --no-pretty --follow-imports=skip instructor/cli/cli.py instructor/cli/usage.py instructor/exceptions.py instructor/function_calls.py
instructor/function_calls.py:36: error: Class cannot subclass "BaseModel" (has type "Any")  [misc]
instructor/function_calls.py:77: error: Only instance methods can be decorated with @property  [misc]
instructor/function_calls.py:158: error: Value of type "Callable[[], dict[str, Any]]" is not indexable  [index]
instructor/function_calls.py:171: error: Value of type "Callable[[], dict[str, Any]]" is not indexable  [index]
instructor/function_calls.py:223: error: Value of type "Callable[[], dict[str, Any]]" is not indexable  [index]
instructor/function_calls.py:236: error: Value of type "Callable[[], dict[str, Any]]" is not indexable  [index]

I have a draft PR here that I'm still working on, just wondering more broadly if these changes are helpful and if not, is there another area that makes more sense to add types to? Do we even want to add types? Thanks!

gkossakowski · 2024-01-31T19:06:18Z

Hi and welcome! If your changes would make the original error I reported go away, it would be hugely helpful.

savarin · 2024-01-31T19:27:00Z

Well instructor.patch looks like a couple more levels down, I do appreciate the welcome 😉.

Taking a closer look with ChatGPT it does seem the double decorator provides the intended behavior when the openai_schema is used a decorator. Fascinating.

I'll leave this open for @jxnl to opine, I'm not as familiar with the test coverage to know if something would break somewhere else. Adding type ignores seems like the way forward here, though it does feel like cheating (for now).

With @classmethod @property:

When you use both @classmethod and @property on openai_schema, it seems to behave like a class-level property. This is not standard Python behavior, but it appears to be working due to the interaction with the @openai_schema decorator. In this case, Dataframe.openai_schema directly gives you the dictionary, which is why type(Dataframe.openai_schema) returns dict.

...

Decorator Interaction:

The @openai_schema decorator you're applying to the Dataframe class is likely what's enabling this unusual behavior. Decorators in Python can be very powerful and can modify class behavior in non-standard ways. It seems that this decorator is handling the combination of @classmethod and @property in a specific manner to achieve the desired functionality.

Given this understanding, it appears that in your specific case, using both @classmethod and @property is necessary to achieve the desired behavior, due to the way the @openai_schema decorator is implemented. This is a unique situation and not typical of standard Python class behavior. It's important to note that this kind of usage might be confusing to other Python developers who expect @property and @classmethod to have their standard meanings.

savarin · 2024-01-31T19:43:38Z

Ach so #372. Generics FTW

jxnl · 2024-01-31T21:39:43Z

@savarin happy to take contribs into types

quickpanda · 2024-02-01T09:54:21Z

Reviewing the dependency graph I thought it would make sense to start with instructor.exceptions and instructor.function_calls.

Hi, just saw this thread. While this might be slightly tangential to the main issue, I'm interested in how you created this dependency graph. Could you share the method or tools you used for this?
@savarin

savarin · 2024-02-01T14:41:23Z

@quickpanda Sure I used pydeps (pip install pydeps, then pydeps instructor).

savarin · 2024-02-03T10:06:28Z

New types umbrella issue here #390

jxnl closed this as completed Jun 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Instructor's `response_model` doesn't typecheck #260

Instructor's `response_model` doesn't typecheck #260

gkossakowski commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 8, 2023

gao-hongnan commented Dec 9, 2023 •

edited

barapa commented Jan 11, 2024

gkossakowski commented Jan 11, 2024

jxnl commented Jan 12, 2024

savarin commented Jan 31, 2024

gkossakowski commented Jan 31, 2024 •

edited

savarin commented Jan 31, 2024 •

edited

savarin commented Jan 31, 2024 •

edited

jxnl commented Jan 31, 2024

quickpanda commented Feb 1, 2024

savarin commented Feb 1, 2024

savarin commented Feb 3, 2024

Instructor's response_model doesn't typecheck #260

Instructor's response_model doesn't typecheck #260

Comments

gkossakowski commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

gkossakowski commented Dec 7, 2023

jxnl commented Dec 8, 2023

gao-hongnan commented Dec 9, 2023 • edited

barapa commented Jan 11, 2024

gkossakowski commented Jan 11, 2024

jxnl commented Jan 12, 2024

savarin commented Jan 31, 2024

gkossakowski commented Jan 31, 2024 • edited

savarin commented Jan 31, 2024 • edited

savarin commented Jan 31, 2024 • edited

jxnl commented Jan 31, 2024

quickpanda commented Feb 1, 2024

savarin commented Feb 1, 2024

savarin commented Feb 3, 2024

Instructor's `response_model` doesn't typecheck #260

Instructor's `response_model` doesn't typecheck #260

gao-hongnan commented Dec 9, 2023 •

edited

gkossakowski commented Jan 31, 2024 •

edited

savarin commented Jan 31, 2024 •

edited

savarin commented Jan 31, 2024 •

edited