Python: Multiple results per prompt (incl. streaming) #1316

Merged 11 commits into microsoft:main on Jun 9, 2023

Conversation

@awharrison-28 (Contributor) commented Jun 2, 2023

Motivation and Context

To take advantage of APIs offered by most LLMs and to be in sync with .NET SK, this PR introduces the ability to generate multiple text completions or chat completions from a single prompt.

![MultiChatCompletionStreamMulti](https://github.com/microsoft/semantic-kernel/assets/54643756/7bec03ec-0be2-40b0-b938-6ff71beac209)

Description

  • Return type hint for `complete_async` and `complete_chat_async` changed from `str` to `Union[str, List[str]]`. `Union` is the proper way to indicate a union of return types prior to Python 3.10; 3.10 supports the `|` syntax, but since the Python SK supports 3.8 and 3.9, the newer syntax was not adopted.
  • `complete_async`, `complete_stream_async`, `complete_chat_async`, and `complete_chat_stream_async` now support a `number_of_responses` settings field greater than 1. Previously only a value of 1 was supported.
  • Note: `hf_text_completion` does not support streaming multiple responses due to a limitation of `TextIteratorStreamer`; this feature requires the ability to parse distinct responses from `TextIteratorStreamer`.
  • Fixed a bug where `complete_async` was streaming single responses as 1D arrays; content is now simply a string.
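
The new return-type contract can be illustrated with a minimal stand-in (a hypothetical `complete_async_stub`, not the SK implementation): a single string when one response is requested, a list of strings otherwise.

```python
import asyncio
from typing import List, Union

# Hypothetical stand-in illustrating the Union return type described above.
async def complete_async_stub(
    prompt: str, number_of_responses: int = 1
) -> Union[str, List[str]]:
    # Pretend each completion is a numbered echo of the prompt.
    completions = [f"[{n}] {prompt}" for n in range(number_of_responses)]
    return completions[0] if number_of_responses == 1 else completions

single = asyncio.run(complete_async_stub("hello"))     # str
multi = asyncio.run(complete_async_stub("hello", 3))   # List[str]
```

Callers should branch on the returned type (or on `number_of_responses`) before iterating, since iterating a plain `str` yields individual characters.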

Example Usage

Setup

    import os

    import semantic_kernel as sk
    # Connector import paths as of this PR; adjust if the package layout has changed
    from semantic_kernel.connectors.ai import ChatRequestSettings, CompleteRequestSettings
    from semantic_kernel.connectors.ai.hugging_face import HuggingFaceTextCompletion
    from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAITextCompletion

    kernel = sk.Kernel()

    # Configure OpenAI service
    api_key, org_id = sk.openai_settings_from_dot_env()
    oai_text_service = OpenAITextCompletion("text-davinci-003", api_key, org_id)
    oai_chat_service = OpenAIChatCompletion("gpt-3.5-turbo", api_key, org_id)

    # Configure Hugging Face service
    hf_text_service = HuggingFaceTextCompletion("gpt2", task="text-generation")
  
    # Configure Prompt
    prompt = "what is the purpose of a rubber duck?"

    # Configure Request Settings
    text_request_settings_multi = CompleteRequestSettings(
        max_tokens=100,
        temperature=0.7,
        top_p=1,
        frequency_penalty=0.5,
        presence_penalty=0.8,
        number_of_responses=4
    )

    chat_request_settings_multi = ChatRequestSettings(
        max_tokens=100,
        temperature=0.7,
        top_p=1,
        frequency_penalty=0.5,
        presence_penalty=0.8,
        number_of_responses=4
    )

Text Completion (Standard)

    texts = await oai_text_service.complete_async(prompt, text_request_settings_multi)
    for i, text in enumerate(texts):
        print(f"Option {i}: {text}")

Streaming Text Completion

    multi_stream = oai_text_service.complete_stream_async(prompt, text_request_settings_multi)
    texts = [''] * text_request_settings_multi.number_of_responses
    async for update in multi_stream:
        os.system('cls' if os.name == 'nt' else 'clear')  # clear the screen for a better experience
        print("PROMPT: " + prompt)
        for i, option in enumerate(update):
            texts[i] += option
            print(f"{i}: {texts[i]}")
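
The streaming pattern above can be exercised without any service: each item yielded by the stream is a list of per-response text deltas, one entry per requested completion, which the consumer accumulates by index. The stream below is a hypothetical stand-in, not the SK connector.

```python
import asyncio
from typing import AsyncIterator, List

# Hypothetical stand-in for complete_stream_async: each yielded item is a
# list of per-response text deltas, one entry per requested completion.
async def fake_multi_stream(chunks: List[List[str]]) -> AsyncIterator[List[str]]:
    for chunk in chunks:
        yield chunk

async def collect(stream: AsyncIterator[List[str]], n: int) -> List[str]:
    texts = [""] * n
    async for deltas in stream:
        for i, delta in enumerate(deltas):
            texts[i] += delta  # accumulate each option by its index
    return texts

result = asyncio.run(collect(fake_multi_stream([["Hel", "Wor"], ["lo", "ld"]]), 2))
print(result)  # ['Hello', 'World']
```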

Chat Completion (Standard)

    texts = await oai_chat_service.complete_chat_async([("user", prompt)], chat_request_settings_multi)
    for i, text in enumerate(texts):
        print(f"Option {i}: {text}")

Streaming Chat Completion

    multi_stream = oai_chat_service.complete_chat_stream_async([("user", prompt)], chat_request_settings_multi)
    texts = [''] * chat_request_settings_multi.number_of_responses
    async for update in multi_stream:
        os.system('cls' if os.name == 'nt' else 'clear')  # clear the screen for a better experience
        print("PROMPT: " + prompt)
        for i, option in enumerate(update):
            texts[i] += option
            print(f"{i}: {texts[i]}")

HuggingFace Standard Completion

    texts = await hf_text_service.complete_async(prompt, text_request_settings_multi)
    for i, text in enumerate(texts):
        print("-----------------------------------")
        print(f"Option {i}: {text}")

Contribution Checklist


awharrison-28 and others added 6 commits May 30, 2023 11:54
@github-actions github-actions bot added the python Pull requests for the Python Semantic Kernel label Jun 2, 2023
awharrison-28 and others added 5 commits June 2, 2023 10:54

@dluc dluc merged commit b2e1548 into microsoft:main Jun 9, 2023
@dluc dluc mentioned this pull request Jun 12, 2023
dluc added a commit that referenced this pull request Jun 12, 2023

* [e4781e5] Add sample notebook to demo weaviate memory store (#1359)
* [dae1c16] Python: Added examples of using ChatCompletion models for
skill building in Jupyter Notebooks (#1242)
* [f4e92eb] fix: Add Azure OpenAI support for
python/08-native-function-inline (#1365)
* [de74668] Fixing typos (#1377)
* [67aa732] Python: Fix weaviate integration tests (#1422)
* [f60d7ba] Fix functions_view.py (#1213)
* [b2e1548] Python: Multiple results per prompt (incl. streaming)
(#1316)
* [4c4670a] Using dotenv instead of parsing keys ourselves (#1295)
* [05d9e72] Python: Sync pyproject.toml with requirements.txt (#1150)
* [6cbea85] Python: Add additional_metadata field to MemoryRecord and
address TODOs in ChromaMemoryStore (#1323)
* [8947e68] Weaviate: Fix to be compatible with python 3.8 (#1349)
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request Jul 6, 2023
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request Jul 6, 2023