Python: Multiple results per prompt (incl. streaming) #1316

Merged 11 commits into microsoft:main on Jun 9, 2023

Conversation

@awharrison-28 (Contributor) commented Jun 2, 2023

Motivation and Context

To take advantage of APIs offered by most LLMs and to be in sync with .NET SK, this PR introduces the ability to generate multiple text completions or chat completions from a single prompt.

![MultiChatCompletionStreamMulti](https://github.com/microsoft/semantic-kernel/assets/54643756/7bec03ec-0be2-40b0-b938-6ff71beac209)

Description

  • Return type hint for `complete_async` and `complete_chat_async` changed from `str` to `Union[str, List[str]]`. `Union` is the proper way to indicate a union of return types prior to Python 3.10; 3.10 supports the `|` syntax, but since the Python SK supports 3.8 and 3.9, the newer syntax was not adopted.
  • `complete_async`, `complete_stream_async`, `complete_chat_async`, and `complete_chat_stream_async` now support a `number_of_responses` settings field greater than 1. Previously only a value of 1 was supported.
  • Note: `hf_text_completion` does not support streaming multiple responses due to a limitation of `TextIteratorStreamer`; this feature requires the ability to parse distinct responses from `TextIteratorStreamer`.
  • Fixed a bug where `complete_async` was streaming single responses as 1D arrays; content is now simply a string.
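
The new return-type contract can be illustrated with a minimal stand-in (a hypothetical `complete_async_stub`, not the SK implementation): a single string when one response is requested, a list of strings otherwise.

```python
import asyncio
from typing import List, Union

# Hypothetical stand-in illustrating the Union return type described above.
async def complete_async_stub(
    prompt: str, number_of_responses: int = 1
) -> Union[str, List[str]]:
    # Pretend each completion is a numbered echo of the prompt.
    completions = [f"[{n}] {prompt}" for n in range(number_of_responses)]
    return completions[0] if number_of_responses == 1 else completions

single = asyncio.run(complete_async_stub("hello"))     # str
multi = asyncio.run(complete_async_stub("hello", 3))   # List[str]
```

Callers should branch on the returned type (or on `number_of_responses`) before iterating, since iterating a plain `str` yields individual characters.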

Example Usage

Setup

    import os

    import semantic_kernel as sk
    # Connector import paths as of this PR; adjust if the package layout has changed
    from semantic_kernel.connectors.ai import ChatRequestSettings, CompleteRequestSettings
    from semantic_kernel.connectors.ai.hugging_face import HuggingFaceTextCompletion
    from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAITextCompletion

    kernel = sk.Kernel()

    # Configure OpenAI service
    api_key, org_id = sk.openai_settings_from_dot_env()
    oai_text_service = OpenAITextCompletion("text-davinci-003", api_key, org_id)
    oai_chat_service = OpenAIChatCompletion("gpt-3.5-turbo", api_key, org_id)

    # Configure Hugging Face service
    hf_text_service = HuggingFaceTextCompletion("gpt2", task="text-generation")
  
    # Configure Prompt
    prompt = "what is the purpose of a rubber duck?"

    # Configure Request Settings
    text_request_settings_multi = CompleteRequestSettings(
        max_tokens=100,
        temperature=0.7,
        top_p=1,
        frequency_penalty=0.5,
        presence_penalty=0.8,
        number_of_responses=4
    )

    chat_request_settings_multi = ChatRequestSettings(
        max_tokens=100,
        temperature=0.7,
        top_p=1,
        frequency_penalty=0.5,
        presence_penalty=0.8,
        number_of_responses=4
    )

Text Completion (Standard)

    texts = await oai_text_service.complete_async(prompt, text_request_settings_multi)
    for i, text in enumerate(texts):
        print(f"Option {i}: {text}")

Streaming Text Completion

    multi_stream = oai_text_service.complete_stream_async(prompt, text_request_settings_multi)
    texts = [''] * text_request_settings_multi.number_of_responses
    async for update in multi_stream:
        os.system('cls' if os.name == 'nt' else 'clear')  # clear the screen for a better experience
        print("PROMPT: " + prompt)
        for i, option in enumerate(update):
            texts[i] += option
            print(f"{i}: {texts[i]}")
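
The streaming pattern above can be exercised without any service: each item yielded by the stream is a list of per-response text deltas, one entry per requested completion, which the consumer accumulates by index. The stream below is a hypothetical stand-in, not the SK connector.

```python
import asyncio
from typing import AsyncIterator, List

# Hypothetical stand-in for complete_stream_async: each yielded item is a
# list of per-response text deltas, one entry per requested completion.
async def fake_multi_stream(chunks: List[List[str]]) -> AsyncIterator[List[str]]:
    for chunk in chunks:
        yield chunk

async def collect(stream: AsyncIterator[List[str]], n: int) -> List[str]:
    texts = [""] * n
    async for deltas in stream:
        for i, delta in enumerate(deltas):
            texts[i] += delta  # accumulate each option by its index
    return texts

result = asyncio.run(collect(fake_multi_stream([["Hel", "Wor"], ["lo", "ld"]]), 2))
print(result)  # ['Hello', 'World']
```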

Chat Completion (Standard)

    texts = await oai_chat_service.complete_chat_async([("user", prompt)], chat_request_settings_multi)
    for i, text in enumerate(texts):
        print(f"Option {i}: {text}")

Streaming Chat Completion

    multi_stream = oai_chat_service.complete_chat_stream_async([("user", prompt)], chat_request_settings_multi)
    texts = [''] * chat_request_settings_multi.number_of_responses
    async for update in multi_stream:
        os.system('cls' if os.name == 'nt' else 'clear')  # clear the screen for a better experience
        print("PROMPT: " + prompt)
        for i, option in enumerate(update):
            texts[i] += option
            print(f"{i}: {texts[i]}")

HuggingFace Standard Completion

    texts = await hf_text_service.complete_async(prompt, text_request_settings_multi)
    for i, text in enumerate(texts):
        print("-----------------------------------")
        print(f"Option {i}: {text}")

Contribution Checklist


awharrison-28 and others added 6 commits May 30, 2023 11:54
@github-actions github-actions bot added the python Pull requests for the Python Semantic Kernel label Jun 2, 2023
awharrison-28 and others added 5 commits June 2, 2023 10:54

@dluc dluc merged commit b2e1548 into microsoft:main Jun 9, 2023
@dluc dluc mentioned this pull request Jun 12, 2023
dluc added a commit that referenced this pull request Jun 12, 2023

* [e4781e5] Add sample notebook to demo weaviate memory store (#1359)
* [dae1c16] Python: Added examples of using ChatCompletion models for
skill building in Jupyter Notebooks (#1242)
* [f4e92eb] fix: Add Azure OpenAI support for
python/08-native-function-inline (#1365)
* [de74668] Fixing typos (#1377)
* [67aa732] Python: Fix weaviate integration tests (#1422)
* [f60d7ba] Fix functions_view.py (#1213)
* [b2e1548] Python: Multiple results per prompt (incl. streaming)
(#1316)
* [4c4670a] Using dotenv instead of parsing keys ourselves (#1295)
* [05d9e72] Python: Sync pyproject.toml with requirements.txt (#1150)
* [6cbea85] Python: Add additional_metadata field to MemoryRecord and
address TODOs in ChromaMemoryStore (#1323)
* [8947e68] Weaviate: Fix to be compatible with python 3.8 (#1349)
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request Jul 6, 2023
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request Jul 6, 2023