community[minor]: S3FileLoader to expose mode and post_processors arguments of unstructured loader #19270

Merged: 7 commits, Mar 25, 2024
10 changes: 8 additions & 2 deletions libs/community/langchain_community/document_loaders/s3_file.py
@@ -2,7 +2,7 @@

 import os
 import tempfile
-from typing import TYPE_CHECKING, List, Optional, Union
+from typing import TYPE_CHECKING, Callable, List, Optional, Union

 from langchain_community.document_loaders.unstructured import UnstructuredBaseLoader

@@ -27,6 +27,8 @@ def __init__(
         aws_secret_access_key: Optional[str] = None,
         aws_session_token: Optional[str] = None,
         boto_config: Optional[botocore.client.Config] = None,
+        mode: str = "single",
+        post_processors: Optional[List[Callable]] = None,
     ):
         """Initialize with bucket and key name.
@@ -82,8 +84,12 @@ def __init__(
             object is set on the session, the config object used when creating
             the client will be the result of calling ``merge()`` on the
             default config with the config provided to this call.
+        :param mode: Mode in which to read the file. Valid options are: single,
+            paged and elements
+        :param post_processors: Post processing functions to be applied to
+            extracted elements
         """
-        super().__init__()
+        super().__init__(mode, post_processors)
         self.bucket = bucket
         self.key = key
         self.region_name = region_name
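The pattern in this change, a subclass accepting `mode` and `post_processors` and forwarding them to its base loader, can be sketched without any AWS or unstructured dependencies. The classes below (`SketchBaseLoader`, `SketchS3Loader`) and the helper `collapse_whitespace` are hypothetical stand-ins written for illustration; they are not the real `UnstructuredBaseLoader` or `S3FileLoader`, and the string-based post-processing is a simplifying assumption about how extracted elements get cleaned.

```python
from typing import Callable, List, Optional

# Modes named in the new docstring: single, paged and elements.
_VALID_MODES = {"single", "paged", "elements"}


class SketchBaseLoader:
    """Hypothetical stand-in for the base loader (not the real class)."""

    def __init__(
        self,
        mode: str = "single",
        post_processors: Optional[List[Callable]] = None,
    ):
        if mode not in _VALID_MODES:
            raise ValueError(
                f"Got {mode!r} for `mode`, expected one of {sorted(_VALID_MODES)}"
            )
        self.mode = mode
        self.post_processors = post_processors or []

    def _post_process(self, texts: List[str]) -> List[str]:
        # Assumption: each post-processor maps one element's text to cleaned text.
        for fn in self.post_processors:
            texts = [fn(text) for text in texts]
        return texts


class SketchS3Loader(SketchBaseLoader):
    """Hypothetical stand-in for the S3 loader; forwards the new arguments."""

    def __init__(
        self,
        bucket: str,
        key: str,
        mode: str = "single",
        post_processors: Optional[List[Callable]] = None,
    ):
        # Before the change the base initializer was called with no arguments,
        # so callers had no way to choose a mode or supply post-processors.
        super().__init__(mode, post_processors)
        self.bucket = bucket
        self.key = key


def collapse_whitespace(text: str) -> str:
    """Illustrative post-processor: squeeze runs of whitespace to one space."""
    return " ".join(text.split())


loader = SketchS3Loader(
    "example-bucket",          # hypothetical bucket name
    "docs/report.pdf",         # hypothetical object key
    mode="elements",
    post_processors=[collapse_whitespace],
)
cleaned = loader._post_process(["  Hello   world ", "second\n\nelement"])
```

Because both new parameters default to the previous behaviour (`"single"` mode, no post-processors), existing callers that pass only a bucket and key are unaffected.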