
Comparing changes

base repository: webrecorder/browsertrix
base: v1.14.3
head repository: webrecorder/browsertrix
compare: v1.14.4
  • 8 commits
  • 33 files changed
  • 10 contributors

Commits on Mar 7, 2025

  1. Fix nightly tests (#2460)

    Fixes #2459 
    
    - Set `/data/` as primary storage `access_endpoint_url` in nightly test chart
    - Modify nightly test GH Actions workflow to spawn a separate job per nightly test module using a dynamic matrix
    - Set configuration not to fail other jobs if one job fails
    - Modify failing tests:
      - Add fixture to background job nightly test module so it can run alone
      - Add retry loop to crawlconfig stats nightly test so it's less dependent on timing

    GitHub limits each workflow to 256 jobs, so this approach should continue to scale for us without issue.
    
    ---------
    
    Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
    tw4l and ikreymer authored Mar 7, 2025

    13bf818
  2. Add thumbnail endpoint (#2468)

    - Add a /thumbnail endpoint for collections to serve the thumbnail as an image for public collections.
    - Also fix thumbnail image uploads to use the correct MIME type, if available.
    ikreymer authored Mar 7, 2025
    6c192df
  3. set default crawler channel if not set, possible fix for #2458 (#2469)

    update default RWP version
    ikreymer authored Mar 7, 2025
    03fa00d
  4. fix: Open and highlight correct workflow form section on tab click (#2463)
    
    Fixes #2461
    
    ## Changes
    
    Opens the workflow form section when its navigation link is clicked, fixing an issue where scroll position affected unopened panels.
    SuaYoo authored Mar 7, 2025
    fa05d68
  5. Add missing "payment never made" subscription status to superadmin org list (#2457)
    emma-sg authored Mar 7, 2025
    8078f38
  6. Translations update from Hosted Weblate (#2467) (#2471)

    Translations update from [Hosted Weblate](https://hosted.weblate.org) for [Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).
    
    
    
    Current translation status:
    
    ![Weblate translation status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)
    
    ---------
    
    Co-authored-by: Weblate (bot) <hosted@weblate.org>
    Co-authored-by: Anne Paz <anelisespaz@gmail.com>
    Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
    4 people authored Mar 7, 2025
    75eb04c

Commits on Mar 8, 2025

  1. docs: add public collections gallery howto (#2462)

    - Updated the collections gallery and presentation & sharing docs pages
    - Collections gallery page content extracted from the blog post and linked back to it
    - Each page has one video covering the gallery setting and individual collection presentation
    - Cleaned up text on both pages to avoid duplicated content (thanks @DaleLore)
    
    
    
    ---------
    
    Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
    Co-authored-by: DaleLore <DaleLoreNY@gmail.com>
    3 people authored Mar 8, 2025
    00a4251
  2. version: bump to 1.14.4

    ikreymer committed Mar 8, 2025
    d8365c7
Showing with 954 additions and 179 deletions.
  1. +17 −1 .github/workflows/k3d-nightly-ci.yaml
  2. +48 −0 backend/btrixcloud/colls.py
  3. +1 −1 backend/btrixcloud/operator/crawls.py
  4. +5 −1 backend/btrixcloud/storages.py
  5. +1 −1 backend/btrixcloud/version.py
  6. +64 −0 backend/test_nightly/conftest.py
  7. +21 −13 backend/test_nightly/test_crawlconfig_crawl_stats.py
  8. +10 −6 backend/test_nightly/test_z_background_jobs.py
  9. +1 −1 chart/Chart.yaml
  10. +3 −2 chart/test/test-nightly-addons.yaml
  11. +3 −3 chart/values.yaml
  12. +3 −0 frontend/docs/docs/overrides/.icons/bootstrap/asterisk.svg
  13. +3 −0 frontend/docs/docs/overrides/.icons/bootstrap/globe2.svg
  14. +3 −0 frontend/docs/docs/overrides/.icons/bootstrap/three-dots-vertical.svg
  15. +26 −3 frontend/docs/docs/user-guide/collection.md
  16. +58 −13 frontend/docs/docs/user-guide/presentation-sharing.md
  17. +77 −0 frontend/docs/docs/user-guide/public-collections-gallery.md
  18. +1 −0 frontend/docs/mkdocs.yml
  19. +1 −1 frontend/package.json
  20. +51 −25 frontend/src/__generated__/locales/de.ts
  21. +42 −16 frontend/src/__generated__/locales/es.ts
  22. +28 −2 frontend/src/__generated__/locales/fr.ts
  23. +54 −27 frontend/src/__generated__/locales/pt.ts
  24. +11 −0 frontend/src/components/orgs-list.ts
  25. +10 −0 frontend/src/features/admin/stats.ts
  26. +66 −35 frontend/src/features/crawl-workflows/workflow-editor.ts
  27. +1 −0 frontend/src/types/billing.ts
  28. +84 −6 frontend/xliff/de.xlf
  29. +84 −6 frontend/xliff/es.xlf
  30. +84 −7 frontend/xliff/fr.xlf
  31. +88 −8 frontend/xliff/pt.xlf
  32. +1 −1 version.txt
  33. +4 −0 yarn.lock
18 changes: 17 additions & 1 deletion .github/workflows/k3d-nightly-ci.yaml
@@ -8,8 +8,24 @@ on:
workflow_dispatch:

jobs:
collect-test-modules:
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v3
- id: set-matrix
run: |
echo matrix="$(ls ./backend/test_nightly/ | grep -o "^test_.*" | jq -R -s -c 'split("\n")[:-1]')" >> $GITHUB_OUTPUT
btrix-k3d-nightly-test:
name: ${{ matrix.module }}
needs: collect-test-modules
runs-on: ubuntu-latest
strategy:
matrix:
module: ${{fromJSON(needs.collect-test-modules.outputs.matrix)}}
fail-fast: false
steps:
- name: Create k3d Cluster
uses: AbsaOSS/k3d-action@v2
@@ -82,7 +98,7 @@ jobs:
run: kubectl exec -i deployment/local-minio -c minio -- mkdir /data/replica-0

- name: Run Tests
run: pytest -vv ./backend/test_nightly/test_*.py
run: pytest -vv ./backend/test_nightly/${{ matrix.module }}

- name: Print Backend Logs (API)
if: ${{ failure() }}
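
For reference, the `set-matrix` step above builds its job matrix by listing the nightly test modules and converting them to a compact JSON array with `ls` + `grep` + `jq`. A rough Python equivalent of that transformation (only the directory path comes from the workflow; everything else is illustrative) might look like:

```python
import json
from pathlib import Path

# Rough equivalent of the shell pipeline in the set-matrix step:
#   ls ./backend/test_nightly/ | grep -o "^test_.*" | jq -R -s -c 'split("\n")[:-1]'
# It emits a compact JSON array of nightly test module names, e.g.
# ["test_a.py","test_b.py",...], which the job matrix then fans out over.
modules = sorted(
    p.name
    for p in Path("./backend/test_nightly/").iterdir()
    if p.name.startswith("test_")
)
print(f"matrix={json.dumps(modules, separators=(',', ':'))}")
```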
48 changes: 48 additions & 0 deletions backend/btrixcloud/colls.py
@@ -11,6 +11,7 @@

import asyncio
import pymongo
import aiohttp
from pymongo.collation import Collation
from fastapi import Depends, HTTPException, Response
from fastapi.responses import StreamingResponse
@@ -407,6 +408,34 @@ async def get_public_collection_out(

return PublicCollOut.from_dict(result)

async def get_public_thumbnail(
self, slug: str, org: Organization
) -> StreamingResponse:
"""return thumbnail of public collection, if any"""
result = await self.get_collection_raw_by_slug(
slug, public_or_unlisted_only=True
)

thumbnail = result.get("thumbnail")
if not thumbnail:
raise HTTPException(status_code=404, detail="thumbnail_not_found")

image_file = ImageFile(**thumbnail)
image_file_out = await image_file.get_public_image_file_out(
org, self.storage_ops
)

path = self.storage_ops.resolve_internal_access_path(image_file_out.path)

async def reader():
async with aiohttp.ClientSession() as session:
async with session.get(path) as resp:
async for chunk in resp.content.iter_chunked(4096):
yield chunk

headers = {"Cache-Control": "max-age=3600, stale-while-revalidate=86400"}
return StreamingResponse(reader(), media_type=image_file.mime, headers=headers)

async def list_collections(
self,
org: Organization,
@@ -852,6 +881,7 @@ async def stream_iter():
file_prep.upload_name,
stream_iter(),
MIN_UPLOAD_PART_SIZE,
mime=file_prep.mime,
):
print("Collection thumbnail stream upload failed", flush=True)
raise HTTPException(status_code=400, detail="upload_failed")
@@ -1175,6 +1205,24 @@ async def download_public_collection(

return await colls.download_collection(coll.id, org)

@app.get(
"/public/orgs/{org_slug}/collections/{coll_slug}/thumbnail",
tags=["collections", "public"],
response_class=StreamingResponse,
)
async def get_public_thumbnail(
org_slug: str,
coll_slug: str,
):
try:
org = await colls.orgs.get_org_by_slug(org_slug)
# pylint: disable=broad-exception-caught
except Exception:
# pylint: disable=raise-missing-from
raise HTTPException(status_code=404, detail="collection_not_found")

return await colls.get_public_thumbnail(coll_slug, org)

@app.post(
"/orgs/{oid}/collections/{coll_id}/home-url",
tags=["collections"],
2 changes: 1 addition & 1 deletion backend/btrixcloud/operator/crawls.py
@@ -153,7 +153,7 @@ async def sync_crawls(self, data: MCSyncData):
oid=oid,
org=org,
storage=StorageRef(spec["storageName"]),
crawler_channel=spec.get("crawlerChannel"),
crawler_channel=spec.get("crawlerChannel", "default"),
proxy_id=spec.get("proxyId"),
scale=spec.get("scale", 1),
started=data.parent["metadata"]["creationTimestamp"],
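
To make the one-line change concrete, here is a minimal sketch of the fallback semantics; the spec dicts are made up for this example and are not taken from the operator:

```python
# A CrawlJob spec without crawlerChannel now resolves to "default" instead of None.
spec_without_channel = {"storageName": "default-storage", "scale": 1}
spec_with_channel = {"storageName": "default-storage", "crawlerChannel": "custom"}

assert spec_without_channel.get("crawlerChannel", "default") == "default"  # was None before
assert spec_with_channel.get("crawlerChannel", "default") == "custom"
```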
6 changes: 5 additions & 1 deletion backend/btrixcloud/storages.py
@@ -382,6 +382,7 @@ async def do_upload_multipart(
filename: str,
file_: AsyncIterator,
min_size: int,
mime: Optional[str] = None,
) -> bool:
"""do upload to specified key using multipart chunking"""
s3storage = self.get_org_primary_storage(org)
@@ -405,7 +406,10 @@ async def get_next_chunk(file_, min_size) -> bytes:
key += filename

mup_resp = await client.create_multipart_upload(
ACL="bucket-owner-full-control", Bucket=bucket, Key=key
ACL="bucket-owner-full-control",
Bucket=bucket,
Key=key,
ContentType=mime or "",
)

upload_id = mup_resp["UploadId"]
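
For context on the new `mime` parameter: the `ContentType` passed when the multipart upload is created is what S3-compatible storage later returns as the object's `Content-Type` header. A hypothetical standalone sketch with synchronous boto3 (not the async client used in the codebase; bucket, key, and endpoint are placeholders):

```python
import boto3

# Placeholder endpoint mirroring the nightly test MinIO service.
client = boto3.client("s3", endpoint_url="http://local-minio.default:9000/")

resp = client.create_multipart_upload(
    Bucket="btrix-data",                  # placeholder bucket
    Key="thumbnails/example.jpg",         # placeholder key
    ACL="bucket-owner-full-control",
    ContentType="image/jpeg",             # previously omitted, so thumbnails lost their MIME type
)
upload_id = resp["UploadId"]
print("started multipart upload", upload_id)
```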
2 changes: 1 addition & 1 deletion backend/btrixcloud/version.py
@@ -1,3 +1,3 @@
"""current version"""

__version__ = "1.14.3"
__version__ = "1.14.4"
64 changes: 64 additions & 0 deletions backend/test_nightly/conftest.py
@@ -320,3 +320,67 @@ def org_with_quotas(admin_auth_headers):
data = r.json()

return data["id"]


@pytest.fixture(scope="session")
def deleted_crawl_id(admin_auth_headers, default_org_id):
# Start crawl.
crawl_data = {
"runNow": True,
"name": "Test crawl",
"config": {
"seeds": [{"url": "https://webrecorder.net/"}],
"limit": 1,
},
}
r = requests.post(
f"{API_PREFIX}/orgs/{default_org_id}/crawlconfigs/",
headers=admin_auth_headers,
json=crawl_data,
)
data = r.json()

crawl_id = data["run_now_job"]

# Wait for it to complete
while True:
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/crawls/{crawl_id}/replay.json",
headers=admin_auth_headers,
)
data = r.json()
if data["state"] == "complete":
break
time.sleep(5)

# Wait until replica background job completes
while True:
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/?jobType=create-replica&success=True",
headers=admin_auth_headers,
)
assert r.status_code == 200
if r.json()["total"] == 1:
break
time.sleep(5)

# Delete crawl
r = requests.post(
f"{API_PREFIX}/orgs/{default_org_id}/crawls/delete",
headers=admin_auth_headers,
json={"crawl_ids": [crawl_id]},
)
assert r.status_code == 200

# Wait until delete replica background job completes
while True:
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/?jobType=delete-replica&success=True",
headers=admin_auth_headers,
)
assert r.status_code == 200
if r.json()["total"] == 1:
break
time.sleep(5)

return crawl_id
34 changes: 21 additions & 13 deletions backend/test_nightly/test_crawlconfig_crawl_stats.py
@@ -71,17 +71,25 @@ def test_crawlconfig_crawl_stats(admin_auth_headers, default_org_id, crawl_confi
data = r.json()
assert data["deleted"]

time.sleep(10)

# Verify crawl stats from /crawlconfigs
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/crawlconfigs/{crawl_config_id}",
headers=admin_auth_headers,
)
assert r.status_code == 200
data = r.json()
assert data["crawlAttemptCount"] == 2
assert data["crawlCount"] == 0
assert not data["lastCrawlId"]
assert not data["lastCrawlState"]
assert not data["lastCrawlTime"]
max_attempts = 18
attempts = 1
while True:
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/crawlconfigs/{crawl_config_id}",
headers=admin_auth_headers,
)
assert r.status_code == 200
data = r.json()

if data["crawlAttemptCount"] == 2 and data["crawlCount"] == 0:
assert not data["lastCrawlId"]
assert not data["lastCrawlState"]
assert not data["lastCrawlTime"]
break

if attempts >= max_attempts:
assert False

time.sleep(10)
attempts += 1
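
The retry loop above polls the workflow until its stats reflect the deleted crawls, rather than asserting after a fixed sleep. The same pattern, factored into a generic helper, might look like this (illustrative only; the test keeps the loop inline):

```python
import time


def poll_until(check, max_attempts=18, interval=10):
    """Illustrative helper (not part of the test suite): call check() until it
    returns a truthy value, failing after max_attempts tries."""
    for _ in range(max_attempts):
        result = check()
        if result:
            return result
        time.sleep(interval)
    raise AssertionError("condition not met within allotted attempts")
```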
16 changes: 10 additions & 6 deletions backend/test_nightly/test_z_background_jobs.py
@@ -10,7 +10,7 @@
job_id = None


def test_background_jobs_list(admin_auth_headers, default_org_id):
def test_background_jobs_list(admin_auth_headers, default_org_id, deleted_crawl_id):
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/", headers=admin_auth_headers
)
@@ -37,7 +37,7 @@ def test_background_jobs_list(admin_auth_headers, default_org_id):

@pytest.mark.parametrize("job_type", [("create-replica"), ("delete-replica")])
def test_background_jobs_list_filter_by_type(
admin_auth_headers, default_org_id, job_type
admin_auth_headers, default_org_id, deleted_crawl_id, job_type
):
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/?jobType={job_type}",
@@ -54,7 +54,9 @@ def test_background_jobs_list_filter_by_type(
assert item["type"] == job_type


def test_background_jobs_list_filter_by_success(admin_auth_headers, default_org_id):
def test_background_jobs_list_filter_by_success(
admin_auth_headers, default_org_id, deleted_crawl_id
):
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/?success=True",
headers=admin_auth_headers,
@@ -70,7 +72,9 @@ def test_background_jobs_list_filter_by_success(admin_auth_headers, default_org_
assert item["success"]


def test_background_jobs_no_failures(admin_auth_headers, default_org_id):
def test_background_jobs_no_failures(
admin_auth_headers, default_org_id, deleted_crawl_id
):
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/?success=False",
headers=admin_auth_headers,
@@ -81,7 +85,7 @@ def test_background_jobs_no_failures(admin_auth_headers, default_org_id):
assert data["total"] == 0


def test_get_background_job(admin_auth_headers, default_org_id):
def test_get_background_job(admin_auth_headers, default_org_id, deleted_crawl_id):
r = requests.get(
f"{API_PREFIX}/orgs/{default_org_id}/jobs/{job_id}", headers=admin_auth_headers
)
@@ -100,7 +104,7 @@ def test_get_background_job(admin_auth_headers, default_org_id):
assert data["replica_storage"]


def test_retry_all_failed_bg_jobs_not_superuser(crawler_auth_headers):
def test_retry_all_failed_bg_jobs_not_superuser(crawler_auth_headers, deleted_crawl_id):
r = requests.post(
f"{API_PREFIX}/orgs/all/jobs/retryFailed", headers=crawler_auth_headers
)
2 changes: 1 addition & 1 deletion chart/Chart.yaml
@@ -5,7 +5,7 @@ type: application
icon: https://webrecorder.net/assets/icon.png

# Browsertrix and Chart Version
version: v1.14.3
version: v1.14.4

dependencies:
- name: btrix-admin-logging
5 changes: 3 additions & 2 deletions chart/test/test-nightly-addons.yaml
@@ -21,7 +21,8 @@ storages:
bucket_name: *local_bucket_name

endpoint_url: "http://local-minio.default:9000/"
is_default_primary: True
is_default_primary: true
access_endpoint_url: "/data/"

- name: "replica-0"
type: "s3"
@@ -30,6 +31,6 @@ storages:
bucket_name: "replica-0"

endpoint_url: "http://local-minio.default:9000/"
is_default_replica: True
is_default_replica: true


6 changes: 3 additions & 3 deletions chart/values.yaml
@@ -83,7 +83,7 @@ allow_dupe_invites: "0"
invite_expire_seconds: 604800

# base url for replayweb.page
rwp_base_url: "https://cdn.jsdelivr.net/npm/replaywebpage@2.2.4/"
rwp_base_url: "https://cdn.jsdelivr.net/npm/replaywebpage@2.3.3/"

superuser:
# set this to enable a superuser admin
@@ -103,7 +103,7 @@ replica_deletion_delay_days: 0

# API Image
# =========================================
backend_image: "docker.io/webrecorder/browsertrix-backend:1.14.3"
backend_image: "docker.io/webrecorder/browsertrix-backend:1.14.4"
backend_pull_policy: "Always"

backend_password_secret: "PASSWORD!"
@@ -161,7 +161,7 @@ backend_avg_memory_threshold: 95

# Nginx Image
# =========================================
frontend_image: "docker.io/webrecorder/browsertrix-frontend:1.14.3"
frontend_image: "docker.io/webrecorder/browsertrix-frontend:1.14.4"
frontend_pull_policy: "Always"

frontend_cpu: "10m"
3 changes: 3 additions & 0 deletions frontend/docs/docs/overrides/.icons/bootstrap/asterisk.svg
3 changes: 3 additions & 0 deletions frontend/docs/docs/overrides/.icons/bootstrap/globe2.svg
3 changes: 3 additions & 0 deletions frontend/docs/docs/overrides/.icons/bootstrap/three-dots-vertical.svg