-
-
Notifications
You must be signed in to change notification settings - Fork 32.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle backup errors more consistently #133522
Conversation
Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
Hey there @home-assistant/supervisor, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
Hey there @home-assistant/cloud, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
@@ -485,7 +493,7 @@ async def test_async_initiate_backup_with_agent_error( | |||
assert result["event"] == { | |||
"manager_state": BackupManagerState.CREATE_BACKUP, | |||
"stage": None, | |||
"state": CreateBackupState.COMPLETED, | |||
"state": CreateBackupState.FAILED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the backup will be marked as failed now if there are agent errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's fine. Maybe @piitaya can confirm it works for frontend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got confirmation from @bramkragten this doesn't break frontend 👍
I recommend reviewing with hide whitespace. |
8d71443
to
6153d6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, but some comments and questions.
I've not reviewed the tests yet.
if invalid_agents := [ | ||
agent_id for agent_id in agent_ids if agent_id not in self.backup_agents | ||
]: | ||
raise BackupManagerError(f"Invalid agents selected: {invalid_agents}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement of the error logging 👍
self._async_finish_backup(agent_ids, with_automatic_settings), | ||
name="backup_manager_finish_backup", | ||
) | ||
if not raise_task_error: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is confusingly named, by the name of the parameter it seems like it's about raising or not raising when something goes wrong, but it's about logging errors?
Also, why do we need to pass this down all the way here, why don't we just log the error in async_create_backup
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to log if we raise. This is about handling exceptions in the backup task when it is not awaited. If we don't await it we need to take care of the exception that may be raised and log it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we now pass this down through multiple layers, and all it does is log, there's no other difference in how the task is handled as far as I understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't about logging. It's about handling exceptions. If we don't await a task and it raises an exception it will trigger the warning about unhandled exception. When we don't await the task we need to add the task done callback and call task.exception
. This will avoid the unhandled exception warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree with this. It should be up to the caller to await tasks and decide if it is or is not interested in errors, there should not be a parameter passed to the called code.
We can merge the PR as is though and improve this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is up to the caller to pick one of the methods depending on what the caller wants.
if with_automatic_settings: | ||
self._update_issue_after_agent_upload(agent_errors) | ||
# delete old backups more numerous than copies | ||
# try this regardless of agent errors above | ||
await delete_backups_exceeding_configured_count(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we introduce a race by issuing the COMPLETED event before we're really done with all operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not idle, so I think it's ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so only after the idle event has been fired, everything is done? I think that's fair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes until idle the manager is busy. Completed means the backup is complete. Deleting old backups is a separate operation from the backup. So everything makes sense like it is now.
This comment was marked as spam.
This comment was marked as spam.
6153d6f
to
0f53259
Compare
* Add backup manager and read writer errors * Clean up not needed default argument * Clean up todo comment * Trap agent bugs during upload * Always release stream * Clean up leftover * Update test for backup with automatic settings * Fix use of vol.Any * Refactor test helper * Only update successful timestamp if completed event is sent * Always delete surplus copies * Fix after rebase * Fix after rebase * Revert "Fix use of vol.Any" This reverts commit 28fd7a5. * Inherit BackupReaderWriterError in IncorrectPasswordError --------- Co-authored-by: Erik Montnemery <erik@montnemery.com>
Breaking change
Proposed change
BackupManagerError
.BackupReaderWriter
should now always raiseBackupReaderWriterError
.Type of change
Additional information
Checklist
ruff format homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
.To help with the load of incoming pull requests: