Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix loop_stop() not clearing ._thread when called from same thread #809

Merged
merged 3 commits into from
Mar 27, 2024

Conversation

jiachin1995
Copy link
Contributor

When client.loop_stop() is called from within on_message() callback, client ._thread will not be set to None. This will cause subsequent loop_start() to return MQTT_ERR_INVAL error.

When loop_stop is called from within on_message(), ._thread will not be set to None. This will cause subsequent loop_start() to return an error.
@jiachin1995
Copy link
Contributor Author

jiachin1995 commented Feb 2, 2024

Context:
Our use case was for PahoMQTT to simulate an IoT device. The device can be remotely rebooted/shutdown.
If we try to call loop_stop() from within the PahoMQTT thread. It stops the thread but subsequent loop_start() stops working as client._thread is not None.

For shutdown, this pull request is necessary to address the issue directly without creating a workaround.

For restart, we have to call reboot from a different thread. For example:

# mydevice.reboot
def reboot(self):
    self.paho_client.loop_stop()
    time.sleep(5)
    self.paho_client.loop_start()

# on_message callback
def on_message(client):
    . . .
    # restart cmd received 
    from threading import Thread

    Thread(target=mydevice.reboot).start()

@PierreF
Copy link
Contributor

PierreF commented Feb 4, 2024

Hello, thanks for your contribution. For a contribution to be merged, you will need to sign the Eclipse ECA for your contribution to be merged. (https://accounts.eclipse.org/user/eca)

That being said, if I understand correctly your use-case, you do not want to (just) call loop_stop/loop_start. This will mostly do nothing, it do not disconnect from MQTT (so it don't simulate a device reboot).
loop_stop only stop the network processing thread, and if you call loop_start quickly, you will only produce a short delay in network processing.

I'm not sure if could be valid to call loop_stop & loop_start within a callback. Because to call loop_start I believe it's required to have the thread stopped (or else it will be too complex regarding possible race-condition), and the thread won't stop until the callback returned.

Actually if you want to simulate de device reboot, you should also discard the Client state (like the possibly queued message), for that the best solution would be to re-create the Client(). For this the best solution IMO is doing something like:


def on_message(...):
   if should_simulate_device_reboot():
      # calling disconnect() will *cleanly* disconnect - so it don't simulate a device lost, but device clean reboot
      # but calling disconnect() will stop the loop_forever()
      client.disconnect() 

while simulator_should_run:
   # we re-create the mqtt.Client() each time loop_forever exit
   mqttc = mqtt.Client([...])
   mqttc.connect[...]
   mqttc.loop_forever()

Does this solve your problem ?

@jiachin1995
Copy link
Contributor Author

jiachin1995 commented Feb 5, 2024

Hi Pierre, I have signed the ECA.

For restart, I have already assessed and agree that there is no meaningful way to implement it in PahoMQTT's on_message() callback. I simply wanted to provide some context for the error we faced. We are fine with creating a separate thread.

For shutdown, I believe it should be valid for loop_stop() to be called from on_message() callback. But there is currently an error without this pull request.

@PierreF
Copy link
Contributor

PierreF commented Mar 24, 2024

Thank for your patience. I do have fear that your solution will introduce new bugs, mostly because the thread will still be running with self._thread == None. This might break some assumption in current code.

If I get correctly the aim of this fix (being able to call loop_stop() from on_message() and later calling loop_start()), shouldn't we prefer to set self._thread = None in a try/finally of _thread_main() ? This should fix the issue AND ensure the thread isn't running with self._thread == None ?

This should also make the "later" to call loop_start() possible to determine. It's when loop_start() don't fail with MQTT_ERR_INVAL. With your proposition I don't see a way to be sure that isn't two threads running.

@jiachin1995
Copy link
Contributor Author

Hi Pierre. I have updated the code as suggested. However, I am unsure where to insert a try/finally block.

Do let me know if any other changes are required

src/paho/mqtt/client.py Outdated Show resolved Hide resolved
@PierreF PierreF merged commit 28aa2e6 into eclipse:master Mar 27, 2024
10 checks passed
@PierreF PierreF added this to the 2.1.0 milestone Apr 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants