Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChangeStream stops working after some time #13607

Closed
vkarpov15 opened this issue Jul 12, 2023 Discussed in #13602 · 1 comment · Fixed by #13736
Closed

ChangeStream stops working after some time #13607

vkarpov15 opened this issue Jul 12, 2023 Discussed in #13602 · 1 comment · Fixed by #13736
Labels
confirmed-bug We've confirmed this is a bug in Mongoose and will fix it.
Milestone

Comments

@vkarpov15
Copy link
Collaborator

Discussed in #13602

Originally posted by thaoula July 11, 2023
Hi All,

We are trying to utilise ChangeStreams using Mongoose and have encountered some challenges with reliability.

We have the following issues:

  • For some unknown reason, our DB level ChangeStream which is connected to an Atlas Cluster stops receiving changes. We do not get any errors even though we are using ChangeStream.on('error') listener. Nothing is logged.

  • When this happens, restarting the container but keeping the resume token i.e we are setting resumeAfter does not work. We only get changes after deleting the resume token and restarting the container.

Before logging some Github issues I wanted to see if anyone has experienced problems like this and whether you have some tips.

What Works

  1. We are able to successfully open a ChangeStream at the database level using connection.watch().
  2. We can receive the change event.
  3. We can store the resume token after each change event is called.
  4. We can restart the application and resumeAfter the previous resume token.

For some unknown reason (maybe low activity due to weekend or overnight, or primary changes), the change stream stops receiving change events. I am not sure how to debug this issue as no event listeners are called on the change stream. ie. error, resumeTokenChanged, closed events are never ever triggered.

By chance after reviewing the mongoose code and noticing the returned changestream was a wrapped object, I decided to just use connection.getClient().watch which returns a mongdb native ChangeStream and I am able to successfully listen to resumeTokenChanged, closed, change and error events as I would have expected when using connection.watch().

When using this approach, I suprisingly found that the resumeTokenChanged event is called almost once per second and the token is changing. Also, the token returned by this event is also much shorter than the resume token that is provided in the change event.

I also noticed that if i stop the database (test failure) the change stream is closed (it calls the close event listener) and it is able to resume when i restart the database server.

My questions are -

  1. Is connection.watch Mongoose Changestream object meant to output all the events?
  2. Is it safe to use connection.getClient().watch to get a native change stream?
  3. Why is resumeTokenChanged called every second, should I persist the token provided in this event?
  4. In general, how can I detect if the change stream stopped working. I am wondering if the resumeTokenChanged event can be used as a heartbeat and it it stops for a period of time .. i should reconnect the changestream.

Sorry for the all the questions.
Regards,
Tarek

@vkarpov15 vkarpov15 added the needs repro script Maybe a bug, but no repro script. The issue reporter should create a script that demos the issue label Jul 12, 2023
@vkarpov15 vkarpov15 added this to the 7.4.1 milestone Jul 12, 2023
@vkarpov15 vkarpov15 modified the milestones: 7.4.1, 7.4.2 Jul 23, 2023
@vkarpov15 vkarpov15 modified the milestones: 7.4.2, 7.4.4 Aug 3, 2023
@vkarpov15
Copy link
Collaborator Author

  1. Yes. We put in a fix in fix(cursor): bubble up resumeTokenChanged event from change streams #13736 for resumeTokenChanged event
  2. Yes. Mongoose's ChangeStream wrapper is just a convenience wrapper for buffering purposes.
  3. You should persist the token provided, although the MongoDB Node driver does still emit this event even if the token hasn't changed. It looks like the MongoDB node driver emits 'resumeTokenChanged' every time it calls MongoDB to get more changes. It still emits 'resumeTokenChanged', even when the resume token is the same as the last event.
  4. The change stream should be fine unless you receive an 'error' or 'close' event. For example, if the entire replica set shuts down in the middle of your change stream, you'll receive a 'close' event; although it can take up to a minute for that event to fire. For example, the following script:
'use strict';

const mongoose = require('mongoose');

void async function main() {
  const uri = 'mongodb://127.0.0.1:27017,127.0.0.1:27018/mongoose_test';
  await mongoose.connect(uri);
  const Person = mongoose.model('Person', mongoose.Schema({ name: String }));
  await Person.createCollection();

  // Create a change stream. The 'change' event gets emitted when there's a
  // change in the database
  const changeStream = await Person.watch();
  
  changeStream.on('change', data => console.log(new Date(), data));
  changeStream.on('resumeTokenChanged', data => console.log(new Date(), 'resumeTokenChanged', data));
  changeStream.on('error', data => console.log(new Date(), 'error', data));
  changeStream.on('close', data => console.log(new Date(), 'close', data));

  await new Promise(resolve => setTimeout(resolve, 1000));
  // Insert a doc, will trigger the change stream handler above
  console.log(new Date(), 'Inserting doc');
  await Person.create({ name: 'Axl Rose' });
  console.log(new Date(), 'Inserted doc');
}();

Prints the following output if I kill the entire replica set after a few seconds:

$ node gh-13607.js 
2023-08-14T19:55:13.537Z resumeTokenChanged { _data: '8264DA86A1000000022B0229296E04' }
2023-08-14T19:55:14.532Z Inserting doc
2023-08-14T19:55:14.545Z resumeTokenChanged { _data: '8264DA86A1000000022B0229296E04' }
2023-08-14T19:55:14.554Z Inserted doc
2023-08-14T19:55:14.554Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:55:14.555Z {
  _id: {
    _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
  },
  operationType: 'insert',
  clusterTime: new Timestamp({ t: 1692042914, i: 1 }),
  fullDocument: {
    _id: new ObjectId("64da86a257c9d18fddba9ae3"),
    name: 'Axl Rose',
    __v: 0
  },
  ns: { db: 'mongoose_test', coll: 'people' },
  documentKey: { _id: new ObjectId("64da86a257c9d18fddba9ae3") }
}
2023-08-14T19:55:15.556Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:55:16.559Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:55:17.562Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:55:18.566Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:55:19.568Z resumeTokenChanged {
  _data: '8264DA86A2000000012B022C0100296E5A1004B95EF18957CB4D9FA8B2D826162B6EA146645F6964006464DA86A257C9D18FDDBA9AE30004'
}
2023-08-14T19:56:19.571Z close undefined
2023-08-14T19:56:19.571Z close undefined
2023-08-14T19:56:19.572Z close undefined
^C

@vkarpov15 vkarpov15 added confirmed-bug We've confirmed this is a bug in Mongoose and will fix it. and removed needs repro script Maybe a bug, but no repro script. The issue reporter should create a script that demos the issue labels Aug 14, 2023
vkarpov15 added a commit that referenced this issue Aug 15, 2023
fix(cursor): bubble up resumeTokenChanged event from change streams
@vkarpov15 vkarpov15 reopened this Aug 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug We've confirmed this is a bug in Mongoose and will fix it.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant