Tickets in process of being updated may not be included in daily extract file #50

Open
dav3r opened this issue Nov 17, 2023 · 0 comments
Labels
bug This issue or pull request addresses broken functionality

dav3r commented Nov 17, 2023

🐛 Summary

While cyhy-data-extract.py is running, tickets that the CyHy commander is in the process of updating in the DB may be unintentionally excluded from the tickets query and therefore left out of the daily extract file. This behavior is not desired.

To reproduce

The following is theoretical - we have not replicated it in a controlled environment, but we believe it happens in Production:
Run cyhy-data-extract.py while the CyHy commander is also running. If the cyhy-data-extract.py tickets query attempts to read a ticket while the commander is updating it, we believe the ticket document will be locked by MongoDB and unreadable. That ticket will then be excluded from the query results and from the daily extract file.

See here for more info about Mongo concurrency limitations.

Expected behavior

Ideally, all tickets that were modified within the specified timeframe of the cyhy-data-extract.py query should be included in the daily extract file.
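
A minimal sketch of how this expectation could be checked after an extract, assuming the ticket documents can be re-read once the commander is idle. The function name `find_missing_tickets`, the `last_change` field name, and the sample documents are assumptions for illustration only; they are not taken from cyhy-data-extract.py.

```python
from datetime import datetime, timezone


def find_missing_tickets(db_tickets, extracted_ids, start, end):
    """Return sorted IDs of tickets modified in [start, end) that the extract lacks.

    db_tickets    -- iterable of dicts with "_id" and "last_change" (assumed field name)
    extracted_ids -- iterable of ticket IDs found in the daily extract file
    """
    expected = {t["_id"] for t in db_tickets if start <= t["last_change"] < end}
    return sorted(expected - set(extracted_ids))


if __name__ == "__main__":
    # Hypothetical data: two tickets changed during the extract timeframe,
    # but only one of them made it into the extract file.
    start = datetime(2023, 10, 26, tzinfo=timezone.utc)
    end = datetime(2023, 10, 27, tzinfo=timezone.utc)
    tickets = [
        {"_id": "ticket-a", "last_change": datetime(2023, 10, 26, 8, 0, tzinfo=timezone.utc)},
        {"_id": "ticket-b", "last_change": datetime(2023, 10, 26, 22, 49, tzinfo=timezone.utc)},
    ]
    print(find_missing_tickets(tickets, ["ticket-a"], start, end))
```

A non-empty result would indicate that at least one ticket changed inside the timeframe but never reached the extract.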

@mcdonnnj and I think that the easiest way to ensure this behavior is to stop the commander from making any DB updates while cyhy-data-extract.py is performing queries, though there may be other solutions to this problem.
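
As a rough illustration of the "stop the commander" idea, the sketch below uses a flag file that the extract script holds while it queries and that the commander checks before each DB update. Everything here is hypothetical: the names `extract_in_progress` and `commander_may_update`, the flag path, and the assumption that both processes share a filesystem. Neither cyhy-data-extract.py nor the commander is known to work this way.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def extract_in_progress(flag_path):
    """Hold a flag file for the duration of the extract queries (hypothetical helper)."""
    # O_EXCL makes creation fail if another extract already holds the flag.
    fd = os.open(flag_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as flag:
        flag.write(str(os.getpid()))
    try:
        yield
    finally:
        # Always release the flag, even if the extract raises.
        os.remove(flag_path)


def commander_may_update(flag_path):
    """Commander-side check: only write to the DB when no extract holds the flag."""
    return not os.path.exists(flag_path)


if __name__ == "__main__":
    flag = os.path.join(tempfile.mkdtemp(), "cyhy-extract.flag")
    with extract_in_progress(flag):
        print(commander_may_update(flag))  # commander should pause here
    print(commander_may_update(flag))  # commander may resume
```

This only blocks updates that start after the flag appears. An update already in flight when the extract begins would still need to be drained first, so a real solution would likely need the extract to wait for the commander to acknowledge the pause.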

Any helpful log output or screenshots

Here is some annotated log output from feeds.log-20231027.gz showing the ticket that alerted us to this issue:

2023-10-26T08:26:58Z - Ticket 653a30455de7cb1cd6740083 opened (initial vulnscan)
2023-10-26T22:49:22Z - Ticket 653a30455de7cb1cd6740083 verified (2nd vulnscan)
2023-10-27T00:00:00Z - cyhy-feeds script starts up
** START: POTENTIAL WINDOW 1 FOR TICKETS TO BE UPDATED IN DB, BUT NOT INCLUDED IN DAILY EXTRACT **
2023-10-27 00:00:01,994 INFO cyhy-feeds - Beginning data extraction process.
2023-10-27 00:00:02,027 INFO cyhy-feeds - Creating cursors for query results.
2023-10-27 00:00:02,028 INFO cyhy-feeds - Extracting data from database(s).
** END: POTENTIAL WINDOW 1 FOR TICKETS TO BE UPDATED IN DB, BUT NOT INCLUDED IN DAILY EXTRACT **
2023-10-26 00:00:02,124 INFO cyhy-feeds - Fetching from host_scans collection...
2023-10-26 00:00:13,221 INFO cyhy-feeds - Finished writing host_scans to file.
2023-10-26 00:00:26,779 INFO cyhy-feeds - Added host_scans_2023-10-26T000000+0000.json to cyhy_extract_2023-10-26T000000+0000.tbz
2023-10-26 00:00:26,797 INFO cyhy-feeds - Deleted host_scans_2023-10-26T000000+0000.json as part of cleanup.
2023-10-26 00:00:26,797 INFO cyhy-feeds - Fetching from hosts collection...
2023-10-26 00:04:22,116 INFO cyhy-feeds - Finished writing hosts to file.
2023-10-26 00:06:42,793 INFO cyhy-feeds - Added hosts_2023-10-26T000000+0000.json to cyhy_extract_2023-10-26T000000+0000.tbz
2023-10-26 00:06:42,919 INFO cyhy-feeds - Deleted hosts_2023-10-26T000000+0000.json as part of cleanup.
2023-10-26 00:06:42,919 INFO cyhy-feeds - Fetching from kevs collection...
2023-10-26 00:06:42,939 INFO cyhy-feeds - Finished writing kevs to file.
2023-10-26 00:06:42,940 INFO cyhy-feeds - Added kevs_2023-10-26T000000+0000.json to cyhy_extract_2023-10-26T000000+0000.tbz
2023-10-26 00:06:42,940 INFO cyhy-feeds - Deleted kevs_2023-10-26T000000+0000.json as part of cleanup.
2023-10-26 00:06:42,940 INFO cyhy-feeds - Fetching from port_scans collection...
2023-10-26 00:27:51,508 INFO cyhy-feeds - Finished writing port_scans to file.
2023-10-26 00:51:12,861 INFO cyhy-feeds - Added port_scans_2023-10-26T000000+0000.json to cyhy_extract_2023-10-26T000000+0000.tbz
2023-10-26 00:51:14,184 INFO cyhy-feeds - Deleted port_scans_2023-10-26T000000+0000.json as part of cleanup.
2023-10-26 00:51:14,187 INFO cyhy-feeds - Fetching from requests collection...
2023-10-26 00:51:14,707 INFO cyhy-feeds - Finished writing requests to file.
2023-10-26 00:51:15,574 INFO cyhy-feeds - Added requests_2023-10-26T000000+0000.json to cyhy_extract_2023-10-26T000000+0000.tbz
2023-10-26 00:51:15,594 INFO cyhy-feeds - Deleted requests_2023-10-26T000000+0000.json as part of cleanup.
** START: POTENTIAL WINDOW 2 FOR TICKETS TO BE UPDATED IN DB, BUT NOT INCLUDED IN DAILY EXTRACT **
2023-10-26 00:51:15,594 INFO cyhy-feeds - Fetching from tickets collection...
** END: POTENTIAL WINDOW 2 FOR TICKETS TO BE UPDATED IN DB, BUT NOT INCLUDED IN DAILY EXTRACT **
2023-10-26 01:13:11,320 INFO cyhy-feeds - Finished writing tickets to file.
...
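
To get a feel for how long "POTENTIAL WINDOW 2" could be, the sketch below measures the time between the two log lines that surround it. The annotation only brackets the "Fetching from tickets collection..." line, so the elapsed time to the next logged line is an upper bound on the window, not its exact length. The helper name `seconds_between` is illustrative.

```python
from datetime import datetime

# Timestamp layout used by the cyhy-feeds log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start_line, end_line):
    """Return the elapsed seconds between the timestamps of two log lines."""

    def parse(line):
        # The timestamp is the first 23 characters: 'YYYY-MM-DD HH:MM:SS,mmm'
        return datetime.strptime(line[:23], LOG_TIME_FORMAT)

    return (parse(end_line) - parse(start_line)).total_seconds()


fetch_started = "2023-10-26 00:51:15,594 INFO cyhy-feeds - Fetching from tickets collection..."
write_finished = "2023-10-26 01:13:11,320 INFO cyhy-feeds - Finished writing tickets to file."

window_2 = seconds_between(fetch_started, write_finished)
print(f"Window 2 lasted at most about {window_2 / 60:.1f} minutes")
```

For the two lines quoted above this comes to roughly 21.9 minutes, which is far longer than the couple of seconds spanned by window 1.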

Note

#49 partially helps with "POTENTIAL WINDOW 1" above, but not with "POTENTIAL WINDOW 2".

dav3r added the bug label Nov 17, 2023