Support regular re-enumeration of affected versions for existing records #2017

Open
andrewpollock opened this issue Feb 27, 2024 · 1 comment
Labels
data quality Issues with data quality enhancement New feature or request

Comments

@andrewpollock
Contributor

Problem statement:

Today, affected[].versions is enumerated only when an OSV record is imported.

#1987 identified that additional vulnerable versions may be released after the OSV record has been published (and imported by OSV.dev), for example when the vulnerability was fixed in a backward-incompatible manner in a new major version branch.

This means that it is possible for the OSV.dev API to return false negatives for new vulnerable versions released after the OSV record has been published and imported.

False negatives detract from OSV.dev's strategy to be a comprehensive, accurate and timely database of known vulnerabilities.
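
To make the failure mode concrete, here is a minimal sketch of the false negative described above. It is a toy model, not OSV.dev's importer or API: the helper names (`enumerate_versions`, `is_affected`), the tuple-based version comparison, and the assumption that lookups match only against the enumerated list are all hypothetical simplifications.

```python
# Toy model only: helper names and version handling are assumptions for
# illustration, not the actual OSV.dev implementation.

def enumerate_versions(introduced, fixed, released):
    """Return the released versions v with introduced <= v < fixed."""
    return [v for v in released if introduced <= v < fixed]


def is_affected(version, enumerated):
    """Assume a lookup that matches only against the enumerated list."""
    return version in enumerated


# At import time the 1.x branch has two releases; the fix landed in 2.0.0.
released_at_import = [(1, 0, 0), (1, 1, 0), (2, 0, 0)]
stored = enumerate_versions((1, 0, 0), (2, 0, 0), released_at_import)

# Later, 1.2.0 ships on the still-vulnerable 1.x branch.
released_later = released_at_import + [(1, 2, 0)]

# The stored enumeration predates 1.2.0, so the lookup misses it.
stale_answer = is_affected((1, 2, 0), stored)

# Re-enumerating against the current release list picks it up.
refreshed = enumerate_versions((1, 0, 0), (2, 0, 0), released_later)
fresh_answer = is_affected((1, 2, 0), refreshed)
```

In this sketch `stale_answer` is the false negative and `fresh_answer` is what a re-enumeration would return.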

Proposed solution:

Periodically (interval TBD), reimport all of the records for a given source, causing the affected versions for each record to be re-enumerated, based on the facts available at that point in time.

How this reimport is triggered will vary between the different currently supported data sources:

GCS: Set ignore_last_import_time to true for the given source record in SourceRepository in Datastore
Git: Set last_synced_hash to null for the given source record in SourceRepository in Datastore
REST: Set ignore_last_import_time to true for the given source record in SourceRepository in Datastore
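
As a rough sketch of the per-source triggers listed above: the field names `ignore_last_import_time` and `last_synced_hash` come from this issue, but the dataclass below is an in-memory stand-in for the Datastore entity, and the string `type` values and the `request_reimport` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRepository:
    """In-memory stand-in for the Datastore entity (not the real model)."""
    name: str
    type: str  # assumed to be one of "GCS", "GIT", "REST"
    ignore_last_import_time: bool = False
    last_synced_hash: Optional[str] = None


def request_reimport(repo: SourceRepository) -> SourceRepository:
    """Flip whichever field forces a full reimport for this source type."""
    if repo.type in ("GCS", "REST"):
        # Time-based sources: ignore the stored last-import timestamp.
        repo.ignore_last_import_time = True
    elif repo.type == "GIT":
        # Git sources: forget the last synced commit.
        repo.last_synced_hash = None
    else:
        raise ValueError(f"unsupported source type: {repo.type}")
    return repo


git_source = request_reimport(
    SourceRepository(name="example-git", type="GIT", last_synced_hash="abc123"))
gcs_source = request_reimport(SourceRepository(name="example-gcs", type="GCS"))
```

A real implementation would persist the changed entity back to Datastore; that step is omitted here.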

@andrewpollock
Contributor Author

Latest evolved thinking:

Do the moral equivalent of reimporting a rolling window of records each day, selecting records by how long ago they were last modified (age threshold TBD).

Some of the definitional work that will happen as part of #2186 will influence the TBD.
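
One possible reading of the rolling-window idea, sketched under assumptions: records are selected once their last modification is at least some minimum age old. The `due_for_reenumeration` helper, the record IDs, and the 30-day threshold are all hypothetical; the actual selection rule and age are still TBD.

```python
from datetime import datetime, timedelta, timezone


def due_for_reenumeration(last_modified, now, min_age):
    """Return IDs whose last modification is at least min_age before now.

    last_modified maps a record ID to its last-modified datetime.
    """
    cutoff = now - min_age
    return sorted(rid for rid, modified in last_modified.items()
                  if modified <= cutoff)


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
records = {
    "EXAMPLE-0001": datetime(2024, 1, 1, tzinfo=timezone.utc),   # 60 days old
    "EXAMPLE-0002": datetime(2024, 2, 25, tzinfo=timezone.utc),  # 5 days old
}

# The 30-day threshold is a placeholder for the TBD age.
due = due_for_reenumeration(records, now, timedelta(days=30))
```

Running this once a day would re-enumerate each record as it crosses the threshold, instead of reimporting an entire source at once.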
