Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SARIF: Please add support for fingerprints in SARIF format 🐾 #98

Closed
jamacku opened this issue Dec 21, 2022 · 5 comments · Fixed by #168
Closed

SARIF: Please add support for fingerprints in SARIF format 🐾 #98

jamacku opened this issue Dec 21, 2022 · 5 comments · Fixed by #168
Assignees

Comments

@jamacku
Copy link
Member

jamacku commented Dec 21, 2022

Fingerprints could be beneficial when used with GitHub and for a general comparison of scan results.

Documentation of fingerprints property: link

GitHub SARIF Documentation: link

@kdudka
Copy link
Member

kdudka commented Mar 31, 2023

My initial thought was to compute a hash from all the data that are matched in DefLookup::lookup(). The problem is that DefLookup is able to count occurrences of defects with equal fields (which would result in identical hashes) whereas GitHub would see all of them as a single defect. This would not work well when a new instance of defect with previously seen fingerprint was introduced in an update.

@jamacku
Copy link
Member Author

jamacku commented Mar 31, 2023

Here is scala example of creating partialFingerprints from codacy/codacy-analysis-cli:

partialFingerprints = SarifReport.PartialFingerprints("1", generatePrimaryLocationHash(issue, fileContents)))

// ...

final case class PartialFingerprints(primaryLocationStartColumnFingerprint: String, primaryLocationLineHash: String)

It seems they are counting only with line numbers and with the hash of line content.

@kdudka
Copy link
Member

kdudka commented Apr 3, 2023

@jamacku The actual fingerprint computation is here: https://github.com/codacy/codacy-analysis-cli/blob/80c8baecbfea050feb5cee184051b996cdd7fed3/cli/src/main/scala/com/codacy/analysis/cli/formatter/Sarif.scala#L192

They compute MD5 hash of a string concatenated from fileName, patternId, and lineContentsWithoutSpaces.

@jamacku
Copy link
Member Author

jamacku commented Apr 3, 2023

Here is an example of how fingerprints are handled by github/codeql-action - code

And also one comment how it works:

@kdudka
Copy link
Member

kdudka commented Apr 3, 2023

codeql-action seems to hash context lines with their own Ad-Hoc implemented hash function: https://github.com/github/codeql-action/blob/bb28e7e59e2ad6c1e5400e671795b2fa1b2fca6f/lib/fingerprints.js#L49

@jamacku jamacku added the sarif label Aug 16, 2023
@kdudka kdudka self-assigned this Mar 29, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Mar 29, 2024
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Mar 29, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Mar 29, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
Closes: csutils#168
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
... when the line content in the `csgrep --embed-context` format
is available.

Out of the 1783 fingerprints generated for the csgrep regression tests
we got 115 collisions, which need to be analyzed.  Some of them look
undesired as, for example:
```
Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:256: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 254|             release_header (hdr);
 # 255|             hdr->name = (void *)name;
 # 256|->           hdr->value = (void *)value;
 # 257|             hdr->release_policy = release_policy;
 # 258|             return;

Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:271: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 269|     hdr = &req->headers[req->hcount++];
 # 270|     hdr->name = (void *)name;
 # 271|->   hdr->value = (void *)value;
 # 272|     hdr->release_policy = release_policy;
 # 273|   }
```

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
No change in behavior intended with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
No change in behavior intended with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 23, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 23, 2024
No change in behavior intended with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 23, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 23, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 23, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue Apr 26, 2024
kdudka pushed a commit to kdudka/csdiff that referenced this issue Apr 26, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
... when the line content in the `csgrep --embed-context` format
is available.

Out of the 1783 fingerprints generated for the csgrep regression tests
we got 115 collisions, which need to be analyzed.  Some of them look
undesired as, for example:
```
Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:256: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 254|             release_header (hdr);
 # 255|             hdr->name = (void *)name;
 # 256|->           hdr->value = (void *)value;
 # 257|             hdr->release_policy = release_policy;
 # 258|             return;

Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:271: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 269|     hdr = &req->headers[req->hcount++];
 # 270|     hdr->name = (void *)name;
 # 271|->   hdr->value = (void *)value;
 # 272|     hdr->release_policy = release_policy;
 # 273|   }
```

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
No change in behavior intended with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue May 2, 2024
kdudka pushed a commit to kdudka/csdiff that referenced this issue May 2, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
... when the line content in the `csgrep --embed-context` format
is available.

Out of the 1783 fingerprints generated for the csgrep regression tests
we got 115 collisions, which need to be analyzed.  Some of them look
undesired as, for example:
```
Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:256: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 254|             release_header (hdr);
 # 255|             hdr->name = (void *)name;
 # 256|->           hdr->value = (void *)value;
 # 257|             hdr->release_policy = release_policy;
 # 258|             return;

Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:271: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 269|     hdr = &req->headers[req->hcount++];
 # 270|     hdr->name = (void *)name;
 # 271|->   hdr->value = (void *)value;
 # 272|     hdr->release_policy = release_policy;
 # 273|   }
```

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
No change in behavior intended with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
kdudka pushed a commit to kdudka/csdiff that referenced this issue May 3, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
kdudka pushed a commit to kdudka/csdiff that referenced this issue May 3, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
kdudka added a commit to kdudka/csdiff that referenced this issue May 3, 2024
We use boost-1.69 in our EPEL-7 builds.

Related: csutils#98
Closes: csutils#168
@kdudka kdudka closed this as completed in dbd78b8 May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants