Skip to content

Add support to auto-index Bazel builds with an aspect #610

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 13, 2023
Merged

Conversation

olafurpg
Copy link
Member

@olafurpg olafurpg commented Jul 7, 2023

Previously, the Bazel example required monkeypatching the java_library rule in order to inject the SemanticDB compiler plugin. This PR is an exploration to replace monkeypatching with aspects, which is a Bazel feature that seems made for our exact use-case. The goal of this PR is to make it possible to auto-index Bazel codebases with as little changes as possible to the existing build configuration in WORKSPACE and BUILD files.

With this change, it should be possible to index most Java/Bazel codebases with the following command

scip-java index --bazel-scip-java-binary=$(which scip-java)

Test plan

  • See new bazel_aspect CI job that indexes scip-java itself and the example workspace
  • Green CI.

@olafurpg
Copy link
Member Author

olafurpg commented Jul 7, 2023

When I currently run the aspect I'm getting this error here

❯ bazel build //src/main/java/example  --aspects aspects/semanticdb.bzl%semanticdb_aspect --output_groups=semdb --define=execroot=$(realpath bazel-$(basename $(pwd))) --define=semdb_path=$(cs fetch com.sourcegraph:semanticdb-javac:0.8.22 --classpath) --nojava_header_compilation

INFO: Analyzed target //src/main/java/example:example (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /Users/olafurpg/dev/sourcegraph/scip-java/examples/bazel-example/src/main/java/example/BUILD:13:13: Building src/main/java/example/example-with-semdb.jar (1 source file) failed: (Exit 1): java failed: error executing command external/remotejdk11_macos_aarch64/bin/java -XX:-CompactStrings '--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED' ... (remaining 17 arguments skipped)
src/main/java/example/Example.java:2: error: package com.google.common.util.concurrent does not exist
import com.google.common.util.concurrent.Futures;
                                        ^
src/main/java/example/Example.java:6: error: cannot find symbol
		System.out.println(Futures.immediateCancelledFuture());
		                   ^
  symbol:   variable Futures
  location: class Example
Aspect //aspects:semanticdb.bzl%semanticdb_aspect of //src/main/java/example:example failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.130s, Critical Path: 0.06s
INFO: 2 processes: 2 internal.

My guess is that we're not correctly propagating the jvm_import dependencies resulting in the Guava dependency to be missing from the classpath. I will continue investigating next week.

@olafurpg olafurpg force-pushed the olafurpg/aspect branch 4 times, most recently from 25b03f6 to 259062d Compare July 12, 2023 12:26
@olafurpg olafurpg marked this pull request as ready for review July 12, 2023 12:27
@olafurpg
Copy link
Member Author

I tried to reproduce the bazel_aspect issue inside a Linux container but I got the indexing working. Need more investigation why the CI is failing here 🤔 I'm wondering if it's related to bazelbuild/bazel#12485

I briefly investigated the Maven test failures and confirmed it's a regression from the main branch, but I don't know what's causing the regression. It's weird that only Maven is affected, the Gradle tests are passing. I turned on verbose logs with -X but it didn't reveal anything obvious causing the failure

olafurpg added 7 commits July 13, 2023 11:21
Previously, the Bazel example required monkeypatching the `java_library`
rule in order to inject the SemanticDB compiler plugin. This PR fixes
this problem by adding support to automatically index Bazel codebases
without modifying the build configuration (WORKSPACE or BUILD files).

See scip_java.bzl in the diff for more details about how this works.

To make this functionality work, this commit extends the functionality
of the SCIP build tool to optionally allow users to specify a custom
classpath, processorpath, and javac options.
@olafurpg olafurpg requested a review from keynmol July 13, 2023 09:21
val ConfigFileName = "lsif-java.json"
val ConfigFileNames = List("scip-java.json", "lsif-java.json")
val isIgnoredAnnotationProcessor = Set(
"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is confusing - let's replace it with isEmpty check at use site instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed "", it wasn't needed.

case None =>
1
case Some(aspectLabel) =>
val buildCommand = ListBuffer.empty[String]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it's more idiomatic to use List.newBuilder[String] if you don't access the full accumulated result before you finish appending to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in https://github.com/sourcegraph/scip-java/pull/610/files#r1262341973 I'm not sure this has to be a mutable thing at all, seems like appending one by one is not strictly necessary in this application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List.newBuilder[String] is exactly the same thing as ListBuffer at runtime, and ListBuffer is used in the equivalent parts for Gradle/Maven.

"environment variable JAVA_HOME is not set. " +
"To fix this problem run `export JAVA_HOME=/path/to/java` and run scip-java index again."
)
return 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire new logic feels very much like it's written in Java, with Scala syntax.

Would be better to

  1. Remove the builder, replacing it with just a regular list, as I think all the places where you append to it are trivially replaced with
  2. May be using a beefed up assert method that throws (and logs the error through index.app.error)? There's several early exists in this function that IMO doesn't have to be there

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that this is written like Java/Go. I also agree the code in the method is messy.

I've refactored the code to improve clarity but I kept a few return 1 statements instead of adding deeply nested if/else expressions.

I also managed to replace the list builder for buildCommand with a single-expression List.

}

/**
* Callback to `os.ProcessOutput.Readlines` that processes the stderr output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and the various tree walking operations - is there a reason we can't just bring in os-lib? Seems like the cli has plenty of dependencies as it is, and it'd be a lot more readable in cases like this to just use os.walk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, Moped already used os-lib :D

I'm confused about the use of file visitors then

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love os-lib in general but I'm not a fan os os.Path, which is required to use os.walk. I've never felt the need to use os.walk anyways because Files.walkFileTree works very reliably IME, even if it's a bit verbose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, the equivalent code (I believe) is

        val bazelOutLink = Files.readSymbolicLink(bazelOut)
        val out = os.Path(index.finalOutput)

        os.walk(
            os.Path(bazelOutLink),
            skip = path => scipPattern.matches(path.toNIO)
          )
          .foreach { p =>
            os.write.append(out, os.read.stream(p))
          }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old habits die hard. I'll admit that looks very nice, esp. knowing how it uses generators under the hood

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried running it and hit on

java.io.IOException: Is a directory
	at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)

because there are many directories with the .scip extension, and os.walk visits both files and directories. It's possible to skip = path => os.isFile(path) && ... but that requires an extra fsstat on every visited file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently there's os.walk.attrs, which gives you StatInfo with path: https://sourcegraph.com/github.com/com-lihaoyi/os-lib/-/blob/os/src/ListOps.scala?L193

@olafurpg olafurpg requested a review from jhchabran July 13, 2023 11:09
@olafurpg olafurpg changed the title Implement an aspect to build a codebase with SemanticDB Add support to auto-index Bazel builds with an aspect Jul 13, 2023
Copy link
Member Author

@olafurpg olafurpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @keynmol for the review, lots of great feedback and I think the resulting code is much clearer now.

case None =>
1
case Some(aspectLabel) =>
val buildCommand = ListBuffer.empty[String]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List.newBuilder[String] is exactly the same thing as ListBuffer at runtime, and ListBuffer is used in the equivalent parts for Gradle/Maven.

"environment variable JAVA_HOME is not set. " +
"To fix this problem run `export JAVA_HOME=/path/to/java` and run scip-java index again."
)
return 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that this is written like Java/Go. I also agree the code in the method is messy.

I've refactored the code to improve clarity but I kept a few return 1 statements instead of adding deeply nested if/else expressions.

I also managed to replace the list builder for buildCommand with a single-expression List.

}

/**
* Callback to `os.ProcessOutput.Readlines` that processes the stderr output
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love os-lib in general but I'm not a fan os os.Path, which is required to use os.walk. I've never felt the need to use os.walk anyways because Files.walkFileTree works very reliably IME, even if it's a bit verbose.

val ConfigFileName = "lsif-java.json"
val ConfigFileNames = List("scip-java.json", "lsif-java.json")
val isIgnoredAnnotationProcessor = Set(
"",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed "", it wasn't needed.

@olafurpg
Copy link
Member Author

FYI @sluongng Automatically indexing Bazel/Java codebases via aspects. It's still not fully hermetic but it still manages to index many Java codebases that I've tried it against (big improvement over requiring non-trivial manual configuration!). The aspect could become fully hermetic with a few more improvements that I don't have time to work on right now.

@olafurpg olafurpg enabled auto-merge July 13, 2023 14:09
Copy link
Contributor

@keynmol keynmol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited for this!

Can't comment much on how idiomatic/correct the starlark script is, but seems fine for what it does.

Very very nice!

@olafurpg olafurpg merged commit 9dbad49 into main Jul 13, 2023
@olafurpg olafurpg deleted the olafurpg/aspect branch July 13, 2023 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants