Internal panic (conflicting position information on upstream object) with latest rules_go #149

kpaulisse opened this issue Dec 5, 2023 · 5 comments

@kpaulisse

After updating to the latest rules_go and the latest version of nilaway, I am now getting the following panic:

<path redacted>/create.go:1:1: error: INTERNAL PANIC: conflicting position information on upstream object go.mongodb.org/mongo-driver/mongo.Cursor.M7: existing: external/org_mongodb_go_mongo_driver/mongo/cursor.go:292:18, got: external/org_mongodb_go_mongo_driver/mongo/cursor.go:292:1
goroutine 76 [running]:
runtime/debug.Stack()
	GOROOT/src/runtime/debug/stack.go:24 +0x64
go.uber.org/nilaway/accumulation.run.func1()
	external/org_uber_go_nilaway/accumulation/analyzer.go:85 +0x44
panic({0x102a6ec40?, 0x140023e73e0?})
	GOROOT/src/runtime/panic.go:914 +0x218
go.uber.org/nilaway/inference.newPrimitivizer.func1({{{0x14000f34c00, 0x34}, 0x123, 0x124, 0x1}, {0x1400116a150, 0x21}, {0x14000efd338, 0x16}, 0x0, ...}, ...)
	external/org_uber_go_nilaway/inference/primitive.go:157 +0x218
go.uber.org/nilaway/inference.(*InferredMap).OrderedRange(...)
	external/org_uber_go_nilaway/inference/inferred_map.go:97
go.uber.org/nilaway/inference.newPrimitivizer(0x14001b0ec30)
	external/org_uber_go_nilaway/inference/primitive.go:150 +0x274
go.uber.org/nilaway/inference.NewEngine(0x14001b0ec30, {0x102ae65c0?, 0x140022ec810})
	external/org_uber_go_nilaway/inference/engine.go:61 +0x2c
go.uber.org/nilaway/accumulation.run(0x14001b0ec30)
	external/org_uber_go_nilaway/accumulation/analyzer.go:118 +0x2e4
main.(*action).execOnce(0x140001e74a0)
	external/io_bazel_rules_go/go/tools/builders/nogo_main.go:353 +0x784
sync.(*Once).doSlow(0x6d2f782f72657669?, 0x6972642f6f676e6f?)
	GOROOT/src/sync/once.go:74 +0x100
sync.(*Once).Do(...)
	GOROOT/src/sync/once.go:65
main.(*action).exec(...)
	external/io_bazel_rules_go/go/tools/builders/nogo_main.go:280
main.execAll.func1(0x2e6e6f6974636173?)
	external/io_bazel_rules_go/go/tools/builders/nogo_main.go:274 +0x74
created by main.execAll in goroutine 74
	external/io_bazel_rules_go/go/tools/builders/nogo_main.go:272 +0x4c
 (nilaway)

✅ nilaway v0.0.0-20231204220708-2f6a74d7c0e2 works correctly with:

  • rules_go v0.42
  • golang.org/x/tools v0.11.0
  • go.mongodb.org/mongo-driver v0.12.2

❌ nilaway v0.0.0-20231204220708-2f6a74d7c0e2 panics with:

  • rules_go v0.43
  • golang.org/x/tools v0.15.0
  • go.mongodb.org/mongo-driver v0.12.2

✅ nilaway v0.0.0-20231204220708-2f6a74d7c0e2 works correctly with:

  • rules_go v0.43 with x-tools.patch
  • golang.org/x/tools v0.11.0
  • go.mongodb.org/mongo-driver v0.12.2

✅ nilaway v0.0.0-20231204220708-2f6a74d7c0e2 works correctly with latest master of rules_go:

  • rules_go b4b04b8dcb221655f64d7eb345ca53b797967d68 with x-tools.patch
  • golang.org/x/tools v0.11.0
  • go.mongodb.org/mongo-driver v0.12.2

Finally, here's the permalink to the line in the mongo-driver package that is apparently causing the panic: https://github.com/mongodb/mongo-go-driver/blob/1f344bd232e6dd3eb70f9d1df3e82cb532191200/mongo/cursor.go#L292

@yuxincs
Contributor

yuxincs commented Dec 5, 2023

I can confirm this problem is reproducible, and it seems to be related to the x/tools upgrade (the problem exists from v0.11.1 through the latest version).

For now, you can pin x/tools to v0.11.0 with a replace directive in your go.mod file (https://go.dev/ref/mod#go-mod-file-replace) to avoid this panic. We will investigate further and raise a PR with a fix (contributions welcome!)
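
For example, a minimal go.mod sketch of that pin (the module path and the other entries are just placeholders; only the replace line matters):

module example.com/yourservice

go 1.21

require (
	go.uber.org/nilaway v0.0.0-20231204220708-2f6a74d7c0e2
	golang.org/x/tools v0.15.0
)

// Pin x/tools to the last version known to work with NilAway's fact export scheme.
replace golang.org/x/tools => golang.org/x/tools v0.11.0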

@yuxincs added the bug label Dec 5, 2023
@philipuvarov

philipuvarov commented Apr 2, 2024

Hey there! I have encountered the same issue today. Is there a possible workaround for it? I'm on x/tools v0.18.0.

@philipuvarov

Checked it on x/tools 0.19.0 and rules_go 0.46.0 with the same results. I have also gone through the source of rules_go a bit, and there does not seem to be an easy way to patch it. We have tried running NilAway as a standalone checker, but the overhead becomes prohibitively large (somehow I was hitting 50GB+ RAM usage from nilaway on my machine, hehe).

@yuxincs @sonalmahajan15 do you currently have any plans to look into this one?

PS / off-topic: it would be nice to have some guidelines on how (if at all possible) to run the plugin locally; I think it would simplify investigations and make contributions easier. I am wondering whether something similar to --override_repository= is possible.

@yuxincs
Contributor

yuxincs commented Apr 5, 2024

Hi @philipuvarov, we have investigated the issue, and the root cause is that the x/tools library applied an optimization to no longer export facts of transitive dependencies; it now only exports the facts of a package's direct dependencies. This is a nice optimization for performance and storage. However, it broke NilAway's assumption, since we carefully implemented export logic that only stores the incremental parts (i.e., the new knowledge we gained after analyzing a particular package).

Say we have package dependencies like foo -> bar -> baz, where baz imports bar and bar imports foo. Now, when NilAway analyzes package baz, it won't be able to see facts about foo anymore, even though the facts we store for bar are only incremental knowledge on top of foo.
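
To make that assumption concrete, here's a minimal, purely illustrative analyzer sketch using the x/tools analysis API; knownNilability and collectSites are made-up stand-ins for NilAway's real inference artifacts:

// Hypothetical illustration only; this is not NilAway's actual code.
package example

import (
	"go/types"

	"golang.org/x/tools/go/analysis"
)

// knownNilability is a stand-in fact type for the incremental knowledge
// NilAway exports per package.
type knownNilability struct{ Sites []string }

func (*knownNilability) AFact() {}

var Analyzer = &analysis.Analyzer{
	Name:      "factvisibility",
	Doc:       "illustrates direct-vs-transitive package fact visibility",
	FactTypes: []analysis.Fact{new(knownNilability)},
	Run:       run,
}

func run(pass *analysis.Pass) (interface{}, error) {
	// Gather facts from imported packages. With the x/tools optimization
	// described above, only facts exported by direct dependencies are
	// visible here; facts produced for transitive dependencies (e.g.,
	// foo's facts when analyzing baz in the example above) are no longer
	// available.
	for _, imp := range pass.Pkg.Imports() {
		var f knownNilability
		if pass.ImportPackageFact(imp, &f) {
			_ = f // merge upstream knowledge here
		}
	}
	// Export only the knowledge newly gained for this package (the
	// "incremental" scheme), which relies on downstream analyses still
	// being able to see the upstream facts it builds on.
	pass.ExportPackageFact(&knownNilability{Sites: collectSites(pass.Pkg)})
	return nil, nil
}

func collectSites(pkg *types.Package) []string {
	return []string{pkg.Path()} // placeholder for the real analysis
}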

Naively storing all facts is too expensive in our experiments, and we're investigating smarter ways to tackle this 😃

In the meantime, you can apply the patch below, which basically disables the "panic" you're seeing: it is not a real panic, but an artificial safeguard we placed in NilAway for cases where facts from upstream packages are incomplete (i.e., cases like this).

diff --git a/inference/primitive.go b/inference/primitive.go
index 495dec6..ca360e0 100644
--- a/inference/primitive.go
+++ b/inference/primitive.go
@@ -154,10 +154,11 @@ func newPrimitivizer(pass *analysis.Pass) *primitivizer {

 			objRepr := site.PkgPath + "." + string(site.ObjectPath)
 			if existing, ok := upstreamObjPositions[objRepr]; ok && existing != site.Position {
-				panic(fmt.Sprintf(
-					"conflicting position information on upstream object %q: existing: %v, got: %v",
-					objRepr, existing, site.Position,
-				))
+				// panic(fmt.Sprintf(
+				// 	"conflicting position information on upstream object %q: existing: %v, got: %v",
+				// 	objRepr, existing, site.Position,
+				// ))
+				return true
 			}
 			upstreamObjPositions[objRepr] = site.Position
 			return true

This disables the panic, but due to the incomplete knowledge, false negatives might occur. It's definitely less than ideal (which is why we haven't put this patch into the main branch), and we will try to find a better approach soon.

somehow I was hitting 50GB+ RAM usage from nilaway

If you restrict the analysis only to 1st party code, would it still behave that way? (see instructions)
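
For reference, a rough sketch of how restricting NilAway to 1st-party packages might look in the nogo config JSON; the analyzer_flags key and the include-pkgs/exclude-pkgs flag names are assumptions on my part, so please double-check them against the linked instructions:

{
  "nilaway": {
    "analyzer_flags": {
      "include-pkgs": "github.com/your-org/your-repo",
      "exclude-pkgs": "github.com/your-org/your-repo/internal/gen"
    }
  }
}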

PS / off-topic: it would be nice to have some guidelines on how (if at all possible) to run the plugin locally; I think it would simplify investigations and make contributions easier. I am wondering whether something similar to --override_repository= is possible.

Yeah, this is a bit tricky. I tried this some time ago, but to no avail. --override_repository didn't seem to work, since it expects the repository to be managed by bazel (e.g., rules_go), but for plain Go repositories we need gazelle to also play along, which didn't seem to be the case (happy to be proved wrong 😃 ). I also tried adding a replace directive in go.mod and letting gazelle pick it up, but that didn't succeed for reasons I can't remember.

PRs more than welcome if you've got better ideas here, as always.

@philipuvarov

@yuxincs thanks a lot and sorry for the late response!

If you restrict the analysis only to 1st party code, would it still behave that way? (see instructions)

So, funny thing about this: it was actually generated code (protobuf and gRPC) that was causing RAM consumption to go absolutely bonkers, even when running with nogo. I removed the proto packages with exclude_pkgs, which made RAM consumption more reasonable; however, the build was still taking a lot of time (i.e., a target that used to take seconds was taking ~3-5 minutes).

So after that, I removed the generated-code packages from the nogo instrumentation, something like this:

go_register_nogo(
    excludes = [
+        "@//my/package/proto:__subpackages__",
    ],
    includes = [
+        "@//my/package:__subpackages__",
    ],
)

This brought build times back down to normal. Also, after excluding generated code from nogo, the issue with the upstream panic went away.

Not sure what or why, but @yuxincs, if you think it would be valuable to document this nogo setup (if not for the panic, then for performance reasons), I can make a small PR to mention it in the docs (though it is more about nogo than NilAway).
