Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression when upgrading to 1.1.2 or greater: "make install" breaks gRPC native builds on certain macos CI jobs because ginstall corrupts the shared library #210

Closed
apolcyn opened this issue Jan 19, 2023 · 6 comments

Comments

@apolcyn
Copy link

apolcyn commented Jan 19, 2023

In grpc/grpc#32089, I'm upgrading the gRPC gem's rake-compiler dependency from <= 1.1.1 to ~> 1.2.1 in order to add ruby 3.2 pre-compiled gems.

I encountered one strange regression that only happened on certain macos CI jobs, though. This problematic job does a native build of the grpc extension binary just in order to run unit tests - it simply runs rake compile to do that.

However, unit tests were failing to load the resulting src/ruby/lib/grpc/grpc_c.bundle file. They were getting (not a mach-o file) errors like the following:

<internal:/Users/kbuilder/.rvm/rubies/ruby-3.2.0/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require': dlopen(/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc/grpc_c.bundle, 0x0009): tried: '/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc/grpc_c.bundle' (not a mach-o file) - /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc/grpc_c.bundle (LoadError)
	from <internal:/Users/kbuilder/.rvm/rubies/ruby-3.2.0/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc/grpc.rb:22:in `<top (required)>'
	from /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc.rb:19:in `require_relative'
	from /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc.rb:19:in `<top (required)>'
	from <internal:/Users/kbuilder/.rvm/rubies/ruby-3.2.0/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from <internal:/Users/kbuilder/.rvm/rubies/ruby-3.2.0/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/end2end/end2end_common.rb:24:in `<top (required)>'
	from src/ruby/end2end/client_memory_usage_test.rb:17:in `require_relative'
	from src/ruby/end2end/client_memory_usage_test.rb:17:in `<main>'

I added some logging and observed that the file command also did not recognize the grpc_c.bundle file as a mach-o file:

+ file src/ruby/lib/grpc/grpc_c.bundle
src/ruby/lib/grpc/grpc_c.bundle: data

I tracked the problem down to what looks like a buggy ginstall command:

The Makefile that's generated from out extconf.rb contains an install: target that looks like the following on these faulty jobs:

usr/local/opt/coreutils/bin/ginstall -c -m 0755 grpc_c.bundle /Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_ruby_macos_dbg_native/src/ruby/lib/grpc

^ that command is supposed to simply copy tmp/x86_64-darwin21/grpc_c/2.7.7/grpc_c.bundle to src/ruby/lib/grpc/grpc_c.bundle. But running a diff shows that the ginstall command is somehow modifying the file:

+ diff src/ruby/lib/grpc/grpc_c.bundle tmp/x86_64-darwin21/grpc_c/2.7.7/grpc_c.bundle
Binary files src/ruby/lib/grpc/grpc_c.bundle and tmp/x86_64-darwin21/grpc_c/2.7.7/grpc_c.bundle differ

Weirdly, this bad ginstall modification only happens when tmp/x86_64-darwin21/grpc_c/2.7.7/grpc_c.bundle is not stripped.

I was able to work around this with an ugly hack to our extconf.rb that avoids using ginstall within the make install target:

puts 'Overriding the generated Makefile install target to use cp'
  File.open('Makefile.new', 'w') do |o|
    File.foreach('Makefile') do |i|
      if i.start_with?('INSTALL_PROG = ')
        override = 'INSTALL_PROG = cp'
        puts "Replacing generated Makefile line: |#{i}|, with: |#{override}|"
        o.puts override
      else
        o.puts i
      end
    end
  end
  File.rename('Makefile.new', 'Makefile')

Filing this issue here to see if this is a known issue or if there are better workarounds. I think this is a bug in ginstall, but it's hard to be sure since I haven't been able to reproduce locally or get more debug output from ginstall.

@apolcyn apolcyn changed the title Regression when upgrading to 1.1.2+: "make install" breaks gRPC native builds on certain macos CI jobs because ginstall corrupts the shared library Regression when upgrading to 1.1.2 or greater: "make install" breaks gRPC native builds on certain macos CI jobs because ginstall corrupts the shared library Jan 19, 2023
@flavorjones
Copy link
Contributor

I tried to reproduce this on my (pretty vanilla) macos development machine and cannot. The command run for me is

/usr/bin/install -c -m 0755 grpc_c.bundle /Users/flavorjones/code/grpc/src/ruby/lib/grpc

Looking at Ruby's mkmf.rb (which generates the Makefile), this is the interesting line that chooses the install program:

INSTALL = #{config_string('INSTALL', &possible_command) || '@$(RUBY) -run -e install -- -vp'}
INSTALL_PROG = #{config_string('INSTALL_PROG') || '$(INSTALL) -m 0755'}

and the install-so target uses $(INSTALL_PROG).

The config_string method is going to look in the RbConfig::MAKEFILE_CONFIG configuration. Is seems likely that the Ruby you're using to build the gem in the test environment is configured to use ginstall. If you change that to use the default install instead, is this problem avoided?

@flavorjones
Copy link
Contributor

Ah, OK, I re-ran my repro attempt with Ruby 3.2 (previous run was with Ruby 3.1) and I'm also seeing:

/opt/homebrew/bin/ginstall -c -m 0755 grpc_c.bundle /Users/flavorjones/code/grpc/src/ruby/lib/grpc

but I can't reproduce what you're seeing with the modification of the file.

Also, I looked in the configuration file for Ruby and there doesn't appear to be an option to set the install program. It chooses ginstall, scoinst, install in that order. So I don't know if there's a workaround other than to override the value of RbConfig::MAKEFILE_CONFIG["INSTALL"] in the extconf.rb.

@kou
Copy link
Member

kou commented Jan 21, 2023

@kou
Copy link
Member

kou commented Aug 2, 2023

Can we close this as "no feedback"?

@flavorjones
Copy link
Contributor

@kou I think that's right.

@kou
Copy link
Member

kou commented Aug 3, 2023

Let's close this.

@kou kou closed this as not planned Won't fix, can't repro, duplicate, stale Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants