This content originally appeared on DEV Community and was authored by Dmitry Daw
Let's say you got a "Segmentation fault" error in Ruby:
[BUG] Segmentation fault at 0x0000000000000028
ruby 3.4.0preview1 (2024-05-16 master 9d69619623) [x86_64-linux-musl]
-- Machine register context ------------------------------------------------
RIP: 0x00007fefe4cd4886 RBP: 0x0000000000000001 RSP: 0x00007fefc95d3a10
RAX: 0x0000000000000001 RBX: 0x00007fefc94212e0 RCX: 0x00007fefc95d0b70
RDX: 0x0000000000000010 RDI: 0x0000000000000000 RSI: 0x00007fefc95d08f0
R8: 0x0000000000000000 R9: 0x0000000000000000 R10: 0x0000000000000000
R11: 0x0000000000000217 R12: 0x00007fefc9421340 R13: 0x00007fff5a0ec750
R14: 0x00007fefe4649b10 R15: 0x00007fefc95d3b38 EFL: 0x0000000000010202
-- Other runtime information -----------------------------------------------
...
The fault address 0x0000000000000028 is close to zero, which suggests that a NULL pointer was dereferenced, but that alone doesn't tell us much. To get more information, you can run your program under gdb:
/app # gdb -q --args ruby test.rb
(gdb)
Now we are in the debugger. To start your program, type run:
(gdb) run
Starting program: /usr/local/bin/ruby test.rb
warning: Error disabling address space randomization: Operation not permitted
[New LWP 36]
[New LWP 37]
[New LWP 38]
execution expired
Thread 4 "ruby" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 38]
0x00007f0a2c33b886 in freeaddrinfo (p=0x0) at src/network/freeaddrinfo.c:10
warning: 10 src/network/freeaddrinfo.c: No such file or directory
Okay, we see that the crash happens in the system's src/network/freeaddrinfo.c.
Let's get a backtrace and check the variables:
(gdb) bt
#0 0x00007f0a2c33b886 in freeaddrinfo (p=0x0) at src/network/freeaddrinfo.c:10
#1 0x00007f0a10c1e940 in do_getaddrinfo (ptr=0x7f0a10f61200) at raddrinfo.c:426
#2 0x00007f0a2c35c349 in start (p=0x7f0a10afaa88) at src/thread/pthread_create.c:207
#3 0x00007f0a2c35e95f in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
(gdb) info args
p = 0x0
Okay, now we see that the call comes from Ruby's raddrinfo.c, and that the argument p is NULL.
We can also move up the stack and check the variables there:
(gdb) info locals
cnt = 1
b = <optimized out>
(gdb) frame 1
#1 0x00007f1068ec6940 in do_getaddrinfo (ptr=0x7f1068cf0c40) at raddrinfo.c:426
warning: 426 raddrinfo.c: No such file or directory
(gdb) info args
ptr = 0x7f1068cf0c40
(gdb) info locals
arg = 0x7f1068cf0c40
err = <optimized out>
gai_errno = <optimized out>
need_free = 0
Now we're ready to look at what is happening in the code. Let's check raddrinfo.c, line 426:
// ext/socket/raddrinfo.c
...
if (arg->cancelled) {
freeaddrinfo(arg->ai);
}
...
Indeed, freeaddrinfo is called there.
Now you could file a bug at https://bugs.ruby-lang.org/, or try to debug it yourself.
I tried :)
We're on Alpine, using the ruby:3.3.3-alpine image. On a different system, e.g. ruby:3.3.3, everything works, so it should be something Alpine-specific.
A bit of searching confirms it: freeaddrinfo in Alpine's musl library does not accept a NULL pointer, unlike glibc (which is used e.g. in Ubuntu).
So let's fix it.
First we need to build Ruby. For convenience, let's create a small Dockerfile:
FROM alpine:3.20
WORKDIR /usr/src/app
RUN apk update && apk add autoconf gcc build-base ruby ruby-dev openssl openssl-dev yaml-dev zlib-dev yaml gdb
CMD sh
Clone Ruby:
$ git clone --depth 1 git@github.com:ruby/ruby.git
$ cd ruby
Enter our Alpine Docker container:
$ docker build -t my-ruby-develop .
$ docker run -it --rm -v $(pwd):/usr/src/app -w /usr/src/app my-ruby-develop sh
And build the latest Ruby version to check that the bug is still present:
$ ./autogen.sh
$ mkdir build && cd build
$ mkdir rubies
$ ../configure --prefix="/usr/src/app/build/rubies/ruby-master" && make && make install
Now we have our freshly built Ruby in the /usr/src/app/build/rubies/ruby-master folder.
Let's check that the problem still exists:
/app # /usr/src/app/build/rubies/ruby-master/bin/ruby test.rb
Operation timed out - user specified timeout
[BUG] Segmentation fault at 0x0000000000000028
-- Machine register context ------------------------------------------------
RIP: 0x00007f561acf6886 RBP: 0x0000000000000001 RSP: 0x00007f55ff5d2a10
RAX: 0x0000000000000001 RBX: 0x00007f55ff43ff30 RCX: 0x00007f55ff5cfb70
RDX: 0x0000000000000010 RDI: 0x0000000000000000 RSI: 0x00007f55ff5cf8f0
R8: 0x0000000000000000 R9: 0x0000000000000000 R10: 0x0000000000000000
R11: 0x0000000000000217 R12: 0x00007f55ff43ff90 R13: 0x00007f55ff236040
R14: 0x00007f55ff236b38 R15: 0x00007f55ff5d2b38 EFL: 0x0000000000010202
It definitely does.
Then we need to:
- check in which cases the problem happens
- find out whether it should be fixed in Alpine or in Ruby
- check whether other places could be affected by the same problem
- make a reproducible example, and so on: typical engineering work.
In our case, it should be fixed in Ruby. Let's make the change:
// ext/socket/raddrinfo.c
...
if (arg->cancelled) {
if (arg->ai) freeaddrinfo(arg->ai);
}
...
Rebuild Ruby (running make clean first) and try again. It works!
/app # make clean && ../configure --prefix="/usr/src/app/build/rubies/ruby-master" && make && make install
/app # /usr/src/app/build/rubies/ruby-master/bin/ruby test.rb
Good
Great! Now we can run the tests, first for the related file and then the whole suite:
$ make test-all TESTS=../test/socket/test_addrinfo.rb
$ make test-all
$ make test-spec
And, if possible, write a test for your change (in this case it is hard to write a reliable test because of getaddrinfo internals).
Don't forget to describe the bug in Ruby's bug tracker and attach all useful info (e.g. https://bugs.ruby-lang.org/issues/20592).
And voilà! You've made the world a bit better: https://github.com/ruby/ruby/commit/fba8aff7af450e476e97b62385427dfa51850955
Links
- How to contribute: https://github.com/ruby/ruby/wiki/How-To-Contribute
- Developer How To: https://github.com/ruby/ruby/wiki/Developer-How-To
- How to work with bugs in Ruby: https://github.com/ko1/rubyhackchallenge/blob/master/EN/4_bug.md
- How to build Ruby: https://docs.ruby-lang.org/en/master/contributing/building_ruby_md.html
- How to test Ruby: https://docs.ruby-lang.org/en/master/contributing/testing_ruby_md.html
- Materials on Ruby internals: https://github.com/ko1/rubyhackchallenge/blob/master/bib.md
Dmitry Daw | Sciencx (2024-06-22T19:16:22+00:00) How to fix a segfault in Ruby. Retrieved from https://www.scien.cx/2024/06/22/how-to-fix-a-segfault-in-ruby/