ant-bootstrap@1.7.1 sometimes fails to build on core-updates

  • Done
Details
4 participants
  • Björn Höfling
  • Gábor Boskovits
  • Chris Marusich
  • Ricardo Wurmus
Owner
unassigned
Submitted by
Chris Marusich
Severity
normal
Chris Marusich wrote on 14 Jan 2018 07:58
(address . bug-guix@gnu.org)
87o9lxapul.fsf@gmail.com
Hi,

At commit 1b321229f4653c5daa873813e24910789c0b2918 (i.e., the current
tip of the core-updates branch), ant-bootstrap@1.7.1 sometimes fails to
build. This package is defined in gnu/packages/java.scm, but it is not
exported (i.e., it is used privately within the module). Note that
according to 'guix refresh', currently 215 packages depend on this
package.

I tried to build this package 147 times. It failed 5 times, and it
succeeded 142 times. That's about a 3% failure rate. All 5 failures
produced the same log output, which I've attached. My machine is an
x86-64 machine.

--
Chris

Gábor Boskovits wrote on 14 Jan 2018 10:32
(name . Chris Marusich)(address . cmmarusich@gmail.com)(address . 30107@debbugs.gnu.org)
CAE4v=pg9Jk+xTj74zW1v1egzK_=w-JNkyXQoWyDXAdXB90No4w@mail.gmail.com
I've also seen this once. No idea so far.
I've tried to look around, but found no other mention of this issue.

Attachment: file
Gábor Boskovits wrote on 18 Jan 2018 10:02
Another kind of failure
(address . 30107@debbugs.gnu.org)
CAE4v=pj1W=D-q4PEunrET2xPGZ-gZjHwQfC__QPJ2gHmBkb0SA@mail.gmail.com
I'm now on 6d49ca16be22e3fb95823ac1780ad9460a18b180.
I also observe another kind of failure now.

After printing

Buildfile: build.xml

the build process hangs.

This failure is also nondeterministic, but its rate is harder to
quantify.
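
One way to put a number on the hangs would be to run each build under a time limit and count the runs that exceed it. The following is only a sketch: it assumes timeout(1) from GNU coreutils, which exits with status 124 when the limit is reached, and the function name run_with_limit is made up for illustration. How the guix daemon reacts when its client is killed this way has not been checked here.

```shell
#!/bin/sh
# Run a command under a time limit and classify the result.
# Usage: run_with_limit LIMIT LOGFILE COMMAND [ARGS...]
# Prints "ok", "hang" (limit reached) or "fail" (any other non-zero status).
# Assumes timeout(1) from GNU coreutils, which exits with 124 on timeout.
run_with_limit() {
    limit=$1
    log=$2
    shift 2
    timeout "$limit" "$@" >"$log" 2>&1
    status=$?
    case $status in
        0)   echo "ok" ;;
        124) echo "hang" ;;
        *)   echo "fail" ;;
    esac
}
```

Calling it as, say, run_with_limit 3600 log-1.log followed by the guix build command would record one of the three outcomes per round. guix build also has its own --timeout and --max-silent-time options, which may be a better fit than an external timeout.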
Attachment: file
Björn Höfling wrote on 26 Jan 2018 11:30
Backtrace
(address . 30107@debbugs.gnu.org)
20180126113029.16199095@alma-ubu
I managed to get a coredump and backtrace, but I'm not able to
extract any useful information. I never went that deep into C
programming. If anyone can get more out of this, attached is the
backtrace, register state and some disassembly.

Björn
Attachment: gdb.txt
Björn Höfling wrote on 3 Feb 2018 09:36
How I got the core dump
(address . 30107@debbugs.gnu.org)
20180203093626.6c927477@alma-ubu
As requested, here is how I got that core dump:

My first step was to investigate the build.sh, and I patched it to
print the full command, stripping out the rest:

diff --git a/bootstrap.sh b/bootstrap.x.sh
index bc54db4..f8c0720 100755
--- a/bootstrap.sh
+++ b/bootstrap.x.sh
@@ -151,18 +151,7 @@ cp src/script/antRun bin
 chmod +x bin/antRun

 echo ... Building Ant Distribution
-
-"${JAVACMD}" -classpath "${CLASSPATH}" -Dant.home=. $ANT_OPTS org.apache.tools.ant.Main -emacs "$@" bootstrap
-ret=$?
-if [ $ret != 0 ]; then
- echo ... Failed Building Ant Distribution !
- exit $ret
-fi
-
-
-echo ... Cleaning Up Build Directories
-
-rm -rf ${CLASSDIR}
-rm -rf bin
+echo I would do:
+echo "${JAVACMD}" -classpath "${CLASSPATH}" -Dant.home=. $ANT_OPTS org.apache.tools.ant.Main -emacs "$@" bootstrap

 echo ... Done Bootstrapping Ant Distribution

I added the patch to the package definition.

As I learned yesterday, I could instead have repacked the sources and
used

guix build --with-source=modified-ant.tar.gz ...

Anyway, I found out it calls:

/gnu/store/088bg6n5llvqn9j7d2740hhhilbqai4a-sablevm-1.13/bin/java-sablevm \
  -classpath build/classes:src/main:lib/xercesImpl.jar:lib/xml-apis.jar: \
  -Dant.home=. -Dbuild.compiler=jikes org.apache.tools.ant.Main -emacs \
  -Ddist.dir=/gnu/store/dxdsdsj4nz7fig92b2xjb7jf7swm5rni-ant-bootstrap-1.7.1 \
  bootstrap

Next, I realized that my setup (Guix on top of Ubuntu) was swallowing my
core dumps, because the kernel pipes them to apport:

$> cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %d %P

So instead I went into my QEMU machine and continued there.

Set the core file size limit to unlimited:

ulimit -c unlimited

sablevm needs to keep its debugging information, so its binaries must
not be stripped.

Add this to the arguments field of its package definition:

#:strip-binaries? #f

Rebuild it:

./pre-inst-env guix build -e '(@@ (gnu packages java) sablevm)' \
  --no-grafts --no-substitutes -K > sablevm.log 2>&1

Remove any /tmp/guix-build-* directories left over from earlier failed
builds.

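Then I ran the build in a loop with this little shell script:

#!/bin/sh

ROUNDS=100

# seq -w pads the counter with zeros so that the log files sort correctly.
for i in $(seq -w 0 "$ROUNDS"); do
    # DATE=$(date)
    # echo $DATE
    printf '%s..' "$i"
    # Keep the command and its redirections on one logical line so that
    # stderr ends up in the log as well.
    ./pre-inst-env guix build -e '(@@ (gnu packages java) ant-bootstrap)' \
        --no-grafts --no-substitutes --check -k -K >"log-$i.log" 2>&1
done
echo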

Then search the logs:

grep Segmentation log-*.log

Hopefully it finds one. Otherwise, repeat the loop above.
Check that the log shows not only a segmentation fault, but also
"(core dumped)" after it. Otherwise, check your ulimit and core file
settings.
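
The two checks on the logs can be combined in a small helper. This is a sketch under the assumption that the shell reports the crash with the usual "Segmentation fault" wording and appends "(core dumped)" when a core file was written; the function name classify_log and its output labels are made up for illustration.

```shell
#!/bin/sh
# Classify one build log by whether it records a segfault and a core dump.
# Assumes the messages "Segmentation fault" and "core dumped" appear verbatim.
classify_log() {
    if grep -q 'Segmentation fault' "$1"; then
        if grep -q 'core dumped' "$1"; then
            echo "segfault-with-core"
        else
            echo "segfault-no-core"
        fi
    else
        echo "no-segfault"
    fi
}

# Report on every log written by the build loop, if there are any.
for log in log-*.log; do
    [ -e "$log" ] || continue    # the glob matched nothing
    printf '%s: %s\n' "$log" "$(classify_log "$log")"
done
```

A log reported as segfault-no-core points at the ulimit or core_pattern settings rather than at the build itself.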

The core dump is in the /tmp/guix-build-ant..-n directory, where n
corresponds to the number of your log file.

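Finally, export the stack trace from within gdb. Set the log file name
before turning logging on; otherwise the output goes to the default
gdb.txt:

set logging file backtrace.log
set logging on
show logging
bt
info reg
quit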
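
The same information can probably be collected without an interactive session by using gdb's -batch and -ex options. The sketch below is an assumption-laden example rather than what was actually run: the function name gdb_backtrace is hypothetical, the GDB variable exists only so that a different debugger binary can be substituted, and the binary and core file paths have to be supplied by the caller.

```shell
#!/bin/sh
# Write a backtrace and the register state from a core file to a log.
# Usage: gdb_backtrace BINARY COREFILE OUTPUT
# GDB can be set in the environment to use a different debugger binary.
gdb_backtrace() {
    "${GDB:-gdb}" -batch -ex 'bt' -ex 'info registers' "$1" "$2" >"$3" 2>&1
}

# Example call (paths are placeholders, not taken from a real build):
#   gdb_backtrace /path/to/java-sablevm /path/to/core backtrace.log
```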


That's it.

Björn
Ricardo Wurmus wrote on 12 Feb 2018 16:15
control message for bug #30107
(address . control@debbugs.gnu.org)
E1elFpg-0005e3-Do@debbugs.gnu.org
tags 30107 fixed
close 30107

This issue is archived.
