Ticket #2061 (new defect)

Opened 13 years ago

Last modified 13 years ago

"Illegal Instruction" when building on newer CPU and then running on older CPU

Reported by: Abscissa
Assigned to: community
Priority: major
Milestone: 1.0
Component: Tango
Version: 0.99.9 Kai
Keywords:
Cc:

Description

blank.d:

void main()
{
}

If I compile that on a newer CPU, then copy the binary to an older CPU (even with the same version of Linux) it will fail with the message "Illegal Instruction". It works fine if I use Phobos instead of Tango (so it doesn't seem to be a compiler issue). Even Tango's own bob binary is affected: On the older CPU, it immediately bails with "Illegal Instruction".

The specific steps I took to test:

1. I started on a newer machine: 32-bit Linux (Kubuntu 10.04), Intel Pentium Dual-Core M T2370 (dual-core, 64-bit, SSSE3).

2. I installed DMD 1.066 with Phobos and ran:

$ dmd blank.d -ofphobos
$ ./phobos
$

3. I installed DMD 1.066 with Tango and ran:

$ dmd blank.d -oftango
$ ./tango
$

4. I copied the binaries to an older machine with the same operating system: 32-bit Linux (Kubuntu 10.04), AMD Athlon XP (i686, 32-bit, single core, SSE1), and ran:

$ ./phobos
$
$ ./tango
Illegal Instruction
$

Change History

07/13/11 20:45:13 changed by larsivi

Phobos vs Tango isn't likely to be a relevant differentiator, I think. Without the actual illegal instruction here, it will be rather difficult to say for sure (it could be some assembly code in Tango), but normally I would say that this is a compiler bug. Tango and Phobos are so different that some code in Tango could trigger codegen that is never exposed through Phobos, or at least not by your given test programs.

07/14/11 20:13:21 changed by Abscissa

Yea, I suppose that is possible.

The actual source and binaries are here:

http://www.semitwist.com/download/app/dtests/

Note that I did do more tests than just what I mentioned in this ticket, but the files that turned out to be relevant are:

blank.d d1.066phobos_blank d1.066tango_blank

FWIW, all the Phobos binaries worked on my older system, and all the Tango ones gave me "Illegal Instruction". The C ones, oddly enough, just gave me a dynamic linking error on the old system (I forget the exact message). Again, you're right that this could still be a compiler bug that just happened to be exposed by Tango and not Phobos.

07/14/11 20:29:53 changed by Abscissa

My formatting on that list of relevant files got messed up; trying again:

blank.d
d1.066phobos_blank
d1.066tango_blank

Here's a screendump of a debugging session on the offending binary:

$ gdb d1.066tango_blank 
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/nick/dev/d/proj/dtests/d1.066tango_blank...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/nick/dev/d/proj/dtests/d1.066tango_blank 
[Thread debugging using libthread_db enabled]

Program received signal SIGILL, Illegal instruction.
0x08075ab3 in _D5tango4core4sync6Atomic31__T13memoryBarrierVb1Vb0Vi0Vb0Z13memoryBarrierFZv ()
(gdb) disass
Dump of assembler code for function _D5tango4core4sync6Atomic31__T13memoryBarrierVb1Vb0Vi0Vb0Z13memoryBarrierFZv:
   0x08075ab0 <+0>:     push   %ebp
   0x08075ab1 <+1>:     mov    %esp,%ebp
=> 0x08075ab3 <+3>:     lfence 
   0x08075ab6 <+6>:     pop    %ebp
   0x08075ab7 <+7>:     ret    
End of assembler dump.

So it looks like the offending instruction is "lfence" inside the function _D5tango4core4sync6Atomic31__T13memoryBarrierVb1Vb0Vi0Vb0Z13memoryBarrierFZv.

07/14/11 21:17:07 changed by larsivi

If on IRC, try asking Fawzi about this? I'm currently on vacation and a bit restricted in terms of online activities.

07/30/11 08:11:34 changed by Abscissa

FWIW, the "lfence" and "mfence" instructions used in tango.core.sync.Atomic exist only in CPUs with SSE2, which rules out all 32-bit AMD Athlons and all P3's (but "sfence" only needs SSE, so it exists all the way back to the Pentium 3).
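
For reference, here is a minimal sketch of how a program could check for SSE2 at runtime before touching "lfence" or "mfence". It is written in C for GCC/Clang rather than D, it is not Tango code, and the helper name cpu_has_sse2 is made up for illustration. It relies on the documented CPUID feature flag (leaf 1, EDX bit 26 = SSE2) and on the __get_cpuid helper from the compiler's cpuid.h.

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical helper: true if the running CPU reports SSE2 support. */
static bool cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not available. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;

    /* CPUID leaf 1, EDX bit 26 is the SSE2 feature flag. */
    return ((edx >> 26) & 1u) != 0;
#else
    /* Not an x86 CPU, so lfence/mfence do not exist here at all. */
    return false;
#endif
}
```

On the Athlon XP from this ticket the check would be expected to return false, since that CPU only has SSE1.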

The instructions are included via a version(D_InlineAsm_X86) block, so I'm unsure why the instructions are omitted when compiling *on* a CPU that doesn't support them. Maybe DMD (or ld?) is smart enough to get rid of the instruction if the local machine doesn't support it?

In any case, a person shouldn't be required to actually use an older system just to build an executable that works on older systems. This currently causes problems for both Tango's bob and DVM.

I'm thinking maybe the content of tango.core.sync.Atomic (or maybe of tango.core.Thread and tango.core.ThreadPool, the only two users of tango.core.sync.Atomic) should be put in a template, templated on "bool sse2". At application startup, SSE2 support could be detected and function pointers set to the appropriate sse2==true or sse2==false versions. And if the performance cost of that indirection is a worry, then it could all just be bypassed with an AssumeSSE2 version identifier (or maybe self-modifying code?). I may give this a try myself if I get a chance.
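
To make the function-pointer idea concrete, here is a rough sketch in C (GCC/Clang inline asm), not D and not a proposed Tango patch. All names (barrier_fn, load_barrier, barriers_init, and so on) are made up for illustration, and the SSE2 detection result is simply passed in by the caller. The fallback uses a locked no-op add on the stack, since a locked read-modify-write acts as a full barrier on x86 and needs no SSE2.

```c
#include <stdbool.h>

typedef void (*barrier_fn)(void);

/* SSE2 variant: uses lfence directly (only valid on CPUs with SSE2). */
static void load_barrier_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("lfence" ::: "memory");
#else
    __sync_synchronize();   /* non-x86 build: compiler-provided barrier */
#endif
}

/* Fallback variant: a locked add of 0 to the top of the stack. */
static void load_barrier_compat(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0, (%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
    __asm__ __volatile__("lock; addl $0, (%%esp)" ::: "memory", "cc");
#else
    __sync_synchronize();   /* non-x86 build: compiler-provided barrier */
#endif
}

/* Starts out pointing at the variant that is safe on every CPU. */
static barrier_fn load_barrier = load_barrier_compat;

/* Hypothetical startup hook: the caller supplies the detection result
   (for example from a CPUID check) and the pointer is set once. */
static void barriers_init(bool has_sse2)
{
    load_barrier = has_sse2 ? load_barrier_sse2 : load_barrier_compat;
}
```

Callers would run barriers_init(...) once at startup and then call load_barrier() wherever the barrier is needed. An AssumeSSE2-style build switch could skip the pointer and call the SSE2 variant directly.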

Sources:

http://courses.engr.illinois.edu/ece390/books/labmanual/inst-ref-general.html#INST-REF-LFENCE

http://en.wikipedia.org/wiki/Athlon

http://en.wikipedia.org/wiki/Sse2#Notable_IA-32_CPUs_not_supporting_SSE2