-
Notifications
You must be signed in to change notification settings - Fork 24
/
Copy pathREADME
439 lines (367 loc) · 20.7 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
Stalin - a STAtic Language ImplementatioN
Finally, a Lisp compiler that does what it should...
Jeffrey Mark Siskind
School of Electrical and Computer Engineering
Purdue University
Electrical Engineering Building, Room 330
465 Northwestern Avenue
West Lafayette IN 47907-2035 USA
voice: 765/496-3197
fax: 765/494-6440
http://www.ece.purdue.edu/~qobi
Monday 2 October 2006
INTRODUCTION
This is release 0.11 of Stalin, a global optimizing compiler for Scheme. It is
an alpha release and contains many known bugs. Not everything is fully
implemented. This is the first self-hosting release. Stalin can now compile
itself. Unlike previous releases, this release does not require you to have
Scheme->C. And it does not require you to install Infrastructure or
QobiScheme.
Stalin is an extremely efficient compiler for Scheme. It is designed to be
used not as a development tool but rather as a means to generate efficient
executable images either for application delivery or for production research
runs. In contrast to traditional Scheme implementations, Stalin is a
batch-mode compiler. There is no interactive READ-EVAL-PRINT loop. Stalin
compiles a single Scheme source file into an executable image (indirectly via
C). Running that image has equivalent semantics to loading the Scheme source
file into a virgin Scheme interpreter and then terminating its execution. The
chief limitation is that it is not possible to LOAD or EVAL new expressions or
procedure definitions into a running program after compilation. In return for
this limitation, Stalin does substantial global compile-time analysis of the
source program under this closed-world assumption and produces executable
images that are small, stand-alone, and fast.
Stalin incorporates numerous strategies for generating efficient code. Among
them, Stalin does global static type analysis using a soft type system that
supports recursive union types. Stalin can determine a narrow or even
monomorphic type for each source code expression in arbitrary Scheme programs
with no type declarations. This allows Stalin to reduce, or often eliminate,
run-time type checking and dispatching. Stalin also does low-level
representation selection on a per-expression basis. This allows the use of
unboxed base machine data representations for all monomorphic types resulting
in extremely high-performance numeric code. Stalin also does global static
life-time analysis for all allocated data. This allows much temporary
allocated storage to be reclaimed without garbage collection. Finally, Stalin
has very efficient strategies for compiling closures. Together, these
compilation techniques synergistically yield efficient object code.
Furthermore, the executable images created by Stalin do not contain
(user-defined or library) procedures that aren't called, variables and
parameters that aren't used, and expressions that cannot be reached. This
encourages a programming style whereby one creates and uses very general
library procedures without fear that executable images will suffer from code
bloat.
IMPORTANT NOTE
Contrary to any other indication in the documentation, this release does not
run on DEC/Alpha under Linux due to limitations of gcc (it is not able to
compile the file stalin-Alpha.c because it has too many global variables). It
is possible to run Stalin under Linux on DEC/Alpha when it is compiled with
Scheme->C (rather than self compiled) but I am not currently distributing that
version. Contact me if you need to run Stalin under Alpha Linux and I will
send you the Scheme->C version.
It might be possible to compile stalin-Alpha.c with the DEC C compiler under
Digital Unix. I do not have access to this environment to try this. If
anybody else does, I would appreciate any feedback as to whether this works.
DEVIATIONS
Stalin nominally accepts the language defined by The Revised^4 Report on the
Algorithmic Language Scheme (R4RS). The language accepted by Stalin is known
to deviate from R4RS in a number of ways as described below. (If you discover
additional differences, please send mail to [email protected].) Many of
these simply are features that are not (yet) implemented though I make no
commitment to fully comply with R4RS and reserve the right to deviate from
R4RS for any reason whatsoever, particularly performance.
1. Unimplemented syntax:
4.2.6 Nested quasiquotation is not supported.
appendix Macros
2. Unimplemented procedures:
6.5.5 NUMERATOR
DENOMINATOR
RATIONALIZE
MAKE-RECTANGULAR
MAKE-POLAR
REAL-PART
IMAG-PART
MAGNITUDE
ANGLE
6.10.4 LOAD
TRANSCRIPT-ON
TRANSCRIPT-OFF
3. (1.1) Tail recursion optimization is done only on self calls.
4. (6.5) The following numeric data types are not supported:
a. bignums: exact integers of arbitrary magnitude
Furthermore, exact integer arithmetic can overflow without signaling an
error and without yielding an inexact floating point number.
b. ratios: exact non-integer rationals
c. polar format complex numbers
d. exact rectangular format complex numbers
5. It is not possible to access all of the underlying C scalar types. (This
is not an incompatibility with R4RS but nonetheless important for other
reasons.)
a. No independent control over the float/double distinction.
b. No long/short chars/ints/floats/doubles
c. No unsigned chars/ints
6. (1.3.1) No R4RS compliance mode is provided.
7. Limitations of (6.5.6) STRING->NUMBER, (6.10.2) READ, and the source
program:
a. Can't parse largest negative number.
b. Can't parse polar numbers with @.
c. Can't parse rectangular numbers with i.
d. Can't parse ratios with /.
e. Can't parse numbers with embedded #.
f. Can't parse exactness with #E and #I.
g. Can't parse inexact numbers with mantissas where the digit string
before the decimal point would overflow if read as an exact number,
h. On Intel, SPARC, and SGI, the source program can't contain exact
integer constants <-536870912 or >536870911. On Intel and SPARC, the
compiler will generate incorrect C code without warning. On SGI, the
compiler might signal an exception or might generate incorrect C code
without warning. On Alpha, the source program can't contain exact
integer constants <-2305843009213693952 or >2305843009213693951. The
compiler will generate incorrect C code without warning.
8. (6.5.6) STRING->NUMBER doesn't accept the optional radix argument.
9. APPLY and CALL-WITH-CURRENT-CONTINUATION don't accept continuations or
foreign procedures as their first argument.
10. (6.2) EQV? might return #T when given two procedures that can never be
called.
(BEGIN (DEFINE (F X) (LAMBDA () X)) (EQV? (F 1) (F 2))) ==> #T
EQV? might return #T when given two fictitious pairs or degenerate vectors.
(EQV? (CONS #T #T) (CONS #T #T)) ==> #T
(EQV? (VECTOR #T) (VECTOR #T)) ==> #T
11. (6.5.5) =, <, >, <=, >= might not be transitive.
12. (6.5.6) STRING->NUMBER, READ, DISPLAY, and WRITE don't obey write/read
invariance for inexact numbers.
13. (APPLY (LAMBDA X X) y) ==> y without checking that y is a list and without
copying y. Furthermore, because of the way LIST is defined, (APPLY LIST X)
doesn't copy X.
14. (6.10.1) Closing a port that has already been closed yields undefined
behaviour rather than having no effect.
EXTENSIONS
Stalin extends R4RS in a number of ways:
1. New syntax: PRIMITIVE-PROCEDURE and FOREIGN-PROCEDURE.
2. New procedures: LIST-LENGTH, SUBLIST, SUB, LIST-APPEND, LIST-REVERSE, REF,
LIST-SET!, REF!, LIST-FILL!, FILL!, LIST-COPY, STRING->UNINTERNED-SYMBOL,
STRING-REVERSE, <<, >>, BITWISE-NOT, BITWISE-AND, BITWISE-OR,
MAKE-DISPLACED-VECTOR, SUBVECTOR, VECTOR-APPEND, VECTOR-REVERSE,
VECTOR-COPY, PANIC, POINTER?, INTEGER->STRING, INTEGER->INPUT-PORT,
INTEGER->OUTPUT-PORT, and INTEGER->POINTER.
3. New data type: pointer. ZERO? can be used to check for null pointers
(as well as null strings and null ports). INTEGER->POINTER can be used to
convert an integer address to a pointer. INTEGER->STRING,
INTEGER->INPUT-PORT, and INTEGER->OUTPUT-PORT can be used to convert an
integer address to a string, input port, or output-port, respectively.
4. New variable: ARGV is bound to a vector of strings containing the
command line arguments.
5. If the last expression executed by the program evaluates to an integer,
then its value is returned as the status code of the program. Otherwise,
the status code zero is returned.
6. Vectors are self evaluating.
7. DO iterator list can be empty.
8. Bodies can be empty.
9. The commands of DO and the expressions of COND, CASE, BEGIN, and DO are
considered bodies.
10. Bodies can have definitions intermixed with expressions.
11. Definitions are executed in order and can reference variables defined by
previous definitions.
12. Any body with just definitions treated like BEGIN.
13. Binding can be just a variable in LET, LET*, and LETREC.
14. DEFINE can take a single argument.
15. (EQV? "" "") ==> #T
(EQV? #() #()) ==> #T
16. The procedures LENGTH, APPEND, REVERSE, and COPY are generic.
17. EOF objects, input ports, and output ports are disjoint from each other
and all other data types.
18. All EOF objects are EQ?.
A future version of this document will describe all of the above extensions in
greater detail.
INSTALLATION
The Stalin compiler is written in Scheme and can compile itself. (While I
have run Stalin under other Scheme implementations such as Scheme->C, SCM, and
Gambit-C, this requires some modification of the Stalin source code.) To
compile and use Stalin, you must have a C compiler installed.
To install Stalin, do:
% uncompress stalin-0.11.tar.Z
% tar xf stalin-0.11.tar
% cd stalin-0.11
% ./build
To test Stalin, do:
% cd benchmarks
% ./compile-and-run-stalin-benchmarks
which will compile and run some sample programs including the Gabriel
benchmarks.
If you wish to compare the performance of Stalin against Scheme->C, Gambit-C,
Bigloo, and Chez (assuming that you have these systems installed) you can do:
% ./benchmark
This will compile all of the example with each of the Scheme compilers, run
each example three times, and automatically produce a file `results.tex' with
absolute CPU times for each benchmark and compiler as well as ratios of the
CPU times used by Stalin vs. the other compilers.
You might wish to put the executable `stalin' in your standard directory of
executables. You might also wish to put the man page `stalin.man' in your
standard directory of man pages. And you might wish to copy the directory
`include' to `/usr/local/stalin/include'. If you do the latter then you need
not specify the -I option when using Stalin.
Stalin comes with an experimental Emacs interface. To use this interface
you first need to compile the program pp.sc. This program must be compiled
with Scheme->C (which is not included in the Stalin distribution). Compile
this program with the following commands:
% scc -o pp pp.sc
% rm pp.c
and then put `pp' in your path. The Emacs interface is in the file
`stalin.el'. Read the instructions at the beginning of that file for an
explanation of how to use that interface.
Stalin also comes with an experimental FPI to OpenGL. This interface assumes
that you have the files `GL/gl.h', `GL/glu.h', and `GL/glut.h' in your C
compiler include path. To install this interface, do:
% ./build-gl-fpi
Please note that the OpenGL interface requires Mark Kilgard's GLUT which
doesn't come with OpenGL by default. GLUT can be obtained from:
http://reality.sgi.com/mjk_asd/glut3/glut3.html
Since Stalin is now self-hosting, this distribution comes with pre-generated
C code. The initial installation is done simply by compiling this C code.
Since the C code generated by Stalin varies depending on architectural
parameters, this distribution comes with two pre-generated versions: the file
`stalin-32.c' for 32-bit architectures and the file `stalin-Alpha.c' for the
DEC/Alpha. The `build' script automatically determines the architecture that
it is running on and copies the appropriate file to `stalin.c' and then
compiles this file. If you wish to rebuild Stalin from the Scheme sources
(given a running Stalin compiler), you can touch or modify the file `stalin.sc'
and then do:
% make
which will first generate `stalin.c' from `stalin.sc' and then generate
`stalin' from `stalin.c'. This will automatically make a version that is
appropriate for the architecture that you are running on. If you wish to
cross-generate C code for a different architecture, then do either:
% make stalin-32.c
or:
% make stalin-Alpha.c
USAGE
See the man page for how to invoke Stalin.
The Stalin distribution comes with a number of examples in the benchmarks
directory. The first example is a simple "Hello World" program in the file
`hello.sc'. To compile and run it do:
% ./make-hello
% ./hello
The second example is a version of the "Hello World" program that uses Xlib in
the file `xhello.sc'. To compile and run it, first set your DISPLAY
environment variable appropriately and then do:
% ./make-xhello
% ./xhello
then click on the window to exit. The third example illustrates how to use
the application framework that is included in QobiScheme. It is in the file
`define-application-example.sc'. To compile and run it, first set your DISPLAY
environment variable appropriately and then do:
% ./make-define-application-example
% ./define-application-example
This program illustrates simple buttons ("Quit"), on/off buttons ("Flag"),
radio buttons ("Line" and "Ellipse"), mode buttons ("A/B/C"),
increment/decrement buttons ("-K" and "+K"), keystroke accelerators (c-X c-C
and c-H), the builtin help facility, the type-in buffer with a few Emacs-like
key bindings, mouse-clickable regions, and rubber-banding (click on the
display pane to draw lines and ellipses).
FOREIGN PROCEDURE INTERFACE
Stalin has a rudimentary foreign procedure interface. The syntax
(FOREIGN-PROCEDURE (<arg type> ...) <return type> <name>)
returns a procedure. <arg type> must be one of CHAR, SIGNED-CHAR,
UNSIGNED-CHAR, SHORT, UNSIGNED-SHORT, INT, UNSIGNED, LONG, UNSIGNED-LONG,
FLOAT, DOUBLE, LONG-DOUBLE, CHAR*, FILE*, or VOID*. <return type> must be one
of CHAR, SIGNED-CHAR, UNSIGNED-CHAR, SHORT, UNSIGNED-SHORT, INT, UNSIGNED,
LONG, UNSIGNED-LONG, FLOAT, DOUBLE, LONG-DOUBLE, CHAR*, INPUT-PORT,
OUTPUT-PORT, VOID*, VOID, or NO-RETURN. <name> must be a string that
names the entry point. Calling that procedure calls the named entry point.
Foreign procedure are first-class objects and can be passed as arguments to
Scheme procedures, returned as results, and stored in and accessed from
variables, pairs, vectors, and structures. PROCEDURE? returns #T when given a
foreign procedure as its argument.
<type> C Scheme
-------------------------------------------------------
CHAR char character
SIGNED-CHAR signed char character
UNSIGNED-CHAR unsigned char character
SHORT short exact integer
UNSIGNED-SHORT unsigned short exact integer
INT int exact integer
UNSIGNED unsigned exact integer
LONG long exact integer
UNSIGNED-LONG unsigned long exact integer
FLOAT float inexact real
DOUBLE double inexact real
LONG-DOUBLE long double inexact real
CHAR* char* string
FILE* FILE* input port or output port
INPUT-PORT FILE* input port
OUTPUT-PORT FILE* output port
VOID void unspecified
NO-RETURN void unspecified
VOID* void* pointer
When a foreign procedure is created, the entry point must take the given
number of types and have its argument and return types declared to correspond
to the above mapping from <type> to C. A foreign procedure must be called with
the correct number of arguments of the correct types as specified by the above
mapping from <type> to Scheme. No automatic type conversion is done on entry
to or exit from the foreign procedure. If -Ot is not specified, a run time
error will be generated if a foreign procedure is called with the wrong number
of arguments or arguments of the wrong type. Such type checking is eliminated
if the compiler can prove that it is redundant or if the -Ot option is
specified. The foreign procedure returns a Scheme value of the type specified
by the above mapping from <type> to Scheme unless the <return type> is VOID or
NO-RETURN. If the <return type> is VOID, the returned value is unspecified.
If the <return type> is NO-RETURN then the procedure doesn't return. As an
example, the expression
((FOREIGN-PROCEDURE (CHAR*) INT "atoi") (VECTOR-REF ARGV 1))
parses the first command line argument into an exact integer.
Stalin introduces a new object type called `pointer'. Pointers are first class
objects and can be passed as arguments to Scheme procedures, returned as
results, and stored in and accessed from variables, pairs, vectors, and
structures. Pointers are disjoint from all other Scheme types. The only
primitive procedures that are defined on pointers, however, are POINTER? and
ZERO?. POINTER? takes a single argument. It returns #T if that argument is a
pointer and #F otherwise. If ZERO? is given a pointer as its argument it
returns #T if the pointer is NULL and #F otherwise. Pointers, however, can be
passed into and out of foreign procedures as values of the C type void*.
ZERO? can also be given a string or a port as its argument. It returns #T if
the string or port is NULL and #F otherwise. Note that null strings are
distinct from empty strings. Null strings and ports are never created by
ordinary Scheme procedures. They can only be created by foreign procedures and
the Scheme procedures INTEGER->STRING, INTEGER->INPUT-PORT, and
INTEGER->OUTPUT-PORT.
A foreign procedure invocation should not access a CHAR* or FILE* value that
was passed as an argument to a different invocation of the same or different
foreign procedure as that value might have been reclaimed in the interim.
PORTABILITY
Stalin should run under Solaris 1 and 2 on Sun/SPARCs (in 32-bit mode),
IRIX 4, 5, and 6 on SGI/MIPs (in 32-bit mode), Linux on Intel/x86s (both with
libc5 and glibc) and DEC/Alphas, and OSF/1 V3 and V4 on DEC/Alphas though it
has only been extensively tested on Linux on Intel/x86s and DEC/Alphas.
FUTURE PLANS
Stalin currently provides only a minimal set of features, coinciding mostly
with those specified in R4RS. Future plans include several language
extensions: macros, displaced strings, an object system with inheritance and
generic procedures, weak pointers and finalization, hash tables, a condition
system, a module system, separate compilation, and a debugger. Additional
compile-time optimizations are planned, as well improving the speed of the
compiler.
COMMUNICATION
Bug mail should be addressed to [email protected] and not to me personally.
Please include the version number (0.11) in the message. Periodic
announcements of bug fixes, enhancements, and new releases will be made to
[email protected]. Send mail to [email protected] to be
added to the [email protected] mailing list.
HOW TO OBTAIN
The current release of Stalin is available by anonymous FTP from
ftp://ftp.ecn.nec.com/qobi/stalin.tar.Z.
ACKNOWLEDGEMENTS
Rob Browning <[email protected]>, Jeffrey B. Siegal <[email protected]>,
Bengt Kleberg <[email protected]>, and Sven Hartrumpf
<[email protected]> contributed portions of the code and
documentation.
CONDITIONS
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.