Difference between revisions of "Generated c code"

From Liberty Eiffel Wiki
m (15 revisions: initial import from SmartEiffel Wiki - The Grand SmartEiffel Book)

Revision as of 22:04, 3 March 2013

*** I have partially updated the examples here: the hello_world.id file was generated by se2.2beta5 and I
*** have noted some statements that are now obsolete in view of recent compiler changes.
*** This article and its French counterpart need revising by the development team.
*** You can also use the original article

People who want to interface with applications and/or libraries written in C should as far as possible limit themselves to the interfaces provided by Cecil, external and plugins. This page gives details about the generated C code, but these details ought not to be of any use to you. ;-)

The type identifiers

Description

SmartEiffel generates one unique identifying number for each active type in the Eiffel code [1]. A lot of symbols in the generated C code depend on that identifier.

Don't depend on those identifiers! The mangling table is only valid for one specific compilation of one specific application, with one specific version of the compiler (compile_to_c) and one specific version of the libraries. We do not guarantee the stability of these identifiers.

If 27 is an identifier, then:

  • The C type of an Eiffel object is T27;
  • The corresponding C structure is struct S27; in that structure, the names of the attributes are prefixed with an underscore (there may be some other fields used by SmartEiffel, in particular, the id field, which in this case has the value 27).
    Each reference type can be cast to T0 (although some reference types may not have the id field).
  • Each method is called r27method_name()
  • Each prefix method is called r27_px_method_name()
  • Each infix method is called r27_ix_method_name()
  • Each late-binding method is called X27method_name()
  • The object's creation method (when the garbage collector is used) is called new27()
  • The type model is a variable called M27. Models are used for initialisation. For example:
T27 M27 = {27,NULL,NULL,0,0};
...
{T27*n=((T27*)se_malloc(sizeof(*n)));
 *n=M27;
 r27make(n);
...
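
The model-based initialisation above can be tried out in isolation. The following is a minimal sketch, not actual compiler output: the field layout of S27, the body of r27make and the helper sketch_create27 are made up for illustration, and the standard malloc stands in for the runtime's se_malloc.

```c
#include <stdlib.h>

/* Hypothetical layout for a type whose identifier happens to be 27. */
typedef struct S27 T27;
struct S27 {
    int id;        /* the id field; here it holds 27 */
    char *_name;   /* Eiffel attributes are prefixed with an underscore */
    int _count;
};

/* The model: a statically initialised instance used as a template. */
static const T27 M27 = {27, NULL, 0};

/* Stand-in for the creation procedure `make` (assumed body). */
static void r27make(T27 *C) {
    C->_count = 1;
}

/* Allocate an object, copy the model into it, then run the creation
 * procedure. Returns NULL if the allocation fails. */
static T27 *sketch_create27(void) {
    T27 *n = malloc(sizeof(*n));
    if (n == NULL) return NULL;
    *n = M27;
    r27make(n);
    return n;
}
```

Copying the model in one assignment gives every field its default value before the creation procedure runs, which is the point of keeping one model per type.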

Some characters in method names, such as "<" or "+", will be replaced by their corresponding ASCII code in decimal.
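
The naming rules above can be sketched as a small helper. This is only an illustration of the rules as stated on this page: the function name mangle and its parameters are hypothetical, and whether the real compiler produces exactly these strings for operators is an assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Build "r<id><prefix><name>" into out, replacing every character that is
 * not a letter, digit or underscore by its ASCII code in decimal.
 * prefix is "" for ordinary methods, "_px_" or "_ix_" for prefix and
 * infix methods. Returns 0 on success, -1 if out is too small. */
static int mangle(int id, const char *prefix, const char *name,
                  char *out, size_t out_size) {
    int n = snprintf(out, out_size, "r%d%s", id, prefix);
    if (n < 0 || (size_t)n >= out_size) return -1;
    size_t used = (size_t)n;
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c) || c == '_')
            n = snprintf(out + used, out_size - used, "%c", c);
        else
            n = snprintf(out + used, out_size - used, "%d", c);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Under these assumptions, mangle(7, "", "append", ...) yields r7append, and mangle(7, "_ix_", "<", ...) yields r7_ix_60, since 60 is the ASCII code of "<".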

An example

For example: STRING has the identifier 7 [2]. So:

  • The object type is T7.
typedef struct S7 T7;
  • The structure is defined in struct S7.
struct S7{Tid id;T9 _storage;T2 _count;T2 _capacity;};
  • The append method becomes:
void r7append(se_dump_stack*caller,T7* C,T0* a1)

(See below for details on the dump stack, i.e. the execution stack.)

The .id file

When the application is compiled, the list of identifiers is stored in a file whose name is suffixed .id. The file is reread in incremental compilations (which allows some stability in the identifiers, so long as the whole project is not recompiled).

This file is structured in entries, each entry being separated from others by a hash character (#). This file looks like this:

5 "REAL"
class-path: "/SmartEiffel/lib/numeric/real.e"
class-name: REAL
assertion-level: all
parent-count: 1 inherit: 49
#
72 "FAST_ARRAY"
class-path: "/SmartEiffel/lib/storage/collection/fast_array.e"
class-name: FAST_ARRAY
assertion-level: all
parent-count: 1 inherit: 58
#
62 "STD_ERROR"
class-path: "/SmartEiffel/lib/io/terminal/std_error.e"
class-name: STD_ERROR
assertion-level: all
parent-count: 1 inherit: 39
c-type: T62 reference: yes
ref-status: live id-field: yes
destination-graph-nodes: ->OUTPUT_STREAM
run-time-set-count: 1
run-time-set:
        STD_ERROR (62)
#
. . .
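
For readers who only want to inspect such a file (not to depend on its contents), the entry structure shown above can be recognised with a few lines of C. This is a sketch under the assumption that the first line of an entry always has the form number, space, quoted name, as in the extract; the function names are hypothetical.

```c
#include <stdio.h>

/* Parse the first line of an entry, e.g.  62 "STD_ERROR"
 * Stores the identifier in *id and the type name (without quotes) in
 * name, which must have room for at least 128 bytes.
 * Returns 1 on success, 0 otherwise. */
static int parse_entry_header(const char *line, int *id, char *name) {
    return sscanf(line, "%d \"%127[^\"]\"", id, name) == 2;
}

/* An entry ends at a line consisting of a single hash character. */
static int is_entry_separator(const char *line) {
    return line[0] == '#' && (line[1] == '\0' || line[1] == '\n');
}
```

A reader loop would call is_entry_separator on each line and treat the first line after a separator as the header of the next entry.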

You should never depend on these identifiers. When an identifier is computed, collisions may occur and change the outcome. Thus the identifier of each type depends not only on the type name but also on the order in which the types are compiled, that is, on the order of application and library types combined. Identifiers also depend on the compilation mode used (since that can change the list of active types) and on the version of the compiler you are using. So what is T145 today may be T234 tomorrow [3]!

Consequently, do not ever rely on the generated identifiers, because they are not constant! Do not try to write in your own C code horrible things like new123 or T456, because the only thing we can guarantee is that this code will not work.

Naming convention

The preceding section has explained how methods are generated.

The function prototype r7append() from the example above is declared as

void r7append(se_dump_stack*caller,T7* C,T0* a1);

This shows how Current and the arguments are passed. The rules are as follows:

  • Current is called C and is always strongly typed (with its own exact type). This parameter may be omitted if Current is not used by the method. In some cases (when code is inlined), Current can be copied in local variables named C1, C2...
  • The arguments are called a1, a2... and are typed T0* for reference types or given their exact type for expanded types (e.g. T2 for an integer)
  • Inside functions, a local variable R is defined. This is Result.
  • Local variables keep their Eiffel name, prefixed by an underscore.
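
To make these rules concrete, here is a hand-written sketch of what a generated function could look like. It is not compiler output: the Eiffel feature free_after is invented for the illustration, the type definitions are simplified stand-ins, and the se_dump_stack caller parameter is left out for brevity.

```c
/* Stand-ins for generated types; the real definitions come from the
 * compiler and differ in detail. */
typedef int T2;          /* INTEGER, an expanded type */
typedef struct S7 T7;    /* STRING, as in the example above */
struct S7 { int id; char *_storage; T2 _count; T2 _capacity; };

/* Hypothetical Eiffel feature of STRING:
 *   free_after (extra: INTEGER): INTEGER
 *     local used: INTEGER
 *     do used := count + extra; Result := capacity - used end
 */
T2 r7free_after(T7 *C, T2 a1) {
    T2 R = 0;        /* Result */
    T2 _used = 0;    /* local `used`, prefixed with an underscore */
    _used = C->_count + a1;     /* C is Current, a1 the first argument */
    R = C->_capacity - _used;
    return R;
}
```

Current arrives as C with its exact type, the expanded argument is passed by value as T2, and the attributes are reached through their underscore-prefixed fields.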

Agent routines

Functions of the form _T111C222l333c444 are agent creations.

222 is the id of the class where the agent creation was written (look it up in the .id file); 333 is the line number; 444 is the column number.

111 is a little more difficult to explain. It is the id of the type from which this agent creation will be executed. It can be the same as 222, but it will be different if 222 corresponds to a generic class, and it can also be different if a class inherits from 222.

For example, function _T832C832l1362c106 was declared in class #832, on line 1362 column 106.
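
The pattern can be decoded mechanically, which may help when reading stack traces or generated code. The following sketch assumes only the _T111C222l333c444 layout described above; the struct and function names are hypothetical.

```c
#include <stdio.h>

/* Components of an agent creation function name such as
 * _T832C832l1362c106. */
struct agent_name {
    int type_id;   /* id of the type from which the agent creation runs */
    int class_id;  /* id of the class where the agent creation was written */
    int line;      /* line number in the source file */
    int column;    /* column number in the source file */
};

/* Returns 1 if name matches the pattern exactly, 0 otherwise. */
static int parse_agent_name(const char *name, struct agent_name *out) {
    char trailing;
    int n = sscanf(name, "_T%dC%dl%dc%d%c", &out->type_id, &out->class_id,
                   &out->line, &out->column, &trailing);
    return n == 4;  /* four numbers converted and nothing after them */
}
```

The class id obtained this way can then be looked up in the .id file to find the source file in which the agent was written.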

The mangling table

OK, now you understand why you cannot use type numbers, but you still want to know what those fields in the mangling table mean (in the .id file)...

First, a big caveat. Although it may have been very stable for quite some time now, the mangling table coding may change! We currently have no plans to change it, and we prefer keeping it the way it is. But once again, we do not commit ourselves to the current representation.

Let's look again at the extract of a .id file. The part shown covers nearly all the possible cases:

5 "REAL"
class-path: "/SmartEiffel/lib/numeric/real.e"
class-name: REAL
assertion-level: all
parent-count: 1 inherit: 49
#
72 "FAST_ARRAY"
class-path: "/SmartEiffel/lib/storage/collection/fast_array.e"
class-name: FAST_ARRAY
assertion-level: all
parent-count: 1 inherit: 58
#
62 "STD_ERROR"
class-path: "/SmartEiffel/lib/io/terminal/std_error.e"
class-name: STD_ERROR
assertion-level: all
parent-count: 1 inherit: 39
c-type: T62 reference: yes
ref-status: live id-field: yes
destination-graph-nodes: ->OUTPUT_STREAM
run-time-set-count: 1
run-time-set:
       STD_ERROR (62)
#
. . .

There is one entry per type (active or not); each entry spans many lines and is terminated by a hash symbol (#).

Each entry contains a lot of information. Not all of it is always present; missing fields take default values.

Only the first line is compulsory. It contains the type identifier, and its name (as would be returned by generating_type).

The following lines contain different fields, marked by a keyword, a colon and a value. There may be one or more fields on a single line. Those fields are:

class-path The path to the file containing the source code. May be omitted if the class has no associated file (uncommon).
class-name The name of the class, as returned by generator.
parent-count The number of parents.
inherit On the same line as parent-count if the latter is not zero; it gives the list of parent class identifiers.
c-type The C type, usually in the form T27. If it is omitted, the class has no runnable type.

In that case, the following fields do not appear either.

reference On the same line as c-type, yes for a reference type or no for an expanded type.
ref-status Either live for an active type (i.e. instances of this type are created at run-time), or dead otherwise.
id-field On the same line as ref-status, yes if the id field has to be generated in the C structure (as its first element), no otherwise. This field is present if one of these conditions is true:
  • some late binding may occur on targets of that type,
  • or the structure may be accessed by an external or by cecil.

Note that a lot of calls are statically computed; the type inference algorithm used in SmartEiffel increases the number of types that do not need the id field.
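
Why late binding needs the id field can be seen in a small sketch. Everything here is hypothetical (the identifiers 62 and 63, the feature code, the struct layouts); it only illustrates the idea of selecting a routine from the id stored as the first element of the structure, not the code the compiler actually emits.

```c
/* Two hypothetical live types sharing the id field as first element. */
typedef struct S62 { int id; int _handle; } T62;
typedef struct S63 { int id; int _handle; } T63;

static int r62code(T62 *C) { return C->_handle; }
static int r63code(T63 *C) { return -C->_handle; }

/* A pointer to a structure also points to its first member, so the id
 * can be read without knowing the exact type. */
static int type_id_of(const void *obj) { return *(const int *)obj; }

/* Late-binding entry point: picks the routine from the object's id.
 * Returns 0 for an id outside the run-time set (a real runtime would
 * report an error instead). */
static int X62code(void *target) {
    switch (type_id_of(target)) {
    case 62: return r62code((T62 *)target);
    case 63: return r63code((T63 *)target);
    default: return 0;
    }
}
```

When the compiler can prove which single type a target has, it can call the r-function directly, and a type that is never dispatched on in this way has no use for the id field.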

destination-graph-nodes ?????
run-time-set-count The number of concrete, active descendants of the type (including itself). This is the number of items in run-time-set below.
run-time-set The concrete, active heirs of this type (including itself). One class per line, tab-indented.
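
Since several fields may share a line, a reader of the file has to search each line for a keyword followed by a colon. The sketch below does this for single-token values such as c-type, reference, ref-status and id-field. It is an illustration only: the function name is hypothetical, and values containing spaces (the inherit list, a quoted class-path) are deliberately not handled.

```c
#include <stdio.h>
#include <string.h>

/* Look up "keyword: value" in one line of an entry and copy the value
 * (a single whitespace-delimited token) into value.
 * Returns 1 if found, 0 otherwise. */
static int find_field(const char *line, const char *keyword,
                      char *value, size_t value_size) {
    size_t klen = strlen(keyword);
    const char *p = line;
    if (klen == 0) return 0;
    while ((p = strstr(p, keyword)) != NULL) {
        /* The keyword must start the line or follow whitespace. */
        int at_start = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        if (at_start && p[klen] == ':') {
            const char *v = p + klen + 1;
            while (*v == ' ' || *v == '\t') v++;
            size_t n = strcspn(v, " \t\r\n");
            if (n == 0 || n >= value_size) return 0;
            memcpy(value, v, n);
            value[n] = '\0';
            return 1;
        }
        p += klen;
    }
    return 0;
}
```

Applied to the line "c-type: T62 reference: yes" from the extract, looking up reference gives yes and looking up c-type gives T62.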

The dump stack

When not in boost mode, a stack is managed by the runtime environment generated by SmartEiffel. This stack is displayed when an uncaught exception is raised. It is also used by the debugger sedb.

Technically, the SmartEiffel stack is built upon the native (C) stack. Each stack element is a se_dump_stack [4], usually allocated on the stack [5]. It is made up of several parts:

  • a frame descriptor of type se_frame_descriptor [4], which is a static description of the feature, as follows:
    • run type,
    • does it use Current,
    • number and type of the local variables, and
    • an anti-recursion flag (for contracts)...
  • a dynamic part:
    • a pointer to Current (i.e. either a pointer to an expanded object, or a pointer to a reference, which is itself a pointer to the object). That is why the type of the current field is void**. This field will be NULL if Current is not used by the feature,
    • the position (used mainly by sedb),
    • a pointer to the caller (i.e. the se_dump_stack of the calling function),
    • a pointer to the exception origin: if not NULL, it means that this se_dump_stack is not in the stack, but was malloc'ed to preserve the exception stack trace,
    • an array of local variables (with double indirection as for Current), hence the type void***.

Macros handle the linking between the se_dump_stack frames.

  • Normally, the top of the dump stack is the global variable se_dst defined in SmartEiffel/sys/runtime/no_check.c. The macro set_dump_stack_top assigns its argument to this variable.
  • In SCOOP mode, each processor has its own stack. So the set_dump_stack_top macro has two arguments: the processor and the new dump stack top.
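
The description above can be summarised as a data structure. The sketch below is a simplified model for the non-SCOOP case, not the real runtime code: all names carry a sketch_ prefix because the field names and types are assumptions, and the authoritative definitions are those of the runtime header no_check.h.

```c
#include <stddef.h>

/* Static part: hypothetical field names. */
typedef struct sketch_frame_descriptor {
    const char *name;      /* feature name (assumed field) */
    int use_current;       /* does the feature use Current? */
    int local_count;       /* number of local variables */
} sketch_frame_descriptor;

/* Dynamic part, one per active call. */
typedef struct sketch_dump_stack sketch_dump_stack;
struct sketch_dump_stack {
    const sketch_frame_descriptor *fd;   /* static description */
    void **current;                      /* NULL if Current is unused */
    long position;                       /* source position */
    sketch_dump_stack *caller;           /* frame of the calling function */
    sketch_dump_stack *exception_origin; /* non-NULL for heap-preserved frames */
    void ***locals;                      /* double indirection, as for Current */
};

/* Top of the sketch stack, playing the role of se_dst. */
static sketch_dump_stack *sketch_dst = NULL;
#define sketch_set_dump_stack_top(ds) (sketch_dst = (ds))

/* Count the frames reachable from top by following caller links. */
static int stack_depth(const sketch_dump_stack *top) {
    int depth = 0;
    for (; top != NULL; top = top->caller) depth++;
    return depth;
}
```

In this model a function would declare its frame as a local variable, link it to its caller's frame, set it as the top on entry, and restore the caller's frame as the top on exit, so that the chain of caller pointers mirrors the native call stack.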



1. There is a bijection (a one to one relationship) between the number and the name of the type (including the value of its generic parameters in the case of generic types).
2. There are some identifiers that are reserved for "basic" types. They are:

So it is likely that the root class of the system may have the identifier 12. But do not rely on that too much (before all the changes to INTEGERs, it was 11).

3. The compiler will do its best not to change the identifiers uselessly. The .id file is loaded at the beginning of the compilation process, and saved again at its end. But clean, for example, erases that file.
4. You can find the definition of these structures in the SmartEiffel/sys/runtime/c/no_check.h file.
5. The exception is when an exception is raised: in that case, part of the stack is allocated onto the heap before the longjmp.