Difference between revisions of "Generated c code"
Hzwakenberg (talk | contribs) m (Link to Loria file no longer valid) |
Latest revision as of 13:39, 2 July 2024
*** I have partially updated the examples here: the hello_world.id file was generated by se2.2beta5, and I have noted some statements that are now obsolete in view of recent compiler changes.
*** This article and its French counterpart need revising by the development team.
People who want to interface with applications and/or libraries written in C should as far as possible limit themselves to the interfaces provided by Cecil, external and plugins. This page gives details about the generated C code, but these details ought not to be of any use to you. ;-)
The type identifiers
Description
Liberty Eiffel generates one unique identifying number for each active type in the Eiffel code1. A lot of symbols in the generated C code depend on that identifier.
Don't depend on those identifiers! The mangling table is only valid for one specific compilation of one specific application, with one specific version of the compiler compile_to_c and one specific version of the libraries. We do not guarantee the stability of these identifiers.
If 27 is an identifier, then:
- The C type of an Eiffel object is T27;
- The corresponding C structure is struct S27; in that structure, the names of the attributes are prefixed with an underscore (there may be some other fields used by Liberty Eiffel, in particular the id field, which in this case has the value 27).
  Each reference type can be cast to T0 (although some reference types may not have the id field).
- Each method is called r27method_name()
- Each prefix method is called r27_px_method_name()
- Each infix method is called r27_ix_method_name()
- Each late-binding method is called X27method_name()
- The object's creation method (when the garbage collector is used) is called new27()
- The type model is a variable called M27. Models are used for initialisation. For example:
T27 M27 = {27,NULL,NULL,0,0};
...
{T27*n=((T27*)se_malloc(sizeof(*n)));
*n=M27;
r27make(n);
...
Some characters in method names, such as "<" or "+", will be replaced by their corresponding ASCII code in decimal.
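The naming rules above can be illustrated with a small hand-written helper. This is only a sketch, not the compiler's code: the function mangle_feature and its parameters are hypothetical, and the exact treatment of operator characters (each character that is not a letter, digit or underscore is written as its decimal ASCII code, with no separator) is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "<prefix><id><feature>" into out.
 * Characters other than letters, digits and '_' are written as their
 * decimal ASCII code (assumed format, e.g. '<' becomes "60").
 * Returns 0 on success, -1 if out is too small. */
static int mangle_feature(char *out, size_t out_size,
                          const char *prefix, int type_id,
                          const char *feature)
{
    assert(out != NULL && prefix != NULL && feature != NULL);
    int n = snprintf(out, out_size, "%s%d", prefix, type_id);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t used = (size_t)n;
    for (const char *p = feature; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c) || c == '_')
            n = snprintf(out + used, out_size - used, "%c", c);
        else
            n = snprintf(out + used, out_size - used, "%d", (int)c);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Under these assumptions, the prefix "r" with identifier 27 and feature name "append" gives r27append, and the prefix "X" gives the late-binding name X27append. Such names are only useful for reading generated code, never for writing against it.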
An example
For example: STRING has the identifier 7. So:
- The object type is T7.
typedef struct S7 T7;
- The structure is defined in struct S7.
struct S7{Tid id;T9 _storage;t2 _count;t2 _capacity;};
- The append method becomes:
void r7append(se_dump_stack*caller,T7* C,T0* a1)
(See below for details on the dump stack.)
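The declarations above, combined with the model-based initialisation described earlier, can be put together in a self-contained sketch. The typedefs for Tid, T2 and T9, the model values in M7 and the helper demo_new_string are assumptions made for illustration; plain malloc stands in for se_malloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for types the real runtime defines; the choices below are
 * assumptions made so that the sketch compiles on its own. */
typedef int Tid;     /* type identifier field */
typedef int T2;      /* INTEGER */
typedef char *T9;    /* NATIVE_ARRAY[CHARACTER] */

typedef struct S7 T7;
struct S7 { Tid id; T9 _storage; T2 _count; T2 _capacity; };

/* Model for STRING: a fresh object starts as a copy of it. */
static const T7 M7 = { 7, NULL, 0, 0 };

/* Hypothetical allocation helper; plain malloc stands in for se_malloc. */
static T7 *demo_new_string(void)
{
    T7 *n = malloc(sizeof *n);
    if (n != NULL)
        *n = M7;   /* copy the model, which also sets id to 7 */
    return n;
}
```

Copying the model in one assignment is what lets the id field and all attribute defaults be set without any per-field code.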
The .id file
When the application is compiled, the list of identifiers is stored in a file whose name is suffixed .id. The file is reread in incremental compilations (which allows some stability in the identifiers, so long as the whole project is not recompiled).
This file is structured in entries, each entry being separated from others by a hash character (#). This file looks like this:
5 "REAL"
class-path: "/SmartEiffel/lib/numeric/real.e"
class-name: REAL
assertion-level: all
parent-count: 1 inherit: 49
#
72 "FAST_ARRAY"
class-path: "/SmartEiffel/lib/storage/collection/fast_array.e"
class-name: FAST_ARRAY
assertion-level: all
parent-count: 1 inherit: 58
#
62 "STD_ERROR"
class-path: "/SmartEiffel/lib/io/terminal/std_error.e"
class-name: STD_ERROR
assertion-level: all
parent-count: 1 inherit: 39
c-type: T62 reference: yes
ref-status: live id-field: yes
destination-graph-nodes: ->OUTPUT_STREAM
run-time-set-count: 1
run-time-set:
	STD_ERROR (62)
#
. . .
You should never depend on these identifiers. When an identifier is computed, collisions may occur and affect the process. Thus, the identifier and name of each type depend not only on the type name, but also on the order in which the types are compiled, that is, on the order of application and library types combined. They also depend on the compilation mode used (since that can change the list of active types) and on the version of the compiler you are using. So what is T145 today may be T234 tomorrow3!
Consequently, do not ever rely on the generated identifiers, because they are not constant! Do not try to write in your own C code horrible things like new123 or T456, because the only thing we can guarantee is that this code will not work.
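For inspection purposes only (for instance, to find out which type a number in a generated name refers to for one particular build), the first line of an entry can be read with a few lines of C. The helper parse_id_header below is hypothetical and assumes the header has exactly the shape shown in the extract: a number followed by the type name in double quotes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parses the first line of a .id entry, such as
 *   62 "STD_ERROR"
 * into a numeric identifier and a type name.
 * Returns 1 on success, 0 if the line does not have that shape. */
static int parse_id_header(const char *line, int *id,
                           char *name, size_t name_size)
{
    assert(line != NULL && id != NULL && name != NULL && name_size > 0);
    char buf[256];
    /* %255[^"] reads everything up to the closing quote, so type names
     * containing spaces or brackets are kept whole. */
    if (sscanf(line, "%d \"%255[^\"]\"", id, buf) != 2)
        return 0;
    if (strlen(buf) >= name_size)
        return 0;
    strcpy(name, buf);
    return 1;
}
```

Field lines and the # separator do not start with a number, so the function returns 0 for them, which makes it easy to pick out entry headers while scanning a file line by line.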
Naming convention
The preceding section has explained how methods are generated.
The function prototype r7append() from the example above is presented as
void r7append(se_dump_stack*caller,T7* C,T0* a1);
This shows how Current and the arguments are passed. The rules are as follows:
- Current is called C and is always strongly typed (with its own exact type). This parameter may be omitted if Current is not used by the method. In some cases (when code is inlined), Current can be copied in local variables named C1, C2...
- The arguments are called a1, a2... and are typed T0* for reference types or given their exact type for expanded types (e.g. T2 for an integer)
- Inside functions, a local variable R is defined. This is Result.
- Local variables keep their Eiffel name, prefixed by an underscore.
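To make these rules concrete, here is a hand-written imitation of what a generated function might look like. It is not compiler output: the Eiffel feature free_space_after, the stub types and the struct layouts are all assumptions chosen so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins (assumptions) for types supplied by the runtime. */
typedef struct se_dump_stack_stub se_dump_stack;
typedef int T2;                       /* INTEGER, an expanded type */
typedef struct S0 { int id; } T0;     /* generic reference type */
typedef struct S7 { int id; char *_storage; T2 _count; T2 _capacity; } T7;

/* Hand-written imitation of a generated function for a hypothetical
 * Eiffel feature of STRING:
 *   free_space_after (other: STRING; extra: INTEGER): INTEGER
 * C is Current, a1 and a2 are the arguments, R is Result, and the
 * Eiffel local `needed' becomes _needed. */
static T2 r7free_space_after(se_dump_stack *caller, T7 *C, T0 *a1, T2 a2)
{
    T2 R = 0;
    T2 _needed = 0;
    (void)caller;                     /* unused in this sketch */
    _needed = C->_count + ((T7 *)a1)->_count + a2;
    R = C->_capacity - _needed;
    return R;
}
```

Note how the reference argument arrives as T0* and has to be cast back to its actual type, whereas Current and the expanded INTEGER argument carry their exact types.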
Agent routines
Functions of the form _T111C222l333c444 are agent creations.
222 is the id of the class where the agent creation was written (look it up in the .id file); 333 is the line number; 444 is the column number.
111 is a little more difficult to explain. It is the id of the type from which this agent creation will be executed. It can be the same as 222, but it will be different if 222 corresponds to a generic class, and it can also be different if a class inherits from 222.
For example, function _T832C832l1362c106 was declared in class #832, on line 1362 column 106.
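When reading a stack trace or a generated file, such a name can be split back into its four numbers. The sketch below assumes the name has exactly the form described above; the struct agent_name and the function decode_agent_name are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for the four numbers embedded in an agent
 * creation function name of the form _T111C222l333c444. */
struct agent_name {
    int run_type_id;   /* 111: type from which the agent creation is executed */
    int class_id;      /* 222: class where the agent creation was written */
    int line;          /* 333: line number */
    int column;        /* 444: column number */
};

/* Returns 1 and fills *out if name has the expected shape, else 0. */
static int decode_agent_name(const char *name, struct agent_name *out)
{
    assert(name != NULL && out != NULL);
    int consumed = 0;
    int n = sscanf(name, "_T%dC%dl%dc%d%n",
                   &out->run_type_id, &out->class_id,
                   &out->line, &out->column, &consumed);
    /* Require all four numbers and no trailing characters. */
    return n == 4 && name[consumed] == '\0';
}
```

The class identifier obtained this way still has to be looked up in the .id file of the same compilation to find the class name.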
The mangling table
OK, now you understand why you cannot use type numbers, but you still want to know what those fields in the mangling table mean (in the .id file)...
First, a big caveat. Although it may have been very stable for quite some time now, the mangling table coding may change! We currently have no plans to change it, and we prefer keeping it the way it is. But once again: we do not commit ourselves to the current representation.
Let's look again at the extract of a .id file. The part shown covers nearly all the possible cases:
5 "REAL"
class-path: "/SmartEiffel/lib/numeric/real.e"
class-name: REAL
assertion-level: all
parent-count: 1 inherit: 49
#
72 "FAST_ARRAY"
class-path: "/SmartEiffel/lib/storage/collection/fast_array.e"
class-name: FAST_ARRAY
assertion-level: all
parent-count: 1 inherit: 58
#
62 "STD_ERROR"
class-path: "/SmartEiffel/lib/io/terminal/std_error.e"
class-name: STD_ERROR
assertion-level: all
parent-count: 1 inherit: 39
c-type: T62 reference: yes
ref-status: live id-field: yes
destination-graph-nodes: ->OUTPUT_STREAM
run-time-set-count: 1
run-time-set:
	STD_ERROR (62)
#
. . .
There is one entry per type (active or not); each entry spans many lines and is terminated by a hash symbol (#).
Each entry contains a lot of information. Not all of it is always present; missing entries take default values.
Only the first line is compulsory. It contains the type identifier, and its name (as would be returned by generating_type).
The following lines contain different fields, marked by a keyword, a colon and a value. There may be one or more fields on a single line. Those fields are:
class-path | The path to the file containing the source code. May be omitted if the class has no associated file (uncommon). |
class-name | The name of the class, as returned by generator. |
parent-count | The number of parents. |
inherit | On the same line as parent-count if the latter is not zero; it gives the list of parent class identifiers. |
c-type | The C type, usually in the form T27. If it is omitted, the class has no runnable type. In that case, the following fields do not appear either. |
reference | On the same line as c-type, yes for a reference type or no for an expanded type. |
ref-status | Either live for an active type (i.e. instances of this type are created at run-time), or dead otherwise. |
id-field | On the same line as ref-status, yes if the id field has to be generated in the C structure (as its first element), no otherwise. This field is present if one of these conditions is true:
Note that a lot of calls are statically computed; the type inference algorithm used in SmartEiffel increases the number of such types that do not need the id field. |
destination-graph-nodes | ????? |
run-time-set-count | The number of concrete, active descendants of the type (including itself). This is the number of items in run-time-set below. |
run-time-set | The concrete, active heirs of this type (including itself). One class per line, tab-indented. |
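Because a line may carry several fields, a reader of the file has to decide where one value stops and the next keyword begins. The sketch below looks up one field on a line; the helper find_field is hypothetical, and it assumes that a value ends where the next whitespace-separated word ending in a colon begins.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: looks up `key' in one line of a .id entry and
 * copies its value into `value'. A line may hold several fields, e.g.
 *   c-type: T62 reference: yes
 * Assumption: a value ends where the next "word:" token begins.
 * Returns 1 if the key was found and the value fits, 0 otherwise. */
static int find_field(const char *line, const char *key,
                      char *value, size_t value_size)
{
    assert(line != NULL && key != NULL && value != NULL && value_size > 0);
    size_t key_len = strlen(key);
    const char *p = line;
    while ((p = strstr(p, key)) != NULL) {
        int at_token_start = (p == line) || isspace((unsigned char)p[-1]);
        if (at_token_start && p[key_len] == ':')
            break;
        p++;
    }
    if (p == NULL)
        return 0;
    p += key_len + 1;                         /* skip "key:" */
    while (*p != '\0' && isspace((unsigned char)*p))
        p++;
    /* Scan whitespace-separated words until one ends with ':'. */
    const char *end = p;
    const char *word = p;
    while (*word != '\0') {
        const char *w_end = word;
        while (*w_end != '\0' && !isspace((unsigned char)*w_end))
            w_end++;
        if (w_end > word && w_end[-1] == ':')
            break;                            /* next field starts here */
        end = w_end;
        word = w_end;
        while (*word != '\0' && isspace((unsigned char)*word))
            word++;
    }
    size_t len = (size_t)(end - p);
    if (len >= value_size)
        return 0;
    memcpy(value, p, len);
    value[len] = '\0';
    return 1;
}
```

Matching the keyword only at the start of a word prevents a lookup for "type" from accidentally matching inside "c-type".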
The dump stack
When not in boost mode, a stack is managed by the runtime environment generated by Liberty Eiffel. This stack is displayed when an uncaught exception is raised. It is also used by the debugger sedb.
Technically, the Liberty Eiffel stack is built upon the native (C) stack. Each stack element is a se_dump_stack4 usually allocated on the stack5. It is made up of several parts:
- a frame descriptor, of type se_frame_descriptor4, which is a static description of the feature, as follows:
  - run type,
  - does it use Current,
  - number and type of the local variables, and
  - an anti-recursion flag (for contracts)...
- a dynamic part:
  - a pointer to Current (i.e. either a pointer to an expanded object, or else a pointer to a reference object, which is itself a pointer to the object). That is why the type of the current field is void**. This field will be NULL if Current is not used by the feature,
  - the position (used mainly by sedb),
  - a pointer to the caller (i.e. the se_dump_stack of the calling function),
  - a pointer to the exception origin: if not NULL, it means that this se_dump_stack is not in the stack, but was malloc'ed to preserve the exception stack trace,
  - an array of local variables (with double indirection as for Current), hence the type void***.
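The layout described above can be pictured with the following simplified C sketch. All type and field names (demo_frame_descriptor, demo_dump_stack and their members) are assumptions chosen to mirror the description; the position is reduced to a plain integer, and the real definitions are those of the runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for se_frame_descriptor and se_dump_stack.
 * All names here are assumptions made for illustration. */
typedef struct demo_frame_descriptor {
    const char *name;         /* feature and run type, for display */
    int use_current;          /* does the feature use Current? */
    int local_count;          /* number of local variables */
    const char *local_format; /* encoded types of the locals */
    int assertion_flag;       /* anti-recursion flag for contracts */
} demo_frame_descriptor;

typedef struct demo_dump_stack demo_dump_stack;
struct demo_dump_stack {
    const demo_frame_descriptor *fd;    /* static part */
    void **current;                     /* NULL if Current is unused */
    int position;                       /* source position (simplified) */
    demo_dump_stack *caller;            /* frame of the calling function */
    demo_dump_stack *exception_origin;  /* non-NULL: heap copy for a trace */
    void ***locals;                     /* double indirection, as for Current */
};

/* Counts the frames from `top' down to the outermost caller. */
static int demo_stack_depth(const demo_dump_stack *top)
{
    int depth = 0;
    for (; top != NULL; top = top->caller)
        depth++;
    return depth;
}
```

Since each frame is normally an automatic variable of the corresponding C function, following the caller pointers walks the Eiffel call chain without any separate allocation.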
Macros handle the linking between the se_dump_stack frames.
- Normally, the top of the dump stack is the global variable se_dst defined in SmartEiffel/sys/runtime/no_check.c. The macro set_dump_stack_top assigns its argument to this variable.
- In SCOOP mode, each processor has its own stack. So the set_dump_stack_top macro has two arguments: the processor and the new dump stack top.
Some basic types have a fixed identifier:
- 0: the type to which any other type can be cast
- 1: INTEGER_8
- 2: INTEGER (or INTEGER_32)
- 3: CHARACTER
- 4: REAL_32
- 5: REAL_64 (or REAL)
- 6: BOOLEAN
- 7: STRING
- 8: POINTER
- 9: NATIVE_ARRAY[CHARACTER]
- 10: INTEGER_16
- 11: INTEGER_64
- 12: REAL_EXTENDED
So the root class of the system is likely to have the identifier 12. But do not rely on that too much (before all the changes to INTEGERs, it was 11).
None of the NATURAL classes (NATURAL, NATURAL_8, NATURAL_16, NATURAL_32, NATURAL_64 and NATURAL_GENERAL) has a fixed id.