# The Odin Programming Language Specification

**Warning:** This is a work in progress and is subject to change.

## Introduction

This is a reference manual for the Odin programming language.

Odin is a general-purpose language designed for systems programming. It is a strongly typed language with manual memory management. Programs are constructed from *packages*.

## Notation

The syntax is specified using Extended Backus-Naur Form (EBNF):

```
Production = production_name "=" [ Expression ] "." .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = production_name | token [ "…" token ] | Group | Option | Repetition .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
Repetition = "{" Expression "}" .
```

Productions are expressions constructed from terms and the following operators, in increasing precedence:

```
| alternation
() grouping
[] option (0 or 1 times)
{} repetition (0 to n times)
```

## Source code representation

Source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two separate code points. In this document, the term *character* will be used to refer to a Unicode code point in the source text.

Each code point is distinct; for instance, upper-case and lower-case letters are different characters.

Implementation restriction: A compiler *must* disallow the NUL character (U+0000) in the source text.

Implementation restriction: A compiler *may* ignore a UTF-8-encoded byte order mark (U+FEFF) if it is the first Unicode code point in the source text. A byte order mark *must* be disallowed anywhere else in the source text.
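
The two restrictions above can be sketched as a small pre-lexing check. This is an illustrative Python sketch, not part of any Odin toolchain; the function name `check_source` is hypothetical, and the sketch chooses to ignore a leading byte order mark (the specification only says a compiler *may* do so).

```python
BOM = "\ufeff"
NUL = "\x00"


def check_source(data: bytes) -> str:
    """Decode UTF-8 source bytes and enforce the NUL and BOM restrictions.

    Returns the decoded text with a leading byte order mark removed.
    Raises ValueError for invalid UTF-8, a NUL character, or a byte order
    mark that is not the first code point.
    """
    text = data.decode("utf-8")  # UnicodeDecodeError is a ValueError subclass
    if text.startswith(BOM):
        text = text[1:]  # assumption: this sketch ignores a leading BOM
    if NUL in text:
        raise ValueError("NUL character (U+0000) is not allowed in source text")
    if BOM in text:
        raise ValueError("byte order mark (U+FEFF) is only allowed as the first code point")
    return text
```

The text is returned without any normalization, matching the rule that source text is not canonicalized.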

### Characters

The following terms are used to denote specific Unicode character classes:

```
newline = /* the Unicode code point U+000A */
unicode_char = /* an arbitrary Unicode code point except newline */
unicode_letter = /* a Unicode code point classified as "Letter" */
unicode_digit = /* a Unicode code point classified as "Number, decimal digit" */
```

In The Unicode Standard 8.0, Section 4.5 “General Category” defines a set of character categories. Odin treats all characters in any of the Letter categories Lu, Ll, Lt, Lm, or Lo as Unicode letters, and those in the Number category Nd as Unicode digits.

### Letters and digits

The underscore character `_` (U+005F) is considered a letter.

```
letter = unicode_letter | "_" .
binary_digit = "0" … "1" .
octal_digit = "0" … "7" .
decimal_digit = "0" … "9" .
dozenal_digit = "0" … "9" | "A" … "B" | "a" … "b" .
hex_digit = "0" … "9" | "A" … "F" | "a" … "f" .
binary_char = binary_digit | "_" .
octal_char = octal_digit | "_" .
decimal_char = decimal_digit | "_" .
dozenal_char = dozenal_digit | "_" .
hex_char = hex_digit | "_" .
```

## Lexical elements

### Comments

Comments serve as program documentation. There are three forms:

- *Line comments* start with the character sequence `//` and stop at the end of the line.
- *General comments* start with the character sequence `/*` and stop with a pairing character sequence `*/` to allow for nested general comments.
- *Hash-bang comments* start with the character sequence `#!` and stop at the end of the line.

A comment cannot start inside a *rune* or *string* literal, or inside a line or hash-bang comment.
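
Because general comments nest, a scanner cannot stop at the first `*/` it sees; it has to track the nesting depth. The following Python sketch illustrates one way to do that. The function name `skip_general_comment` is hypothetical, and the sketch deliberately ignores the interaction with rune and string literals described above.

```python
def skip_general_comment(src: str, start: int) -> int:
    """Return the index just past the general comment that begins at `start`.

    Nested `/* ... */` pairs are tracked with a depth counter, so the
    comment ends at the `*/` that closes the outermost `/*`.
    """
    if not src.startswith("/*", start):
        raise ValueError("no general comment at this position")
    depth = 0
    i = start
    while i < len(src):
        if src.startswith("/*", i):
            depth += 1
            i += 2
        elif src.startswith("*/", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated general comment")
```

For the input `/* a /* b */ c */ x`, the returned index points at the space before `x`, since the first `*/` only closes the inner comment.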

### Tokens

Tokens form the vocabulary of the Odin language. There are four classes: *identifiers*, *keywords*, *operators* and *punctuation*, and *literals*. *White space*, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and new lines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token.

### Semicolons

The formal grammar uses semicolons `;` as terminators. Odin programs may omit these semicolons under the following rules:

- followed by one of the operators and punctuation on the same line: `}`, `)`
- followed by one of the keywords on the same line: `else`
- preceded by one of the statements: block, if, when, for, switch
- preceded by one of the declarations: package, import, foreign import, foreign block
- the last expression in a constant value declaration is one of the expressions followed by a new line: procedure literal
- the last expression in a constant value declaration is one of the types followed by a new line: helper type of [6], pointer type of [6], struct type, union type, enum type, bit field type

### Identifiers

Identifiers name program entities such as variables and types. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter.

```
identifier = letter { letter | unicode_digit } .
```
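
As an illustration, the `identifier` production can be checked with Python's `unicodedata` module. This is a sketch under assumptions: the helper names are hypothetical, Python's Unicode database is typically newer than Unicode 8.0 (so a few recently added characters may be classified differently), and reserved keywords are not rejected here.

```python
import unicodedata

# General categories treated as letters by the specification.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo"}


def is_letter(ch: str) -> bool:
    """letter = unicode_letter | "_" ."""
    return ch == "_" or unicodedata.category(ch) in LETTER_CATEGORIES


def is_unicode_digit(ch: str) -> bool:
    """unicode_digit: general category Nd (Number, decimal digit)."""
    return unicodedata.category(ch) == "Nd"


def is_identifier(text: str) -> bool:
    """identifier = letter { letter | unicode_digit } ."""
    if not text or not is_letter(text[0]):
        return False
    return all(is_letter(ch) or is_unicode_digit(ch) for ch in text[1:])
```

Under these rules `_x9` and `αβ` are identifiers, while `9x` and `a-b` are not.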

Some identifiers are predeclared.

### Keywords

The following keywords are reserved and may not be used as identifiers:

```
align_of case defer enum import no_inline proc transmute when
auto_cast cast distinct fallthrough in notin return type_of
bit_field const do for inline offset_of size_of typeid
bit_set context dynamic foreign macro opaque struct union
break continue else if map package switch using
```

Some keywords are not currently used by the language and are reserved for future use.

### Operators and punctuation

The following character sequences represent operators (including assignment operators) and punctuation:

```
+ & += &= && == != ( ) # ->
- | -= |= || < <= [ ] @ <-
* ~ *= ~= &&= > >= { } $ =>
/ << /= <<= ||= = .. , ; ? ---
% >> %= >>= ! ..< . :
%% &~ %%= &~=
```

Implementation option: A compiler may allow the following character sequences as aliases for other operators, punctuation, and keywords:

```
'≠' (U+2260) alias for '!='
'≤' (U+2264) alias for '<='
'≥' (U+2265) alias for '>='
'∈' (U+2208) alias for 'in'
'∉' (U+2209) alias for 'notin'
```

### Integer literals

An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a specific radix: `0b` for binary, `0o` for octal, `0d` for decimal, `0z` for dozenal, or `0x` for hexadecimal. In dozenal literals, letters `a-b` and `A-B` represent values ten through eleven. In hexadecimal literals, letters `a-f` and `A-F` represent values ten through fifteen.

Integer literals may contain any number of underscore characters `_` (U+005F) within the literal after the first digit.

```
int_lit     = binary_lit | octal_lit | decimal_lit | dozenal_lit | hex_lit .
binary_lit  = "0b" binary_digit { binary_char } .
octal_lit   = "0o" octal_digit { octal_char } .
decimal_lit = ["0d"] decimal_digit { decimal_char } .
dozenal_lit = "0z" dozenal_digit { dozenal_char } .
hex_lit     = "0x" hex_digit { hex_char } .
```

```
42
042 // == 42
0b1001011
0o712
0d42
0z19b3
0xDeadBeef
210206826754181103207028761697008013415622289
210_206_826_754_181_103_207_028_761_697_008_013_415_622_289
```
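
The grammar above can be turned into a small evaluator. The Python sketch below is illustrative only; `parse_int_lit` and the `RADIX` table are hypothetical names, and the sketch follows the productions literally (one digit after the optional prefix, then any mix of digits and underscores).

```python
import re

# Prefix -> (radix, regex character class of valid digits).
# An absent prefix means decimal.
RADIX = {
    "0b": (2, "01"),
    "0o": (8, "0-7"),
    "0d": (10, "0-9"),
    "0z": (12, "0-9abAB"),
    "0x": (16, "0-9a-fA-F"),
    "": (10, "0-9"),
}


def parse_int_lit(text: str) -> int:
    """Return the value of an integer literal, or raise ValueError."""
    prefix = text[:2] if text[:2] in RADIX else ""
    radix, digits = RADIX[prefix]
    body = text[len(prefix):]
    # One digit first, then any mix of digits and underscores.
    if not re.fullmatch(f"[{digits}][{digits}_]*", body):
        raise ValueError(f"malformed integer literal: {text!r}")
    return int(body.replace("_", ""), radix)
```

With this sketch, `042` and `0d42` both evaluate to 42, `0o712` to 458, and `0z19b3` to 3159 (1·12³ + 9·12² + 11·12 + 3).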

### Floating-point literals

A floating-point literal is a textual representation of a floating-point constant. There are two forms of floating-point literals: decimal and hexadecimal. A hexadecimal floating-point literal spells out the internal integer representation (the bit pattern) of the floating-point number for that platform.

```
float_lit               = decimal_float_lit | hexadecimal_float32_lit | hexadecimal_float64_lit .
decimal_float_lit       = decimals "." [decimals] [exponent] |
                          decimals exponent |
                          "." decimals [exponent] .
decimals                = decimal_digit { decimal_char } .
exponent                = ( "e" | "E" ) [ "+" | "-" ] decimals .
hexadecimal_float32_lit = "0h" hex_digit hex_char hex_char hex_char
                          hex_char hex_char hex_char hex_char .
hexadecimal_float64_lit = "0h" hex_digit hex_char hex_char hex_char
                          hex_char hex_char hex_char hex_char
                          hex_char hex_char hex_char hex_char
                          hex_char hex_char hex_char hex_char .
```

```
0.
0.0
42.36
042.36 // == 42.36
6.28318530718
1.e+0
1.054571800e-34
1.054_571_800e-34
1E9
.125
.12345e+5
```
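
To illustrate the hexadecimal form, the following Python sketch decodes a `0h` literal into the floating-point value whose bit pattern it spells out. It rests on assumptions not stated in this document: the digits are read as an IEEE 754 bit pattern with the most significant digit first, exactly 8 or 16 hexadecimal digits are given, and underscores are not handled. The name `decode_hex_float` is hypothetical.

```python
import re
import struct

HEX_FLOAT = re.compile(r"0h([0-9a-fA-F]{8}|[0-9a-fA-F]{16})")


def decode_hex_float(lit: str) -> float:
    """Interpret the hex digits of a `0h` literal as an IEEE 754 bit pattern."""
    match = HEX_FLOAT.fullmatch(lit)
    if match is None:
        raise ValueError(f"malformed hexadecimal float literal: {lit!r}")
    digits = match.group(1)
    bits = int(digits, 16)
    if len(digits) == 8:
        # 32-bit pattern -> single precision
        return struct.unpack(">f", bits.to_bytes(4, "big"))[0]
    # 64-bit pattern -> double precision
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]
```

Under these assumptions, `0h3f800000` and `0h3ff0000000000000` both denote 1.0, as a 32-bit and a 64-bit pattern respectively.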

### Imaginary literals

An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or a decimal integer followed by the lower-case letter `i`.

```
imaginary_lit = (decimals | float_lit) "i" .
```

```
0.i
0.0i
42.36i
042.36i // == 42.36i
6.28318530718i
1.e+0i
1.054571800e-34i
1.054_571_800e-34i
1E9i
.125i
.12345e+5i
```

### Rune literals

A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes, such as `'b'` or `'\t'`.

TODO:

```
\a U+0007 alert or bell
\b U+0008 backspace
\e U+001B escape
\f U+000C form feed
\n U+000A newline or line feed
\r U+000D carriage return
\t U+0009 horizontal tab
\v U+000B vertical tab
\\ U+005C backslash
\' U+0027 single quote (valid escape only within rune literals)
\" U+0022 double quote (valid escape only within string literals)
```

```
rune_lit = "'" ( unicode_value | byte_value ) "'" .
unicode_value = unicode_char | little_u_value | big_u_value | escaped_char .
byte_value = octal_byte_value | hex_byte_value .
octal_byte_value = `\` octal_digit octal_digit octal_digit .
hex_byte_value = `\` "x" hex_digit hex_digit .
little_u_value = `\` "u" hex_digit hex_digit hex_digit hex_digit .
big_u_value = `\` "U" hex_digit hex_digit hex_digit hex_digit
hex_digit hex_digit hex_digit hex_digit .
escaped_char = `\` ( "a" | "b" | "e" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `"` ) .
```

### String literals

TODO:

```
string_lit = raw_string_lit | interpreted_string_lit .
raw_string_lit = "`" { unicode_char | newline } "`" .
interpreted_string_lit = `"` { unicode_value | byte_value } `"` .
```

## Constants

## Variables

## Types

A type determines a set of values together with operations specific to those values. A type may be denoted by a *type name*, if it has one, or specified using a *type literal*, which composes a type from existing types.

```
Type     = TypeName | TypeLit | "(" Type ")" | HelperType .
TypeName = identifier | QualifiedIdent .
TypeLit  = ArrayType | SliceType | DynamicArrayType | StructType | UnionType |
           PointerType | ProcedureType | MapType | EnumType |
           BitSetType | BitFieldType | OpaqueType | HelperType .
```

### Boolean types

A boolean type represents the set of boolean truth values denoted by the predeclared constants `true` and `false`. The predeclared architecture-independent boolean types are:

```
bool 1 byte boolean type
b8 8-bit boolean type
b16 16-bit boolean type
b32 32-bit boolean type
b64 64-bit boolean type
```

### Numeric types

A numeric type represents sets of integer, floating-point, complex, or rune values. The predeclared architecture-independent numeric types are:

```
u8 the set of all unsigned 8-bit integers (0 to 255)
u16 the set of all unsigned 16-bit integers (0 to 65535)
u32 the set of all unsigned 32-bit integers (0 to 4294967295)
u64 the set of all unsigned 64-bit integers (0 to 18446744073709551615)
u128 the set of all unsigned 128-bit integers (0 to 340282366920938463463374607431768211455)
i8 the set of all signed 8-bit integers (-128 to 127)
i16 the set of all signed 16-bit integers (-32768 to 32767)
i32 the set of all signed 32-bit integers (-2147483648 to 2147483647)
i64 the set of all signed 64-bit integers (-9223372036854775808 to 9223372036854775807)
i128 the set of all signed 128-bit integers (-170141183460469231731687303715884105728 to 170141183460469231731687303715884105727)
f32 the set of all IEEE-754 32-bit floating-point numbers
f64 the set of all IEEE-754 64-bit floating-point numbers
complex64 the set of all complex numbers with f32 real and imaginary parts
complex128 the set of all complex numbers with f64 real and imaginary parts
byte alias for u8
rune the set of all Unicode code points represented by a 32-bit integer (-2147483648 to 2147483647)
```

The value of an n-bit integer is n bits wide and represented using two’s complement arithmetic.

There is also a set of architecture-independent numeric types with a specified endianness:

```
u16le little endian representation of the set of all unsigned 16-bit integers (0 to 65535)
u32le little endian representation of the set of all unsigned 32-bit integers (0 to 4294967295)
u64le little endian representation of the set of all unsigned 64-bit integers (0 to 18446744073709551615)
u128le little endian representation of the set of all unsigned 128-bit integers (0 to 340282366920938463463374607431768211455)
i16le little endian representation of the set of all signed 16-bit integers (-32768 to 32767)
i32le little endian representation of the set of all signed 32-bit integers (-2147483648 to 2147483647)
i64le little endian representation of the set of all signed 64-bit integers (-9223372036854775808 to 9223372036854775807)
i128le little endian representation of the set of all signed 128-bit integers (-170141183460469231731687303715884105728 to 170141183460469231731687303715884105727)
u16be big endian representation of the set of all unsigned 16-bit integers (0 to 65535)
u32be big endian representation of the set of all unsigned 32-bit integers (0 to 4294967295)
u64be big endian representation of the set of all unsigned 64-bit integers (0 to 18446744073709551615)
u128be big endian representation of the set of all unsigned 128-bit integers (0 to 340282366920938463463374607431768211455)
i16be big endian representation of the set of all signed 16-bit integers (-32768 to 32767)
i32be big endian representation of the set of all signed 32-bit integers (-2147483648 to 2147483647)
i64be big endian representation of the set of all signed 64-bit integers (-9223372036854775808 to 9223372036854775807)
i128be big endian representation of the set of all signed 128-bit integers (-170141183460469231731687303715884105728 to 170141183460469231731687303715884105727)
```
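
The difference between the little-endian and big-endian types is only the order of the bytes in memory; the set of values is the same. The Python sketch below shows the byte sequence each type name implies for a given value. The helper `encode` is hypothetical and exists only for illustration; signed values use two's complement, as stated above.

```python
import re

ENDIAN_TYPE = re.compile(r"([iu])(16|32|64|128)(le|be)")


def encode(type_name: str, value: int) -> bytes:
    """Return the in-memory bytes of `value` for an endian-specific integer type.

    Raises OverflowError if the value is outside the type's range.
    """
    match = ENDIAN_TYPE.fullmatch(type_name)
    if match is None:
        raise ValueError(f"not an endian-specific integer type: {type_name!r}")
    signed = match.group(1) == "i"
    size = int(match.group(2)) // 8
    order = "little" if match.group(3) == "le" else "big"
    return value.to_bytes(size, order, signed=signed)
```

For example, the value `0x11223344` is stored as the bytes `44 33 22 11` in a `u32le` and as `11 22 33 44` in a `u32be`.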

There is also a set of predeclared numeric types with implementation-specific sizes:

```
uintptr  an unsigned integer large enough to store the uninterpreted bits of a pointer value
uint     an unsigned integer of implementation-specific size
int      same size as uint
```

To avoid portability issues, all numeric types are defined types and thus distinct, except `byte`, which is an alias for `u8`. Explicit conversions are required when different numeric types are mixed in an expression or assignment. For instance, `i64` and `int` are not the same type even though they may have the same size on a particular machine.

### String types

A *string* type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The number of bytes is called the length of the string and is a non-negative integer.

The predeclared string types are:

```
string
cstring
```

The length of a string `s` can be determined using the built-in procedure `len`. The length is a compile-time constant if the string is a constant. For string types not derived from `cstring`, a string’s bytes can be accessed by integer indices 0 through `len(s)-1`.

### Array types

An array is a numbered sequence of elements of a single type, called the element type. The number of elements is called the length of the array and is a non-negative integer.

```
ArrayType = "[" ArrayLength "]" ElementType .
ArrayLength = Expression .
ElementType = Type .
```

The length is part of the array’s type; it must evaluate to a non-negative constant representable by a value of type `int`. The length of array `a` can be determined using the built-in procedure `len`. The elements can be addressed by integer indices 0 through `len(a)-1`. Array types are always one-dimensional but may be composed to form multi-dimensional types.

```
[32]byte
[2*N + 1]union{int, string}
[42]^f32
[2][3]int
[3][3][3]f64 // same as [3]([3]([3]f64))
```

### Slice types

A slice is a descriptor for a contiguous segment of *underlying array/memory* and provides access to an indexed sequence of elements from that array/memory. A slice type denotes the set of all slices of arrays of its element type. The number of elements is called the length of the slice and is never negative. The value of an uninitialized slice is `nil`.

```
SliceType = "[" "]" ElementType .
```

The length of a slice `s` can be determined by the built-in procedure `len`. Unlike with regular arrays, this length is not a compile-time constant and must be determined during execution. The elements can be addressed by integer indices `0` through `len(s)-1`.

### Dynamic array types

```
DynamicArrayType = "[" "dynamic" "]" ElementType .
```

### Struct types

A struct is a sequence of named elements, called fields, each of which has a name and a type. Within a struct, non-blank field names must be unique.

```
StructType = "struct" StructTypeTags "{" FieldList "}" .
FieldList = FieldDecl { "," FieldDecl } .
FieldDecl = [ [ "using" ] IdentifierList ":" ] Type .
StructTypeTags = { (StructTypeTagPacked | StructTypeTagRawUnion | StructTypeTagAlign) } .
StructTypeTagPacked = "#" "packed" .
StructTypeTagRawUnion = "#" "raw_union" .
StructTypeTagAlign = "#" "align" Expression .
```

```
// An empty struct
struct {}

// A struct with 6 fields
struct {
	x, y: int,
	f:    f32,
	_:    f32, // padding
	a:    ^[]int,
	p:    proc() -> int,
}

// A struct with no padding between its fields
struct #packed {
	a: u8,
	b: u16,
	c: u32,
	d: u8,
}
```

The `#packed` tag specifies that the memory representation of the struct type has no padding between its fields.

The `#raw_union` tag specifies that the memory representation of the struct type has the size of its largest field and the alignment of its most strictly aligned field, and that all fields are accessed at the same memory offset of zero.

The `#raw_union` tag and the `#packed` tag cannot be combined.

The `#align` tag explicitly states the alignment required for the struct type.

### Union types

### Pointer types

TODO: `rawptr`

### Procedure types

### Map types

### Enum types

### Bit set types

### Bit field types

### Opaque types

### Helper types

### Any type

### Typeid

### Other

#### SIMD types

## Properties of types and values

### Type identity

Two types are either *identical* or *different*.

A distinct type is always different from any other type. Otherwise, two types are identical if their underlying type literals are structurally equivalent; that is, they have the same literal structure and corresponding components have identical types. In detail:

- Two array types are identical if they have identical element types and the same array length
- Two slice types are identical if they have identical element types
- Two dynamic array types are identical if they have identical element types
- Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names and identical types, and have identical memory layout specifications
- Two union types are identical if they have the same sequence of variant types
- Two pointer types are identical if they have identical base types
- Two procedure types are identical if they have the same number of parameters and return values, corresponding parameter and result types are identical, and either both procedures are variadic or neither is. Parameter and return names are not required to match.
- Two map types are identical if they have identical key and element types
- Two enum types are identical if and only if they are the same enum
- Two opaque types are identical if they have the same underlying base type
- Two bit field types are identical if they have the same number of fields, and if corresponding sizes and offsets are identical, and alignment is identical
- Two bit set types are identical if they have identical element types, underlying types, and have an identical value range

### Assignability

A value `x` is assignable to a variable of type `T` (“`x` is assignable to `T`”) if one of the following conditions applies:

- `x`'s type is identical to `T`
- `x`'s type `v` and `T` have identical underlying types and at least one of `v` or `T` is not a distinct type
- `x` is the predeclared identifier `nil` and `T` is a pointer, procedure, slice, dynamic array, map, enum, union, bit set, bit field, opaque type, `any`, `cstring` or `typeid`
- `x` is the *uninitialized value* `---` and `T` is a variable
- `x` is an untyped constant representable by a value of type `T`
- `x` is a procedure group and one of its procedures can be assigned to `T`
- `x`'s type is a pointer and `T` is `rawptr`
- `T` is a union and `x`'s type is a variant of the union `T`
- `T` is an `any` type and `x` is neither `nil` nor `---`
- `x`'s type is an integer and `T` is a bit field type
- `x` is a subtype of `T` from the use of `using` in `T`
- `T` is an array and `x`'s type `v` is assignable to the element type of `T`

### Representability

A constant `x` is representable by a value of type `T` if one of the following conditions applies:

- `x` is in the set of values determined by `T`
- `T` is a floating-point type and `x` can be rounded to `T`'s precision without overflow. Rounding uses IEEE 754 round-to-even rules but with an IEEE negative zero further simplified to an unsigned zero. Note that constant values never result in an IEEE negative zero, NaN, or infinity
- `T` is a complex type, and `x`'s components `real(x)` and `imag(x)` are representable by values of `T`'s component type (`f32` or `f64`)
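
The first two conditions can be sketched in Python for the fixed-size integer types and the two floating-point types. This is an illustration under assumptions: `representable` is a hypothetical helper, type names are passed as strings such as `"u8"` or `"f32"`, only Python `int` values are accepted for integer types, and complex types are left out.

```python
import math
import struct


def representable(x, type_name: str) -> bool:
    """Report whether constant `x` fits type `type_name` (e.g. "i16", "f32")."""
    if type_name in ("f32", "f64"):
        fmt = "<f" if type_name == "f32" else "<d"
        try:
            if not math.isfinite(x):
                return False  # constants never denote NaN or infinity
            struct.pack(fmt, x)  # rounds to the target precision
        except OverflowError:
            return False  # magnitude too large after rounding
        return True
    # Integer types: the value must lie in the type's value set.
    signed = type_name.startswith("i")
    bits = int(type_name[1:])
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    return isinstance(x, int) and lo <= x <= hi
```

For instance, 255 is representable as `u8` but 256 is not, and `1e39` is not representable as `f32` because it overflows when rounded to single precision.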

## Blocks

A *block* is a possibly empty sequence of declarations and statements within matching brace brackets.

```
Block = "{" StatementList "}" .
StatementList = { Statement ";" } .
```

In addition to explicit blocks in the source text, there are implicit blocks:

- The *universal block* encompasses all Odin source text.
- Each package has a *package block* containing all Odin source text for that package.
- Each file has a *file block* containing all Odin source text in that file.
- Each “if”, “for”, “switch” statement is considered to be in its own implicit block.

Blocks nest and influence scoping.

## Declarations and scope

### Label scopes

Labels are declared by labeled statements and used in the “break” and “continue” statements. In contrast to other identifiers, labels are not block scoped. The scope of the label is the body of the procedure in which it is declared and excludes the body of any nested procedure.

### Blank identifier

The *blank identifier* is represented by the underscore character `_`. It acts as an anonymous placeholder rather than a regular non-blank identifier. It has a special meaning in declarations and in assignments.

### Predeclared identifiers

```
Types:
bool b8 b16 b32 b64
byte complex64 complex128
f32 f64
int i8 i16 i32 i64
i16le i32le i64le i16be i32be i64be
uint uintptr u8 u16 u32 u64
u16le u32le u64le u16be u32be u64be
any rawptr
rune string cstring
Constants:
true false
Zero value:
nil
Procedures:
len cap
complex real imag conj
swizzle expand_to_tuple
min max abs clamp
```

## Attributes

## Expressions

### Operands

### Qualified identifiers

### Composite literals

### Procedure literals

### Selectors

#### Implicit selector

### Index expressions

### Slice expressions

### Type assertions

### Calls

### Passing argument to variadic parameters

### Operators

#### Operator precedence

Unary operators have the highest precedence.

There are eight precedence levels for binary operators.

```
Precedence    Operator
    8             *  /  %  %%  <<  >>  &  &~
    7             +  -  |  ~
    6             'in'  'notin'
    5             ==  !=  <  <=  >  >=
    4             &&
    3             ||
    2             ..   // If allowed
    1             ?    // Ternary expression
```

Binary operators of the same precedence associate from left to right. For instance, `x / y * z` is the same as `(x / y) * z`.
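
The effect of the precedence levels and left-to-right associativity can be made visible by fully parenthesizing an expression. The Python sketch below uses precedence climbing for that purpose. It is an illustration only: `parenthesize` and `tokenize` are hypothetical names, `*` is assumed to sit at level 8 alongside `/`, and unary operators, `..`, and the ternary `?` are not handled.

```python
import re

# Binary operator precedence levels 3 through 8 from the table above.
PRECEDENCE = {
    "*": 8, "/": 8, "%": 8, "%%": 8, "<<": 8, ">>": 8, "&": 8, "&~": 8,
    "+": 7, "-": 7, "|": 7, "~": 7,
    "in": 6, "notin": 6,
    "==": 5, "!=": 5, "<": 5, "<=": 5, ">": 5, ">=": 5,
    "&&": 4,
    "||": 3,
}

TOKEN = re.compile(r"\s*(%%|<<|>>|&~|&&|\|\||==|!=|<=|>=|[-+*/%&|~<>()]|\w+)")


def tokenize(text: str) -> list:
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parenthesize(text: str) -> str:
    """Return `text` with every binary operation explicitly parenthesized."""
    tokens = tokenize(text)
    pos = 0

    def operand() -> str:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = binary(1)
            pos += 1  # skip the closing ")"
            return inner
        return tok

    def binary(min_prec: int) -> str:
        nonlocal pos
        left = operand()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # A higher minimum on the right side gives left associativity.
            right = binary(PRECEDENCE[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return binary(1)
```

With this sketch, `a + b * c` becomes `(a + (b * c))` and `x / y * z` becomes `((x / y) * z)`.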

### Arithmetic operators

```
+    sum                    integers, floats, complex values, constant string values
-    difference (binary)    integers, floats, complex values
-    negation (unary)       integers, floats, complex values
*    product                integers, floats, complex values
/    quotient               integers, floats, complex values
%    modulo dividend        integers
%%   modulo divisor         integers
&    bitwise AND            integers
|    bitwise OR             integers
~    bitwise XOR (binary)   integers
~    bitwise NOT (unary)    integers
&~   bit clear (AND NOT)    integers
<<   left shift             integer << unsigned integer
>>   right shift            integer >> unsigned integer
```

Arithmetic operators also work on fixed-length arrays of numeric types.

#### Integer operators:

For two integer values `x` and `y`, the integer quotient `q = x / y` and remainder `r = x % y` satisfy the following relationships:

```
x = q*y + r and |r| < |y|
```

with `x / y` truncated towards zero ("truncated division").
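
The relationship can be checked with a short Python sketch. Python's own `//` rounds towards negative infinity, so truncated division has to be spelled out. The names `quo`, `rem`, and `mod` are hypothetical, and the behaviour of `%%` shown here (the result takes the sign of the divisor) is an assumption read off its "modulo divisor" label in the operator table.

```python
def quo(x: int, y: int) -> int:
    """x / y with the quotient truncated towards zero."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def rem(x: int, y: int) -> int:
    """x % y: the remainder takes the sign of the dividend x."""
    return x - quo(x, y) * y


def mod(x: int, y: int) -> int:
    """x %% y (assumed): the result takes the sign of the divisor y."""
    return x % y  # Python's % already follows the divisor's sign


for x, y in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(x, y, quo(x, y), rem(x, y), mod(x, y))
```

For `x = -7` and `y = 2` this gives a quotient of -3 and a remainder of -1, which satisfies `x = q*y + r`, whereas the assumed `%%` result is 1.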

### Logical operators

### Address operators

### Conversion

A conversion changes the type of an expression to the type specified by the conversion. A conversion may appear literally in the source, or it may be implied by the context in which an expression appears.

An *explicit* conversion is an expression of the form `T(x)` or `cast(T)x` where `T` is a type and `x` is an expression that can be converted to type `T`.

```
Conversion = CallConversion | CastConversion .
CallConversion = Type "(" Expression [ "," ] ")" .
CastConversion = "cast" "(" Type ")" Expression .
```

#### Transmute

### Constant expressions

### Order of evaluation

## Statements

### Labelled statements

### Expression statements

### Assignments

### If statements

### Switch statements

### For statements

### Return statements

### Break statements

### Continue statements

### Fallthrough statements

### Defer statements

### Using statements

## Built-in procedures

### Length and capacity

### Manipulating complex numbers

### Numeric bounds

### Directive based

### Other

## Packages

### Source file organization

### Package clause

### Import declaration

## Foreign system

## Implicit context system

## Parametric polymorphism

## Initialization and execution

### The zero value

When storage is allocated for a variable, through a declaration, or when a new value is created through a composite literal, and no explicit initialization is provided, the variable or value is given a default value. Each element of such a variable or value is set to the *zero value* for its type.

These two simple declarations are equivalent:

```
i: int;
i: int = 0;
```

## System considerations

### Size and alignment guarantees

```
type size in bytes
byte, u8, i8, b8 1
u16, i16, u16le, i16le, b16 2
u32, i32, u32le, i32le, b32, f32, rune 4
u64, i64, u64le, i64le, b64, f64, complex64 8
complex128 16
```

Minimal alignment properties:

- For a variable `x` of any type, `align_of(x)` is at least 1
- For a variable `x` of array type, `align_of(x)` is the same as the alignment of a variable of the array’s element type
- For a variable `x` of enum type, `align_of(x)` is the same as the alignment of a variable of the enum’s base type (default of `int` if not specified)
- For a variable `x` of struct type, `align_of(x)` is the largest of all the values `align_of(x.f)` for each field `f` of `x`, but at least 1, unless the alignment has been explicitly stated by the struct tag `#align`, or the alignment is 1 if the struct is declared to be `#packed`.
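
The struct rule can be written out as a small Python function. This is a sketch under assumptions: `struct_align` is a hypothetical helper, field alignments are supplied by the caller rather than derived from types, and an explicit `#align` is assumed to take precedence when it is combined with `#packed` (the rule above does not say which wins).

```python
def struct_align(field_aligns, packed=False, align=None):
    """Alignment of a struct, given the alignment of each of its fields.

    `packed` models the #packed tag and `align` models an explicit #align value.
    """
    if align is not None:
        return align  # assumption: an explicit #align takes precedence
    if packed:
        return 1
    # The largest field alignment, but at least 1 (covers the empty struct).
    return max([1, *field_aligns])


# Fields with alignments 1, 2, 4, 1, for example u8, u16, u32, u8 if each
# type were aligned to its own size (an assumption, not a stated guarantee).
print(struct_align([1, 2, 4, 1]))
print(struct_align([1, 2, 4, 1], packed=True))
```

The first call yields 4 and the second yields 1, reflecting that a `#packed` struct drops to byte alignment.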

A struct, bit set, bit field, or array has size zero if it contains no fields (or elements) that have a size greater than zero.

A union has size zero if it contains zero variant types.