February 9, 2020

\[ \newcommand\G{\mathscr{G}} \newcommand\hom{\text{Hom}} \newcommand\M{\mathscr{M}} \newcommand\bet{\rightarrow} \]

While following a course on algebra, I had the unavoidable lecture on the construction of the free group. The teacher being a pure algebraist, the construction was fully algebraic. It was a complicated piece of mathematics in which proving even intuitive results turned out to be quite involved.

And all I could think was that the intuition behind the construction is not that hard: you take your set of symbols, you add formal inverses, you take the free monoid, and whenever something of the form \(aa^{-1}\) appears you eliminate it. And it turns out this is the approach described on Wikipedia.

But there is usually a big difference between an intuitive description and a formal one with proofs of the relevant properties, and sometimes the most intuitive approach is the hardest to formalize. I wanted to see whether I could formalize the reduction approach using tools from lambda calculus, and whether it would turn out more complicated. This approach proved easy enough, so I present it here.

And it provides me with a perfect exercise to start using Coq again, so I will give some Coq snippets of the definitions, and some proofs, along the way. For simplicity's sake, I will assume function extensionality for those proofs. The final code can be found here.

Let’s start by defining what a free group is. Intuitively, the free group over a set \(X\) is the most general group into which one can inject \(X\), with no relations between the injected elements.

Can we make this more formal? Indeed: given a set \(X\), the free group \(\G(X)\) over \(X\) is the group such that for every other group \(G\), there is a bijection between \(\hom(\G(X), G)\) and \(G^X\) (the functions from \(X\) to \(G\)).

Intuitively, it means that for a group morphism out of the free group, knowing the images of the generators is sufficient. The other direction is the *most general* part: no matter which images one imposes on the generators, it is possible to build a group morphism that respects them.

For those interested in going further, there is a more general notion of free object, which itself comes from the idea of a free functor.

The basis of our construction will be the free monoid over a set \(X\): \(\M(X)\). It is defined by the exact same universal property, replacing free group by free monoid and group morphism by monoid morphism.

Let’s formalize that in Coq. First we need to define a monoid structure:

```
Record Monoid : Type := mkMon {
  type : Type;
  op : type -> type -> type;
  empty : type;
  emptyCorrect : forall (x : type), op x empty = x /\ op empty x = x;
  associativity : forall (x y z : type), op x (op y z) = op (op x y) z;
}.

Definition is_monoid_morphism (M M' : Monoid) (f : type M -> type M') : Prop :=
  f (empty M) = empty M' /\ forall (x y : type M), f (op M x y) = op M' (f x) (f y).

Record MonoidMorphism (M M' : Monoid) : Type := mkMonMorphism {
  monmor : type M -> type M';
  monmor_correct : is_monoid_morphism M M' monmor;
}.
```

Now we can define the proposition that tells us what a free monoid is:

```
Definition is_free_monoid_over (T : Type) (M : Monoid) : Prop :=
  forall (M' : Monoid),
    exists (F : (T -> type M') -> MonoidMorphism M M'),
    exists (G : (type M -> type M') -> (T -> type M')),
         (forall (f : T -> type M'), G (monmor M M' (F f)) = f)
      /\ (forall (m : MonoidMorphism M M'), monmor M M' (F (G (monmor M M' m))) = monmor M M' m).
```

Now we need to build the free monoid. This is a classic result that I won’t bother proving: the free monoid over a set \(X\) is the set of finite sequences of elements of \(X\), with concatenation as the operation. If you want more details, see Wikipedia or the nLab.

You can also find the formalised proof in the Coq file, for the following definition of the free monoid over a type:

```
Inductive FreeMonT (T : Type) : Type
  := App : T -> FreeMonT T -> FreeMonT T
   | Empty : FreeMonT T
.

Fixpoint append {T : Type} (x y : FreeMonT T) : FreeMonT T :=
  match x with
  | Empty _ => y
  | App _ h t => App T h (append t y)
  end.
```

Let’s get a bit more serious now, and review our intuitions: we need to add formal inverses, and reduce whenever an element is multiplied by its formal inverse. Call our base set \(X\). We define \(X^{-1}\), a set in bijection with \(X\) such that \(X\cap X^{-1} = \emptyset\). If \(a\in X\), we denote its image under the bijection by \(a^{-1}\). These will be our formal inverses.

Now let \(M\) be \(\M(X\uplus X^{-1})\), the free monoid over our initial set together with the formal inverses. Here is how to implement something like that in Coq:

```
Inductive WithInv (T : Type) : Type
  := Reg : T -> WithInv T
   | ForInv : T -> WithInv T
.
```

Now we want a way to say that if, somewhere in the sequence, one finds \(aa^{-1}\) or \(a^{-1}a\), it should be removed. But what does it mean to *remove* a subsequence? It sounds more like we’re describing a procedure: start with a sequence, and rewrite every reducible pair until there are none left. This sounds a bit like \(\beta\)-reduction from lambda calculus. Let’s roll with this idea and see where it leads us.

We need to define the reduction first:

\[ \left\{\begin{array}{l} \forall x\in X, xx^{-1} \bet \epsilon \\ \forall x\in X, x^{-1}x \bet \epsilon \\ \forall \omega_1, \omega_2, \omega_2', \omega_3\in\M, \omega_2\bet\omega_2' \implies \omega_1\omega_2\omega_3 \bet \omega_1\omega_2'\omega_3 \\ \end{array}\right. \]

The way we’re going to define it in Coq is equivalent, but a bit easier to manipulate: given a word, either the redex is at the start, or we reduce the tail of the word. This gives the following definition:

```
Inductive Reduction (T : Type) : FreeMonT (WithInv T) -> FreeMonT (WithInv T) -> Prop
  := LeftRed : forall (x : T), forall (tl : FreeMonT (WithInv T)),
       Reduction T (App (WithInv T) (ForInv T x) (App (WithInv T) (Reg T x) tl)) tl
   | RightRed : forall (x : T), forall (tl : FreeMonT (WithInv T)),
       Reduction T (App (WithInv T) (Reg T x) (App (WithInv T) (ForInv T x) tl)) tl
   | CtxRed : forall (x : WithInv T), forall (m m' : FreeMonT (WithInv T)),
       Reduction T m m' -> Reduction T (App (WithInv T) x m) (App (WithInv T) x m')
.
```

And now we would like to quotient \(\M(X\uplus X^{-1})\) by the equivalence relation stating that two terms with a common reduct are equivalent. This is actually easy, because our reduction relation has two very interesting properties.

The first thing is that no matter in which order we do the reduction, we will always reach a normal form, that is to say a form that cannot be reduced further.

The intuitive argument is easy: every time we reduce a word, we strictly decrease its length, so we must stop at some point.

Formally proving this in Coq is a bit harder. First we need to define strong normalisation. Here we use the insight that if we define \(x \preceq y\) by \(y \rightarrow x\), we obtain a new relation such that \(\rightarrow\) is strongly normalizing if and only if \(\preceq\) is well-founded. So we can just use the well-founded module in Coq to get our definition:

```
Definition Inv (T : Type) (R : T -> T -> Prop) : T -> T -> Prop
  := fun (x y : T) => R y x.

Definition strongly_normalizing (T : Type) (R : T -> T -> Prop) : Prop
  := well_founded (Inv T R).
```

The proof then formalizes the previous intuition. We first prove that if we have a relation-preserving mapping from one relation to another, and the second one is well-founded, then the first relation is well-founded. Then we prove that `length` is a relation-preserving mapping to the naturals with the usual order. The fact that the naturals are well-founded is proved in the Coq standard library, so we use that to conclude.

```
Definition monotome_morphism (T T' : Type) (R : T -> T -> Prop)
                             (R' : T' -> T' -> Prop) (f : T -> T') : Prop :=
  forall (x y : T), R x y -> R' (f x) (f y).

Theorem preimage_well_founded (T T' : Type) (R : T -> T -> Prop)
                              (R' : T' -> T' -> Prop) (f : T -> T') :
  monotome_morphism T T' R R' f -> well_founded R' -> well_founded R.

Lemma monoidLength_monotone (T : Type) :
  monotome_morphism (FreeMonT (WithInv T)) nat
                    (Inv (FreeMonT (WithInv T)) (Reduction T)) lt
                    (monoidLength (WithInv T)).

Theorem reduction_normalizing (T : Type) :
  strongly_normalizing (FreeMonT (WithInv T)) (Reduction T).
```

Confluence is a sort of weakened determinism. It’s the idea that even though reduction is not deterministic, if two terms come from the same original term, they can be reduced further to reach a common reduct.

Our reduction system has an even stronger property, that is *strong confluence*. It means that if we have three terms \(\omega\), \(\omega_1\) and \(\omega_2\) such that \(\omega\rightarrow\omega_1\) and \(\omega\rightarrow\omega_2\), then either \(\omega_1 = \omega_2\) or there exists \(\omega'\) such that \(\omega_1\rightarrow\omega'\) and \(\omega_2\rightarrow\omega'\).

This property is captured by the following definition (the definition is actually a bit weaker, because that’s the official one, but in our specific case the two cases where \(\omega_1\rightarrow\omega_2\) or vice-versa never happen):

```
Definition strongly_confluent (T : Type) (R : T -> T -> Prop) : Prop
  := forall (a b c : T), R a b -> R a c
       -> (b = c) \/ (R b c) \/ (R c b) \/ (exists (d : T), R b d /\ R c d).

Theorem reduction_strongly_confluent' (T : Type) :
  forall (a b c : (FreeMonT (WithInv T))), Reduction T a b -> Reduction T a c
    -> (b = c) \/ (exists (d : (FreeMonT (WithInv T))), Reduction T b d /\ Reduction T c d).

Theorem reduction_strongly_confluent (T : Type) :
  strongly_confluent (FreeMonT (WithInv T)) (Reduction T).
```

Those two properties, strong normalisation and confluence, mean that to test whether two terms have a common reduct, we can reduce both of them however we want until they are in normal form, and check the normal forms for equality.

This is all well and good, but for now we’ve just defined a normal form as a term that cannot be reduced further. Surely we can characterize it better? And indeed, it is just a word with no subword of the form \(xx^{-1}\) or \(x^{-1}x\).

We can just define that naïvely in Coq and it works (we call a word without subwords of the form \(xx^{-1}\) or \(x^{-1}x\) *stable*, and show that being stable is equivalent to being normal for the reduction):

```
Definition normal_form { T : Type } (R : T -> T -> Prop) (x : T) : Prop
  := forall (y : T), R x y -> False.

Inductive is_stable { T : Type } : FreeMonT (WithInv T) -> Prop
  := StableEmpty : is_stable (Empty (WithInv T))
   | StableSing : forall (x : WithInv T), is_stable (App (WithInv T) x (Empty (WithInv T)))
   | StableApp : forall (x y : WithInv T), forall (w : FreeMonT (WithInv T)),
       inv x <> y -> is_stable (App (WithInv T) y w)
       -> is_stable (App (WithInv T) x (App (WithInv T) y w)).

Theorem stable_is_normal_form { T : Type } :
  forall (x : FreeMonT (WithInv T)), is_stable x <-> normal_form (Reduction T) x.
```
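The `is_stable` predicate uses an `inv` function on `WithInv T`, mapping each symbol to its formal inverse, which is not shown in the article. Assuming the definitions above, it would look something like this (a sketch; the repository's version may differ):

```
Definition inv {T : Type} (x : WithInv T) : WithInv T :=
  match x with
  | Reg _ a => ForInv T a    (* x becomes x^-1 *)
  | ForInv _ a => Reg T a    (* x^-1 becomes x *)
  end.
```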

It turns out that under a reasonable assumption on the base type, our reduction also has a very interesting property: it is decidable. To show that, we implement a Coq function that fully reduces a term, and prove it correct. To do so we need the base type to have decidable equality, which we would automatically get if we had assumed the excluded middle.

Here is the definition of the function, and the theorem proving it correct:

```
Fixpoint liftEq { T : Type } (deceq : T -> T -> bool) (x y : WithInv T) : bool
  := match x, y with
     | Reg _ a, Reg _ b => deceq a b
     | ForInv _ a, ForInv _ b => deceq a b
     | _, _ => false
     end.

Fixpoint reduce { T : Type } (deceq : T -> T -> bool) (x : FreeMonT (WithInv T)) : FreeMonT (WithInv T) :=
  let witheq : WithInv T -> WithInv T -> bool := liftEq deceq in
  match x with
  | App _ a w =>
      let reduced : FreeMonT (WithInv T) := reduce deceq w in
      match reduced with
      | App _ b w' => if witheq (inv a) b
                      then w'
                      else App (WithInv T) a (App (WithInv T) b w')
      | Empty _ => App (WithInv T) a (Empty (WithInv T))
      end
  | Empty _ => Empty (WithInv T)
  end.

Theorem reduce_is_unique_normal_form { T : Type } (deceq : T -> T -> bool) (decC : DecEqCorrect deceq) :
  forall (x y : FreeMonT (WithInv T)),
    normal_form_of (Reduction T) x y <-> y = reduce deceq x.
```

We can now create the candidate group by quotienting by the relation of having a common normal form. First, we need to check that concatenation is compatible with this relation, which is obviously the case. The statement in Coq is (where `trefl_closure` is the reflexive transitive closure of a relation):

```
Theorem reduction_compatible_append (T : Type) :
  forall (a b c d : FreeMonT (WithInv T)),
    trefl_closure (Reduction T) a c -> trefl_closure (Reduction T) b d
    -> trefl_closure (Reduction T) (append a b) (append c d).
```

We would like to take the quotient in Coq, but it turns out that taking quotients in Coq is hard. So instead we cheat: every equivalence class has a canonical representative, its normal form. So we can define the type of elements that are in normal form, and that will be our quotient.

```
Record FreeGrpT (T : Type) : Type := mkFreeGrp {
  elem : FreeMonT (WithInv T);
  elemNormal : is_stable elem;
}.
```

But there is a problem with that approach: there can be more than one proof of stability for a given element, leading to an equality on `FreeGrpT` that is not just the equality of the elements it stores. There are multiple ways to fix that. The first would be to assume axiom K. I don’t really like this solution, since axiom K is incompatible with univalence. Another solution would be to make sure `is_stable` is a mere proposition, but my knowledge of homotopy type theory is not good enough to make use of that yet. So instead I just added an axiom for this specific case.

```
Axiom eq_freegrp: forall (T : Type), forall (a b : FreeGrpT T), elem T a = elem T b -> a = b.
```

As we’ve previously mentioned, the free group can be characterized by a universal property. This is easy enough to define in Coq. The definitions are a bit long, so we refer you to the file `FreeGroup.v` in the repo for the complete definitions and proofs.

The idea of the proof is to define a function from the group morphisms from the free group on \(X\) to a group \(G\) to the functions from \(X\) to \(G\), by composing a morphism \(\phi\) with the injection \(\iota\) of \(X\) into \(\G(X)\). This function is injective: two morphisms with the same composite agree on the image of \(\iota\), and every element of \(\G(X)\) is generated by elements of the image of \(\iota\), so the morphisms are fully determined. To show it is surjective, take a function \(f : X \rightarrow G\) and extend it to \(\G(X)\) the naïve way: \(\tilde{f}(x_1^{\epsilon_1}\dots x_n^{\epsilon_n}) = f(x_1)^{\epsilon_1}\dots f(x_n)^{\epsilon_n}\). When restricted along \(\iota\), we get \(f\) again.

We have finally managed to construct the free group in Coq. The informal proof can be described in 2 pages of LaTeX, but the Coq proof took ~900 lines of code. This is probably partly because I’m still a beginner with Coq, but also because there were a few intuitive results I used without proof that are actually not that immediate to prove formally. All things considered, it was a fun way to do some Coq again.

I also had to use an axiom, which is a bit disappointing. I’m planning to read more of homotopy type theory, and hopefully I’ll find a way to circumvent the need for that axiom.

March 31, 2017

GNU Mach is the microkernel developed for the Hurd system. It is a free (as in free speech), working microkernel. Despite its power, it is not widely used, and thus the documentation is lacking. This article aims to cover the principles of the Mach kernel, and some of its API, to ease the development of any Mach-based system.

In order to test the code examples, you will need a working installation of a GNU Mach system. The main one is the GNU Hurd operating system; you can refer to my previous article for an installation guide. Since the code we will be writing is very low level, it will be architecture dependent. Our target will be the x86 32-bit architecture, which is what you will have if you used my previous article to install Hurd.

A microkernel’s job is to provide an interface to the hardware. That means abstracting the CPU and handling the execution of programs on it. Mach uses two abstractions for this: tasks and threads. The execution units are threads: they store an execution, i.e. the values of the registers. Their execution order is decided by Mach, which transparently handles the interleaving of their executions and their distribution over multiple cores.

There is another important device the kernel must provide an interface to: memory. The solution used is virtual memory: each execution unit believes it has the whole memory to itself, and the kernel handles where its memory really is, whether it is swapped out, and so on. But giving a whole memory map to every thread would be inefficient, as we might want a few execution units to share the same memory. The abstraction used for that is the task: a task is a virtual memory map and a set of threads. A thread must always be part of a task, and has full access to the virtual memory of its task.

Tasks also provide some convenience utilities. For example, killing or pausing a task kills/pauses all of its threads. A few more utilities are provided to handle errors; we will come back to that later. The equivalent of tasks on a Linux system are processes (though those are more complex: Mach’s tasks have no notion of users or permissions).

Now that we’ve seen how Mach abstracts the execution of programs, we need to look at the other essential aspect of a microkernel: IPC, in other words, how to make tasks communicate. The abstraction here is the port. A port is a queue of typed messages. It is manipulated through port rights. A port right is an integer, linked to a task, that allows enqueueing and/or dequeueing messages on the port (a bit like a file descriptor on Linux).

This means that knowing a port right is what grants the ability to use it. The system is made so that a task cannot guess port rights it was not given.

Port rights are task specific, which means that all the threads of a task share the same port rights. When sending a message, a port right can be sent along. This allows a port to be created in one task and then sent to other tasks.

There are three kinds of port rights: send, send-once and receive. The send right is the ability to enqueue messages on the port; it can be copied to other tasks. The send-once right is the ability to enqueue only one message on the port. The receive right is the ability to dequeue messages from the port. Only one task can hold the receive right for a given port. It can be used to create send and send-once rights for the port, and can be transferred between tasks.

Messages are dequeued in the order they are enqueued. Enqueueing a message blocks if the port is full, and dequeueing blocks if it is empty; both operations can use a timeout. If multiple threads try to receive from a port at the same time, every message will be received exactly once, but which thread receives which message is undefined.

Port rights can be grouped into port sets. This allows waiting for a message from multiple ports, reading from the first one to be ready (slightly similar to the `select` system call on Linux).

It is possible to pause, resume and kill any thread independently, or whole tasks at once. When a thread is paused, it is possible to access its state (i.e. the values of its registers) and change it (doing this on an unpaused thread causes undefined behaviour). This implies a thread cannot access its own state.

Every thread and task also has an exception port. When an error is raised (either manually, or because of a segfault/division by zero …), a message is sent to the thread exception port, or if it is null to the task exception port. This allows for debugging/recovering from errors.

The API documentation can be found here, but since it is quite terse, I will detail the IPC mechanism and thread creation through examples.

A word of warning: it seems the documentation is not completely up to date when it comes to return values. Indeed, after looking at the source code of Mach, it turns out some functions have more return values than described. Just remember that every function returns a `kern_return_t`, which is `KERN_SUCCESS` in case of success, and another value in case of error. I won’t dig into error handling beyond acknowledging that errors exist, so it won’t bother us in the next examples. Also, despite the fact that I use `perror`, Mach calls do not set `errno`.

Here I will detail how to create a port, send an integer to it, and receive this integer back, all from the same thread. This is quite a cheap example, but even basic IPC is not so trivial, so it is worth focusing on first. You can find the code there. I will assume you’re looking at the code while reading the following, so I will not reproduce it here.

First we need to create a port. As said before, a port right is just an integer, but Mach defines the alias `mach_port_t` for it. The call for creation is `mach_port_allocate`, documented here. It expects three arguments. The third one is used to return the created port. The first one is the current task: indeed, we said before that port rights are task specific, so we allocate a port for a given task, and there is a call just for that: `mach_task_self()`, which returns the current task. Finally, we specify which rights we want on the port. We want to send and receive, but since send rights can be derived from receive ones, we create it with a receive right, using the constant `MACH_PORT_RIGHT_RECEIVE`.
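As a sketch of what this first step might look like (the exact code lives in the linked repository; this fragment only compiles on a Mach system, and the error handling is deliberately minimal):

```
#include <mach.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    mach_port_t port;
    kern_return_t err;

    /* Allocate a fresh port in the current task, holding its receive right. */
    err = mach_port_allocate(mach_task_self(),
                             MACH_PORT_RIGHT_RECEIVE,
                             &port);
    if (err != KERN_SUCCESS) {
        fprintf(stderr, "mach_port_allocate failed: %d\n", err);
        exit(EXIT_FAILURE);
    }
    printf("allocated port: %u\n", (unsigned) port);
    return 0;
}
```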

Now we want utility functions to send and receive integers through a port. Let us first focus on sending: we want a `send_integer` function which takes a port and an integer, and sends the integer on the port.

But we cannot send just anything on a port: messages must follow a certain format. A message must start with a `mach_msg_header_t`, followed by information about its type (in either short or long form; we will use the short form, a `mach_msg_type_t`), and then the content of the message, in our case an integer. Instead of creating buffers of the right size and copying structures into them, we can directly define our own structure with the right members.
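Under these assumptions, the message type could look like this (the field types come from the Mach headers; the struct name is ours):

```
#include <mach.h>

/* A message carrying a single integer: header, short type
   descriptor, then the payload itself. */
typedef struct {
    mach_msg_header_t head;
    mach_msg_type_t   type;
    int               value;
} integer_message_t;
```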

When sending a message, it needs both the right header and the right type information. Let us focus on the header first. The header carries data about the transfer itself rather than about the message content. It must hold the size of the whole message (header and type information included) in its `msgh_size` field. The destination port goes in `msgh_remote_port`. A message can also specify another port in `msgh_local_port`, often used as a reply port; in our case we don’t need it, so we set it to `MACH_PORT_NULL`. Finally, we specify which rights apply to these ports in the `msgh_bits` field. We use the macro `MACH_MSGH_BITS_REMOTE` to say that the given rights apply to the remote port, and `MACH_MSG_TYPE_MAKE_SEND` to create a send right from the receive right (remember, we created the port with a receive right precisely because send rights can be derived from it) and use it to enqueue the message.

Then on to the type header. This header is used by the receiver to determine the kind of data sent. Sending and receiving the message would still work without it, but it is necessary when sending more complex data (like port rights, or out-of-line data), so it is a good idea to learn how to set it. Most of the fields are straightforward, like `msgt_size` or `msgt_number`. `msgt_name` is just a classifier of the data, mostly unnecessary unless you send port rights to another task, in which case Mach uses it to translate them for the receiving task. `msgt_inline` tells whether the data is inside the message, or whether the message only contains a pointer to the data. This field is particularly important when communicating with another task, which has a different memory space, as Mach will interpret it, copy the data into the other task’s memory space, and update the pointer transparently. `msgt_deallocate`, in the out-of-line case, tells Mach to free the data from the sender’s memory space. Finally, `msgt_longform` specifies whether the type header is the long one or not.

Now that we’ve got our well-formatted message, we can proceed to send it. The magic function doing everything (which we will also use for reception) is the `mach_msg` procedure. Its first argument is the message itself (it expects a pointer to a header, but since the message must start with its header, either pointer is the same). The second argument specifies the kind of transfer. The third is the size we want to send (here, the size of the whole message). The fourth is the size we want to receive (since we’re only sending, we set it to 0). Then comes the port on which to receive (we’re not receiving, so `MACH_PORT_NULL`), the timeout in case the queue is full (we won’t use it here), and finally a last argument used for certain modes of communication, which I didn’t fully understand, so we set it to `MACH_PORT_NULL` as well.
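Putting the header, the type descriptor and the `mach_msg` call together, a `send_integer` along these lines is plausible (a sketch under the assumptions above, not the repository’s exact code; `integer_message_t` is our own name):

```
#include <mach.h>

typedef struct {
    mach_msg_header_t head;
    mach_msg_type_t   type;
    int               value;
} integer_message_t;

kern_return_t send_integer(mach_port_t port, int value)
{
    integer_message_t msg;

    /* Header: describe the transfer itself. */
    msg.head.msgh_bits = MACH_MSGH_BITS_REMOTE(MACH_MSG_TYPE_MAKE_SEND);
    msg.head.msgh_size = sizeof msg;           /* whole message */
    msg.head.msgh_remote_port = port;          /* destination */
    msg.head.msgh_local_port = MACH_PORT_NULL; /* no reply port */

    /* Type descriptor: one inline 32-bit integer, short form. */
    msg.type.msgt_name = MACH_MSG_TYPE_INTEGER_32;
    msg.type.msgt_size = 32;    /* bits per element */
    msg.type.msgt_number = 1;   /* one element */
    msg.type.msgt_inline = TRUE;
    msg.type.msgt_longform = FALSE;
    msg.type.msgt_deallocate = FALSE;

    msg.value = value;

    /* Send only: nothing to receive, no timeout. */
    return mach_msg(&msg.head, MACH_SEND_MSG, sizeof msg, 0,
                    MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
}
```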

Here we are: you can now almost fully understand the `send_integer` procedure.

The `receive_integer` one is much easier, since the message will be filled in by the reception. Well, this is not completely true: `mach_msg` uses the `msgh_size` field of the header to determine how much room there is in the message to fill. Note that this field may be changed according to the amount of data actually received.
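A matching sketch of `receive_integer`, with the same caveats and the same assumed `integer_message_t` layout:

```
#include <mach.h>

typedef struct {
    mach_msg_header_t head;
    mach_msg_type_t   type;
    int               value;
} integer_message_t;

kern_return_t receive_integer(mach_port_t port, int *value)
{
    integer_message_t msg;
    kern_return_t err;

    /* Tell mach_msg how much room it may fill. */
    msg.head.msgh_size = sizeof msg;

    /* Receive only: nothing to send, no timeout. */
    err = mach_msg(&msg.head, MACH_RCV_MSG, 0, sizeof msg,
                   port, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    if (err == KERN_SUCCESS)
        *value = msg.value;
    return err;
}
```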

We can now easily use these functions on our newly created port to send and receive our favorite integer!

The code for this project is there. I kept the messaging utilities we’ve seen in the previous section to test our new thread. This code is not completely functional: for some reason, any Hurd system call like `printf` or `scanf` fails in the new thread, so we’re limited to Mach IPC in it.

Here is the standard procedure to create a working thread on Hurd: we create the thread, which starts paused and without state. We then set the state and resume the thread. Creating the thread is straightforward with the `thread_create` function, and resuming it is just as easy with `thread_resume`. Setting the state is where the difficulty lies.

Indeed, a thread’s state is the value of its registers, and this is where the code we’re writing becomes architecture specific. Thankfully, Mach has utilities to describe the register values: the constant `i386_THREAD_STATE_COUNT`, which holds the number of state words to set, and the `struct i386_thread_state` structure, which lets us easily set the values of the registers.

Now let’s focus on which registers we want to set. First of all, there is the `eip` register, which holds the pointer to the next instruction. This one is easy: since the new thread shares the memory space of the old one, we simply set it to the pointer to the function we want to execute. We then need to set the stack pointer `uesp` and the frame pointer `ebp`. It is sufficient to set `ebp` to zero. But to set `uesp`, a question arises: where do we create our stack, and how?

To create the stack, we simply allocate a big enough block of memory. Let’s use this occasion to see some of the functionality of Mach’s virtual memory. I decided (more or less arbitrarily: it copies what Hurd’s cthreads library does) to set the stack size to 16 pages. To find the page size, Mach defines a global variable `vm_page_size`, which is 4KB on my 32-bit QEMU Hurd installation. The memory is allocated with `vm_allocate`. The first argument is the task for which the memory must be allocated; the second is used to return the pointer to the newly allocated memory; the third is the size we want (which may be rounded up if it is not a multiple of the page size); and the fourth says whether we care where it is allocated (FALSE, in which case it will try to allocate at the position specified by the second argument), or whether it can be allocated anywhere.

Once we’ve got our chunk of memory, two things remain to do: set the right values at the top so that the function sees its arguments, and protect the bottom of the stack to detect stack overflows. This second step is not strictly necessary, but it lets us use Mach’s virtual memory protection mechanism. Since permissions can only be set on whole pages, we will protect the bottom page of the stack from both reading and writing. The function to do so is `vm_protect`. The first argument is the task whose memory must be protected, the second the address of the memory, and the third the size (which may be rounded to a multiple of the page size). According to my understanding, you can set either the current permissions or the maximum permissions allowed, and this is controlled by the fourth argument, here set to FALSE.

Now that we’re protected from stack overflows, we need to set up the stack so that the function can be called in the thread. On 32-bit x86, all arguments are passed on the stack. The stack pointer of the function must point to the slot holding the return address, with the arguments above it, the first one closest to the return address. More information on Wikipedia. Once the stack is ready, we can set `uesp` and resume the thread.
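To make the whole procedure concrete, here is a hedged sketch, only compilable on a Mach system. The flavor constant `i386_THREAD_STATE` and the global `vm_page_size` are my assumptions from the Mach headers; check them against your installation:

```
#include <mach.h>
#include <mach/thread_status.h>
#include <string.h>

#define STACK_PAGES 16

/* Sketch: start `routine(arg)` in a new thread of the current task. */
kern_return_t start_thread(void (*routine)(void *), void *arg)
{
    thread_t thread;
    vm_address_t stack;
    vm_size_t size = STACK_PAGES * vm_page_size; /* assumed global */
    struct i386_thread_state state;
    unsigned *sp;
    kern_return_t err;

    /* The new thread is created paused and stateless. */
    err = thread_create(mach_task_self(), &thread);
    if (err != KERN_SUCCESS) return err;

    /* Allocate the stack anywhere in our address space. */
    err = vm_allocate(mach_task_self(), &stack, size, TRUE);
    if (err != KERN_SUCCESS) return err;

    /* Forbid all access to the bottom page to catch stack overflows. */
    err = vm_protect(mach_task_self(), stack, vm_page_size,
                     FALSE, VM_PROT_NONE);
    if (err != KERN_SUCCESS) return err;

    /* cdecl layout: return address on top, then the arguments. */
    sp = (unsigned *) (stack + size);
    *--sp = (unsigned) arg; /* first (and only) argument */
    *--sp = 0;              /* fake return address: the routine must
                               call thread_terminate on itself */

    memset(&state, 0, sizeof state);
    state.eip = (unsigned) routine; /* next instruction */
    state.uesp = (unsigned) sp;     /* stack pointer */
    state.ebp = 0;                  /* frame pointer */

    err = thread_set_state(thread, i386_THREAD_STATE,
                           (thread_state_t) &state,
                           i386_THREAD_STATE_COUNT);
    if (err != KERN_SUCCESS) return err;

    return thread_resume(thread);
}
```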

We now have a utility function to start a thread. But what happens when the thread ends? We set the return address to 0, so of course it will segfault; there is no good place to return to, we simply want the thread to end. Well, Mach provides the `thread_terminate` procedure. The easiest way around the problem is to call it on `mach_thread_self()` at the end of the routine. For those of you who find this solution too intrusive, feel free to create a wrapper function to launch instead, taking the pointer to the routine as an argument.

Now our system to start threads is complete. Awfully complicated, isn’t it? Well, Hurd provides a library, cthreads, allowing easy use of threads on Mach. But that wouldn’t have been as fun, would it?

We have seen, through minimal examples, how to use IPC and threads in Mach. Here I will propose a small exercise to put it all together.

Kahn networks are a theoretical model of computation: a set of deterministic units able to communicate. Here every unit will be a thread, communicating through ports. The sieve of Eratosthenes is an algorithm that finds prime numbers by filtering out the multiples of every prime already found.

The idea of our implementation is the following: one thread writes every integer starting from 2 into a port. The main thread, every time it receives an integer, prints it, and inserts between itself and the previous thread a filter that passes along only the received integers that are not multiples of the prime just found. At any moment, the number of threads in the task is the number of primes found plus 2.

You now have all the cards you need to make it work. Have fun ! The solution is here.

I hope to have given you enough knowledge to understand the documentation of Mach without too much trouble. Still, there are a number of points left unexplained:

- How to catch exceptions (segfaults, division by zero, …) from the created threads.
- How to handle hardware devices.
- How to create a memory manager for Hurd.
- Perhaps more importantly, how to use MIG, the Mach Interface Generator, to ease the creation of communication through ports.
- How to use the Mach debugging features.
- And probably a lot more.

February 28, 2017

Hurd is an interesting piece of software, and digging in may be an interesting way of learning the ins and outs of operating systems. But you can’t just download the source code, compile it and run it as you would with most programs.

Here I talk about setting up an environment for working on *user space*, i.e. not the microkernel. In this post I will cover how to install Hurd in a virtual environment, how to use it, and how to set the whole thing up to ease the writing and testing of software on Hurd.

Hurd in itself is not an operating system but only a kernel, and (like Linux) it can’t be used on its own. So what do we install? Well, just as there are Linux distributions, there are Hurd distributions, though fewer of them:

- Debian GNU/Hurd, a port of the Debian GNU/Linux distribution to Hurd. It is the official distribution.
- ArchHurd, a port of the ArchLinux distribution.
- NixHurd, although it is more experimental and can only run in Qemu.

Since we are only interested in working on Hurd itself, I propose installing Debian GNU/Hurd, as it is the most stable of the three.

To ease the testing of the system, the installation will be done in a Qemu virtual machine. To begin, you must install the `qemu` package (or `qemu-kvm`) on your system and enable KVM in your Linux kernel to speed up the emulation.

You must now set up Qemu. First of all, you must create a virtual disk on which Hurd will be installed. The command to do so is:

`qemu-img create file.img size`

where `size` is for example 3G (this capacity is indeed enough for what we will do).

Once this is done, you can download the ISO with the following command:

`wget http://people.debian.org/~sthibault/hurd-i386/installer/cdimage/cd-1.iso`

You can then launch the system with:

`qemu-kvm -m 1G -drive cache=writeback,file=file.img -cdrom cd-1.iso`

The virtual machine will then boot the live CD, and you will get a GRUB menu where you can choose the installation process. Go for the `Pseudo-graphical install`, which will give you a standard ncurses interface.

The installation process is well detailed: if you have ever installed a Linux distribution, it shouldn’t trouble you. Leaving the default options is fine in most cases (the only trouble I had was that it failed when asked to install a graphical environment, so I didn’t install one).

Once this is finished, shutdown the virtual machine.

For now, log in as root to set up the environment. The user you created will be used later.

First of all, we will need to upgrade the installed software, but simply running `apt update` failed for me. The reason was that it first tried to update from the CD. So the first step is to edit the `/etc/apt/sources.list` file and comment out (or delete) the lines starting with `deb cdrom`.

Once you’ve done this, run: `apt update && apt upgrade`

You can now use `apt install` to install any software you might want. To turn the virtual machine off, first use the `shutdown -h 0` command and close the Qemu window once you see the `running in tight loop` message.

The workflow advocated by the official page is to develop and compile directly in Hurd. The compiling part is necessary (unless you manage to set up a cross-compilation toolchain from Linux to Hurd), but if like me you don’t want to have to set up your whole programming environment in a virtual machine, you may want to at least do the programming part on your host system.

The workflow I will present is thus the following: you work on your host system, send the files to the virtual machine, and compile and run there. Since Qemu does not support sharing a disk with the host, another solution must be found. You could use git to synchronize the files, but it would mean committing every change you want to test. Another solution would be to set up a filesystem-sharing server on your host and connect to it from Hurd. But Hurd doesn’t seem to have an sshfs implementation, and despite having an nfs implementation, it seems to be quite experimental.

We will use the fact that Hurd comes with a working ssh server to use rsync. While Qemu automatically sets up networking so that the virtual environment has access to the internet, and Hurd connects automatically, we will need to pass a flag to Qemu so that we can connect from the host to the guest. Remember that due to the specific nature of ping, you cannot ping from the guest OS; thus ping failing does not mean you are not connected.

This flag is `-redir tcp:iph::ipg`: it will redirect any connection to port iph on the host’s localhost to port ipg on the guest. For example, if you launch Qemu with the command:

`qemu-kvm -m 1G -drive cache=writeback,file=file.img -redir tcp:2222::22`

you can ssh into your Hurd installation with `ssh -p 2222 root@localhost`. This may be an easier way to work in Hurd than through the Qemu window.

Since Qemu 2.6 complains about `-redir` being deprecated, it may be a good idea to replace it. The new option is `-net nic -net user,hostfwd=tcp::2222-:22`.

If you only want to work on your virtual machine through ssh, you may want to launch Qemu without a graphical interface (for example if you launch Qemu on a distant headless computer). To do that, append the `-nographic` option to the previous command. If you want to keep the output in the terminal, you can replace `-nographic` with `-curses`.

In order for rsync to work, it must also be installed on the guest. Thus run `apt install rsync` in Qemu. You can then synchronize any directory with:

`rsync -r --rsh='ssh -p 2222' dir user@localhost:/path/`

You will get a shitload of errors due to operations not being implemented in Hurd’s version of ext2, but the files will still be synchronized.

All that is left is to install the right software in Hurd to build the translators and run them. Since Hurd allows any translator to run in user space, we will use the user created during the installation.

To build the sources, you first need to install some tools and their dependencies. To install the tools, run the following command:

`apt install build-essential fakeroot git`

And for the dependencies: `apt build-dep hurd`.

Then you must download the sources. They are managed with git, so all you have to do is clone their repositories. The links are given on this page. Once you have all the sources you want, you can copy them to Hurd using rsync as seen above.

To compile the source code, there are a few steps. First of all, you must generate the configure script: to do so, run `autoconf` in the Hurd directory. Then create a build directory, move into it and call the configure script from there. Then all that’s left is to run `make`. You can pass make the name of a submodule to build only that one.
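Concretely, the steps above look something like this (a sketch to be run inside the Hurd guest; the source path and the `proc` submodule are illustrative):

```shell
cd ~/hurd                # the cloned source tree (path illustrative)
autoconf                 # generate the configure script
mkdir build && cd build
../configure             # configure from a separate build directory
make                     # build everything
make proc                # or build a single submodule, e.g. proc
```

Building out of tree keeps the git checkout clean, which makes the rsync round-trips from the host less noisy.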

You now have a working environment to hack Hurd and contribute. Have fun !