Building OSo (RTOS) Part 1: Introduction

In this post, I give a brief introduction to OSo, its high-level architecture, system capabilities, and RTOS fundamentals while outlining the next blog posts in the series. If you prefer to learn by doing, you can check OSo’s repository on GitHub.

Why Build an RTOS?

While working at John Deere, I started working on an Embedded Linux project with little OS (operating systems) background. This project represented one of the hardest challenges in my career because it was the first time I would develop such a complex system.

I was always curious as to how operating systems worked and how they were implemented, I tackled this by reading Operating Systems: Three Easy Pieces (OSTEP) by Remzi and Andrea Arpaci-Dusseau and all the pieces began to come together, but there was something missing. I’m an engineer and I learn things by getting my hands dirty. I decided to work hands-on in the Linux kernel version 5, which was complex for someone lacking the fundamentals. But things needed to be better; I had to learn the fundamentals and go from there.

I enrolled in the CS 652 Real-Time Programming (trains) class offered at the University of Waterloo, where the main goal is to design and implement a real-time operating system. I had to unenroll from the class for personal reasons, but the seed had been planted in me.

In October 2024, I started working on my real-time operating system, OSo, during my free time.

Overview of RTOS Fundamentals

A general-purpose operating system (OS) differs from a Real-Time Operating System (RTOS) mainly on its design goal. An OS is meant to have high data throughput to be interactive while an RTOS needs to meet its deadlines.

RTOS are designed to be used mostly on embedded systems, such as automotive electronic computing units (ECUs), for electric vehicle chargers, and other embedded applications.

In OS design, there are mainly two approaches, the microkernel and the monolithic kernel. In the microkernel, a core kernel has minimal functionality, and all the services are servers, this approach helps to improve fault tolerance in the overall system, e.g., if a server fails, the rest of the system still works. In the monolithic approach, if a subsystem fails, it can crash the whole system. Also, in the monolithic kernel, the inter-process communication is less convoluted. QNX is an example of a microkernel. The Linux kernel is monolithic, although some people argue it’s a “hybrid” kernel thanks to its modularity achieved via modules that can be hooked during runtime.

In real-time systems, we have different ways to categorize all the different “flavors” of real-time requirements. For a thorough explanation, check the Chapter 1 of the Real-Time Systems book by Liu. For our purpose, we only care about hard and soft deadlines.

Some of the characteristics that an RTOS needs to have are:

Determinism: Sources of non-deterministism in computing are caches, interrupts.
Scheduling: The scheduling algorithm needs to ensure that the deadlines are met.
Time-Sensitive Operations.

OSo Goals and Constraints

The main goal of this project is to create a small real-time operating system for “real-world” embedded systems. I started from scratch to get a better understanding of things that I might take for granted, such as booting, device interaction, memory management, timing, threading and concurrency.

The first version targets the ARM Cortex-A architecture to support the Raspberry Pi 4b+, which is an accessible development board. For later versions, I will refactor the hardware-specific code out of the main kernel logic to support other architectures, such as the modern RISC-V architecture.

OSo High-Level Architecture and Design

I followed the KISS (Keep it simple, stupid!) principle to the best of my capacities. OSo has a microkernel design.

I decided to use C as the programming language for OSo because I know it well. Another hot shot at the market as of the writing of this blog is Rust; I’m keen to learn this programming language since it’s gaining a lot of traction in the Linux kernel community and embedded software development, but I didn’t want to shoot myself in the foot. Building an operating system is a complex task in itself; trying to do it with a language that you don’t have full command of would be a recipe for disaster.

OSo provides an API to application programmers that helps them create and manipulate tasks, message-passing for inter-process communication, interrupt processing for the process to await for an event, name server interface, clock server interface and input/output server interface.

For version 1.0, I made the kernel internals the simplest, i.e., the mechanisms and policies that enable users to use system resources and run programs. I decided to set a fixed number of tasks to schedule functions; OSo has a static number of priorities, where higher priority tasks execute before lower priority ones, while functions with the same priority have a round-robin untie criterion.

Final Thoughts

It goes without saying in tech, but quoting one of the greatest scientists ever, “we stand on the shoulders of giants,” and this occasion is no exception. I tried my best to design and code myself for a better understanding of core concepts, but a lot of my work wouldn’t have been possible without the guidance of different materials of smart people who worked on this task before me, such as the CS 652 material, rpi4-osdev by Adam Greenwood-Byrne, and raspberry-pi-os by Sergey Matyukevich.

I plan to write a series of blog posts relating to my work on this project. I don’t want to make this a step-by-step tutorial on how to create your own OS, there are plenty of resources out there to do that (I recommend starting by looking at the raspberry-pi-os that I pointed out previously), instead, my main goal is to improve my technical writing abilities and use these blog posts to present the design and development of my work in a structured fashion, along with the challenges, and how I overcame them.

In the following blog posts for the series, I’m going to write about bootstrapping the kernel and running the first C code, a more detailed entry about kernel design, tasks and how to handle them inside our kernel, scheduling, inter-process communication, memory and resource management, and finally, the application layer, and real-world test cases.

Last updated: November 29, 2024