OCEOS/oceos introduction

From wiki
Jump to navigation Jump to search

Introduction

OCEOS is a pre-emptive real time operating system (RTOS) with a small memory footprint intended for hard real time systems that use the GR716 micro-controller, SPARC LEON2/3/4 processors and ARM Cortex-M micro-processors.

It was developed by O.C.E. Technology with support from the European Space Agency (ESA) under project 4000127901/19/NL/AS and 4000135473/21/NL/GLC/js.

This document describes the features and use of OCEOS, and details its behavior and system calls.

  • Real time software is often written as a set of trap/interrupt handlers and tasks managed by a RTOS.
  • The trap/interrupt handlers start due to anomalous conditions or external happenings. They carry out the immediately necessary processing, and may ask the RTOS start a task to complete the processing.
  • The RTOS starts tasks and schedules them for execution based on priority. In hard real time systems scheduling must ensure that each task completes no later than its deadline. Being early can also be a problem.
  • OCEOS provides a timed output service (software timers) that allows an output be set for a precise system time independently of scheduling.
  • OCEOS supports up to 255 tasks, and up to 15 current start requests for each task.
    • Each task has a fixed priority.
    • There are 254 task priorities, 1 (highest task priority) to 254 (lowest task priority).
    • More than one task may have the same priority.
    • Tasks of the same priority are FIFO scheduled.
    • Time slicing between tasks does not occur in OCEOS.
  • In OCEOS each task has a pre-emption threshold. A task can only be pre-empted by a task with a higher priority than this threshold. Pre-emptions and any traps/interrupts that occur will delay a task’s completion and potentially cause it to miss its deadline. Careful analysis is needed to ensure that task deadlines are always met.
  • OCEOS supports this analysis and allows relatively simple determination of worst case behaviour.
  • Problems such as unbounded priority inversion, chained blocking, and deadlocks cannot occur in OCEOS.
  • OCEOS provides mutual exclusion semaphores (mutexes) to protect critical shared code or data, and inter-task communication using counting semaphores and data queues.
  • A system state variable provides a summary of the current state of the system.
  • Error conditions such as missed deadlines are logged and the system state variable updated. If the system state is not normal a user defined error handling function may be called and actions taken such as disabling a task or resetting the system.
  • OCEOS does not allow dynamic creation of tasks at run time.
  • Virtual memory is not supported.
  • OCEOS is based on the Stack Resource Policy extension of the Priority Ceiling Protocol Baker 1991.
  • OCEOS is provided as a library and is statically linked with an application.
  • Services not needed by an application are omitted by the linker.

Oceos.png

Purpose of the Software

OCEOS is an object library for target architecture which provides directives to schedule fixed priority tasks, and supporting mutexes, semaphores, data queues, timed outputs, and interrupt handling.

There are also directives for logging, error handling, GR716 memory protection and GR716 watchdog control.

OCEOS Main Features

  • Fixed priority preemptive scheduling
  • Based on Stack Resource Policy – unbounded priority inversion, chained blocking, and deadlocks cannot occur
  • Deterministic processing
  • Fast real-time performance
  • Single stack rather than separate stack for each task
  • Small code footprint ( <10 kBytes for scheduling and mutex)
  • Inter-task communication (Counting Semaphores and Data Queues)
  • Synchronisation (Mutexes)
  • Task preemption level
  • Interrupt support (most of the directives can be called from IRQ handlers)
  • Very short time for disabling interrupts
  • Very low interrupt latency
  • Nested interrupts support
  • High precision timed actions (Software Timers) for data output and task start at pricise time
  • Supports SPARC V8 LEON2/3/4 and ARM Cortex-M single core targets
  • Fault Detection, Isolation and Recovety
  • GR716 Memory protection functionality
  • Enhancements for Cobham-Gaisler GR716 applications
  • Sample Applications for Quick start
  • Online Documentation
  • Tutorial applications to demonstrate main features of OCEOS
  • DMON debug tool support (execution timeline, CPU usage and profiling, commands to display information for OCEOS modules)
  • Support & ISVV services available from OCE Technology Ltd.
  • MISRA C Compliant
  • Developed to ECSS Category B and ISO 26262 standards
  • Developed in cooperation with European Space Agency (ESA)

OCEOS Main Concepts

An application that uses OCEOS typically starts at main(), performs device initialization and other activities including setting up interrupt handlers, then starts OCEOS in two phases, an initialization phase and a scheduling phase. Tasks, mutexes etc. are only created in the initialization phase, resulting in a fixed data structure that is passed to the scheduling phase. OCEOS normally does not return to the application but will do so if it detects a fatal error.

The specific application determines the tasks, mutexes and other OCEOS components used.

The interrupt/trap handlers respond to events, do the immediately necessary processing, and typically ask OCEOS to start a task to complete the processing.

OCEOS then creates an execution instance of the task, a ‘job’, and either puts it into immediate execution or on the ready queue or the timed jobs queue waiting to be scheduled for execution.

OCEOS schedules jobs based on their priority and may pre-empt the processor from a job with a lower pre-emption threshold before that job completes in order to start a higher priority job.

OCEOS maintains a system priority ceiling based on currently locked resources and the pre-emption threshold of the currently executing job. Only jobs with higher priority than this ceiling can pre-empt.

Jobs with the same priority are scheduled according to time of arrival on the ready queue. Time slicing or pre-emption between jobs of the same priority does not take place.

A job may have a deadline by which it must complete. It is the responsibility of the software developer to assign task priorities so that all jobs are guaranteed to meet their deadlines.

In OCEOS mutual exclusion of tasks is supported by mutexes. Counting semaphores and data queues support task synchronization and the exchange of data between jobs.

Task start requests and data outputs (software timers) can be set to occur independently of scheduling at a given system time in microseconds.

OCEOS maintains a system state variable and system log. If the state variable is not normal OCEOS may call a user defined function to deal with whatever condition has occurred.

Tasks

In OCEOS all tasks are defined before scheduling begins and each task assigned a fixed priority, a pre-emption threshold, a current jobs limit, a maximum start to completion time, and an expected minimum inter start request time. A task’s pre-emption threshold limits the tasks that can pre-empt it to those with higher priority than this threshold. Tasks with short run times can use this to avoid context switch overheads.

Each task has two main states, enabled or disabled. OCEOS provides system calls to change task state. The state is usually enabled when the task is created, but can be initially disabled.
A disabled task will not be executed, and any attempt to start it will be logged and the system state variable updated.

A request to start an enabled task creates an execution instance of the task, a ‘job’. A count is kept for each task of the number of times it has been started, i.e. total number of jobs created.
Multiple jobs of the same task can be in existence simultaneously. A limit for the maximum number of these current jobs is set when a task is created, the ‘current jobs limit’.
Each task has a current jobs count. This increments each time a task is started and decrements on job completion unless completion creates a further job from this task.
A further execution instance of a task is not created if the current jobs count has reached the current jobs limit. Such job creation failures are logged and the system state updated.
For each task a record is kept of the shortest time between attempts to create a new job, and a record of the maximum time between these attempts.
For each task a record is kept also of the maximum time a job was waiting before starting, the minimum execution time, the maximum execution time, the maximum time from job creation to completion, and the maximum number of job pre-emptions. The system state variable is updated if necessary.

Task chaining is supported. A job can create a new job and pass this to the scheduler or place it on the timed actions queue.

Tasks provide a start address for a termination routine for use if task execution must be abandoned.

Jobs

The request to start a task initializes a 'job', an execution instance of the task. A parameter is provided to allow an optional structure pointer be passed to the job.
A job can require varying times to complete, with one exception its maximum execution time and maximum resource requirements are finite and known at compile time.
This one exception is the unique lowest priority ‘idle job’, which if it exists has no execution time limitation. If there is no idle task OCEOS will place the CPU in sleep mode.
Typically the ‘idle job’ is not idle, but used for system monitoring, logging, and initiating system correction activities. To save power it may put the CPU in sleep mode waiting for interrupts.

A job has two principal states, pending and active. A pending job has not started execution and is either on the ready queue or on the pending jobs queue of a semaphore or data queue, or on the timed actions queue. Only pending jobs wait on resources. In OCEOS a pending job becomes active when it first starts execution. An active job is either running on the CPU or on the ready queue after being pre-empted.
An active job may have to wait on the ready queue for the CPU but never has to wait for any other resource and is never on a pending jobs queue.

For each job a record is kept of the time it was created, the time it became active, the execution time, the time it completed, and the number of times it was pre-empted.

When a job completes, these job records are used to update the task records.

Data can be passed between jobs of the same or different tasks by using statically allocated variables, counting semaphores, or data queues.

A job may create another job and place this on the ready queue or on the timed actions queue.

If job can not continue because semaphore count is zero or data queue is empty, the job is put on pending queue for that semaphore or data queue and restarted when semaphore count is incremented or data is written to data queue. The user must take care to save information in data section, so the restarted job could continue without loss of already acquired data.

Ready Queue

This priority queue contains pending and active jobs ordered highest priority first, and within priority earliest arrival first.
The scheduler places the first job on this queue into execution if and only if its priority is higher than the system priority ceiling. When it does so it updates the system priority ceiling to the pre-emption threshold of this job.

OCEOS places pending jobs on this queue due to an interrupt or trap, or as a result of a system call from the current job, a semaphore signal or queue write, or a timeout.
Each job in the Ready Queue is tagged with an ‘origin’ giving how the job came to be placed on the Ready Queue including time and identification of any other queue from which it was removed.
There can be multiple occurrences of jobs from the same task on the Ready Queue, each with its origin. Initially this queue either contains the Idle Job, which must have the lowest priority, or is empty.

Timed Actions Queue

This is a priority queue of pending actions (software timers) and their associated times, with queue priority based on these times.
It is linked to a high priority hardware timer which is set to interrupt at the time of the first action on the priority queue.
When this interrupt occurs actions on the queue within a timing tolerance of the current time are carried out and the timer is reset to interrupt at the start time of the earliest remaining action.
If the action is a task start request then at the set time the associated job is transferred to the ready queue, its origin set to timeout, and it is also removed from the pending list of any counting semaphore or data queue on which it is present. If the job is now the highest priority job on the ready queue it is started immediately, otherwise it must wait until that is the case.
If the action is a data output then at the set time it is performed immediately.

System Priority Ceiling

The value of this integer is the maximum priority of the priority ceilings of currently locked mutexes and the pre-emption threshold of the currently executing job. It is not directly accessible to an application.

Scheduling

OCEOS schedules jobs, i.e. execution instances of tasks created as a result of task start requests.
The scheduler places a job into execution if and only if its priority is higher than the system priority ceiling. When it does so it updates the system priority ceiling to the pre-emption threshold of this job.
Scheduling is pre-emptive and based on priority. The scheduler may pre-empt the processor from a lower priority job before that job completes to allow a higher priority job be started.

Jobs with the same priority are scheduled based on the order of their arrival on the ready queue. Time slicing or pre-emption between jobs of the same priority does not occur.
A job with a very short run time can avoid context switch overheads by having a high pre-emption threshold. This also allows pre-emption by some or all higher priority tasks be prevented. Such jobs with high pre-emption thresholds are scheduled based on their priority as usual.
In OCEOS a wait for a mutex causes the system priority ceiling to be raised to the priority ceiling of the mutex unless it was already higher. The system priority ceiling returns to its previous value when the mutex is signaled.

As a result the scheduler will not start any task that waits on that mutex until the mutex has been returned, as the mutex ceiling is the priority of the highest priority task that uses the mutex.
(Instead of a higher priority job pre-empting a lower priority job but before termination returning control to a lower priority job and waiting for it to return a mutex, the wait for the mutex is done first.)

In OCEOS context switching between tasks is minimized. A lower priority job only resumes execution when a higher priority job has completed, thus allowing stack sharing.
In OCEOS problems such as priority inversion, chained blocking, and deadlocks cannot occur, all tasks can share the same stack thus saving memory, and schedulability analysis is simplified.

Unnecessary blocking will occur for a job that does not use a mutex if it has a priority lower than the mutex priority ceiling and the mutex is currently held by a lower priority job.
This un-necessary blocking may seldom occur, is limited in duration, and can be avoided by setting task priorities appropriately.

This scheduling approach is based on the Stack Resource Policy extension of the Priority Ceiling Protocol Baker 1991.

Shared Resources

Many data structures cannot be fully accessed with a single instruction. Errors can result if a job using such a structure is pre-empted by another job that also uses the structure.
To address this OCEOS provides up to 63 mutual exclusion semaphores (‘mutexes’). Each shared data structure or critical code segment typically is associated with its own mutex.
Any job that uses a shared item should first acquire its mutex from OCEOS, and when finished with the item return the mutex. The number of instructions for which the mutex is held must be finite.
OCEOS allows a mutex be held by only one job at any one time. A job that uses a shared data structure or critical code segment can use an associated mutex to exclude all other jobs.
N.B. An OS does not prevent a shared resource being accessed if no attempt is made to acquire its mutex. The software developer must ensure this is done before the shared resource is used.

Inter-Job Communication

OCEOS provides two types of inter-job communication, counting semaphores and data queues. Semaphores and data queues are handled by OCEOS in a way different to many other OS.
In most OS a job that waits on a zero semaphore or tries to read an empty data queue may be blocked at that point and wait there indefinitely or with an optional timeout.
In OCEOS an active job may be pre-empted but cannot otherwise be blocked, so three options are provided in case a counting semaphore is zero when waited on or a data queue is empty when read.

One option results in a value always being returned and the job continuing. The returned value will indicate that the semaphore was zero or queue empty, and the job can take this into account.
In the other options, if the semaphore is zero or queue empty the job terminates and will restart from its beginning when the semaphore is signaled or queue written, or after an optional timeout.

When the job becomes active again after a timeout and again encounters the second option the behavior is as with the first option, the job continues and takes into account that a timeout has occurred.

Because a job restarts when a resource becomes available rather than waiting at the point where it sought the resource any local data developed up to that point will be lost unless stored non-locally for use after the restart. If required this can be done using static variables or structures, or using data queues or counting semaphores.

In OCEOS event communication between tasks or between tasks and interrupt handlers is done typically by using counting semaphores.

Pending Jobs Queue

Each counting semaphore and data queue has an associated queue of pending jobs that are waiting for that semaphore or queue. Pending jobs also wait on the timed actions queue.
Pending jobs queues contain only pending jobs and not active jobs, i.e. only jobs waiting to start execution.
A pending job can be on a semaphore pending jobs queue or on a data queue pending jobs queue and also on the timed actions queue.

The amount of time a job can spend on a semaphore or data queue pending jobs queue is of indefinite duration unless it is also on the timed actions queue.

When a job is removed from a semaphore or data queue pending jobs queue its origin is set accordingly. It is also removed from the timed actions queue if present.

When a timeout occurs and a job is removed from the timed actions queue its origin is set accordingly and it also is removed from a semaphore or data queue pending jobs queue if present.

Mutexes do not having pending job queues. A pending job waiting for a mutex waits on the ready queue. Such waits are expected to be of short duration.

If it is necessary to pass data to a later job a job can do this before termination using static variables, static structures, data queues or counting semaphores.

All jobs on a pending jobs queue are moved to the ready queue in the order of their arrival on the pending jobs queue before any of them is scheduled. Usually only one job is waiting on a pending jobs queue, if more than one a lower priority job may have to return to the pending jobs queue because a higher priority job or an earlier job has consumed the item.