Frappe: fast fiducial detection on low cost hardware
Journal of Real-Time Image Processing
(2023) 20:119
https://doi.org/10.1007/s11554-023-01373-w
RESEARCH
Simon Jones¹ · Sabine Hauert¹
Received: 26 July 2023 / Accepted: 27 September 2023
© The Author(s) 2023
Abstract
Square fiducial markers are widely used in robotics to easily obtain pose and other information about the world from camera
images. Processing the images to extract the markers is usually performed centrally with standard libraries but the code is
typically aimed at PC-level hardware. Platforms with constrained processing power have difficulty handling multiple camera
streams at real-time refresh rates. We introduce the Frappe (Fiducial Recognition Accelerated with Parallel Processing Elements) algorithm for detecting and decoding the popular ArUco tags. Frappe is designed for the low cost hardware
of the Raspberry Pi Zero, on which we show tag detection and decoding on images of 640 × 480 resolution at over 60 Hz, five times
faster than the standard ArUco library, while maintaining similar detection performance and using much less energy. Using
Frappe, we demonstrate improved real-world performance on a visual navigation task with our DOTS robot.
Keywords Image processing · Fiducial tags · Robot vision · Embedded processing · GPU acceleration
1 Introduction
Scaling up robot numbers in real-world environments requires both lowering the cost of robots and
improving their ability to perceive and interact with the world. One approach uses cheap vision
hardware and augments the environment with markers. Square fiducial markers consisting of a grid
with a binary pattern are widely used in robot vision systems to provide pose and navigation
information from a camera image feed without the complexity and processing cost of full image
comprehension techniques such as Visual SLAM. The ArUco library is a popular choice, but its
processing cost is still significant on resource-constrained robot systems. This limits the
achievable resolution and update rate, which hinders real-time robot navigation.
* Simon Jones
Sabine Hauert
¹ Department of Engineering Mathematics, University of Bristol, Bristol, UK
The Raspberry Pi series of educational Single Board Computers (SBCs) has enabled many projects
needing a small, cheap computer running Linux. The boards are well supported and have a camera
interface supporting several models of camera. A Raspberry Pi Zero and OV5241 camera module
can be purchased for around £16, providing 1080p60 streaming video. What is not widely utilised
is the surprisingly capable Graphics Processing Unit (GPU) that all Pi models have, with around
24 GFLOPS of processing power.
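To give a feel for what this GPU figure means for image processing, the following minimal sketch divides the quoted peak throughput by the pixel rate of a 640 × 480 stream at 60 Hz. The function name is hypothetical, and the result is only a theoretical upper bound under the assumption that the full peak rate is usable; memory bandwidth and other overheads will reduce it in practice.

```python
def flops_per_pixel(width: int, height: int, rate_hz: float, peak_flops: float) -> float:
    """Upper bound on floating-point operations available per pixel per frame.

    Assumes the whole peak throughput can be spent on the image stream,
    which is an idealisation made for this sketch.
    """
    pixels_per_second = width * height * rate_hz
    return peak_flops / pixels_per_second


# Figures quoted in the text: 640 x 480 images, 60 Hz, about 24 GFLOPS peak.
budget = flops_per_pixel(640, 480, 60.0, 24e9)
print(f"{budget:.0f} FLOPs per pixel per frame")  # prints "1302 FLOPs per pixel per frame"
```

Even allowing for a large gap between peak and sustained throughput, this leaves room for a multi-stage per-pixel pipeline.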
We design an image processing algorithm called Frappe (Fiducial Recognition Accelerated with
Parallel Processing Elements) to use the Raspberry Pi (RPi) Zero GPU for as much processing as
possible. As a proof of concept, we implement Frappe on our swarm of DOTS [1] robots, designed
for intralogistics applications. By re-engineering the visual navigation system of the DOTS to
enable higher detection frame rates and resolutions than were previously possible, we improve
performance on a visual navigation task.
We make available an implementation of the algorithm and a complete Docker-based development
environment (see footnote 1). This brings together the required specialised toolchains and
provides a virtual environment for compiling GPU applications targeting the Raspberry Pi Zero.
We provide this framework so that others can make use of this underutilised processing power for
visual processing and other edge processing applications.
This paper is organised as follows: Sect. 2 covers background and related material, Sect. 3
details the algorithm and its implementation, Sect. 4 compares the performance of Frappe and
ArUco on Raspberry Pi Zero hardware and then uses Frappe in a larger robot system to improve its
performance, and Sect. 5 concludes the paper.
1 https://bitbucket.org/simonj23/frappe/src/master/
Fig. 1 Example of an ArUco fiducial marker from the ARUCO_MIP_36h12 dictionary, showing the full 8x8 region, with the outer cells always black, and the inner 6x6 36 bit data payload
Fig. 2 Raspberry Pi Zero with attached camera, costing around £16 and capable of streaming up to 1080p60 video
2 Background
Fiducial markers or tags are visually distinct objects placed in the environment to convey
information, position, or both. In robotics, the aim is often to extract pose and position from
a camera feed; in this case the fiducial must convey both accurate position and unique identity.
The most common form is a monochrome square region with an internal bit pattern. An early system
was ARToolKit [2]; widely used examples include AprilTag [3, 4], ARTag [5], and ArUco, with [6]
showing generation of dictionaries with near-optimal intermarker distance, and [7] accelerating
detection. Circular forms are also common, such as InterSense [8], STag [9], and CCTag [10].
CCTag is also designed to be resistant to occlusion and motion blur. For widely used square tags
such as ArUco, AprilTag, and ARTag, there has been work on blur-resistant decoders with
conventional [11] and machine learning approaches [12, 13]. See [14] for a recent review and
examination of the comparative detection performance and resilience of some different tag
systems. Although their results are not directly comparable with ours, they show detection rates
of 95% for ArUco in their test data. They do not directly report processing time, but state that
640 × 480 detection at 20 Hz on a Raspberry Pi 3 was possible for ARTag and ArUco, whereas
AprilTag was too computationally intensive. Regarding the speed of various detectors, [4] report
AprilTag2 at 78 ms for a 640 × 480 image on an Intel Xeon E5-2640, and [7] report ArUco at 0.9 ms
for 640 × 480 on an Intel Core i7-4700HQ.
Fig. 3 DOTS robot, fast moving and low cost with 360° vision, enabling research into swarm intralogistics
This work specifically addresses accelerating ArUco tag detection on low cost hardware, due to
our existing systems and software using this tag. Figure 1 shows an example ArUco fiducial from
the standard dictionary ARUCO_MIP_36h12, generated as described in [6]. It shows the 8x8 region
of a marker, consisting of an outer perimeter of always black cells, with an inner 6x6 region
containing the data payload. Each of the 250 unique symbols in the dictionary has a minimum
Hamming distance of 12 from all other symbols, meaning that up to 5 erroneous bits out of the
36 can be corrected unambiguously, since a code with minimum distance d can correct at most
⌊(d − 1)/2⌋ errors (Fig. 2).
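The marker layout and dictionary matching described above can be illustrated with a minimal sketch. It checks that the outer ring of an 8x8 cell grid is black, packs the inner 6x6 cells into a 36-bit integer, and picks the dictionary symbol with the smallest Hamming distance. This is not the Frappe or ArUco implementation: the function names, the row-major MSB-first bit order, the caller-supplied `max_errors` threshold, and the three-entry `TOY_CODES` dictionary are all assumptions made for illustration, and the toy codes are not real ARUCO_MIP_36h12 symbols.

```python
from typing import List, Optional, Sequence

GRID = 8                 # full marker is 8x8 cells
PAYLOAD = 6              # inner data region is 6x6 cells
BITS = PAYLOAD * PAYLOAD  # 36 payload bits


def grid_to_payload(cells: Sequence[Sequence[int]]) -> Optional[int]:
    """Pack the inner 6x6 cells into a 36-bit integer.

    Returns None if the grid is not 8x8 or any outer cell is not black (0).
    Row-major, MSB-first packing is an assumption of this sketch.
    """
    if len(cells) != GRID or any(len(row) != GRID for row in cells):
        return None
    for r in range(GRID):
        for c in range(GRID):
            on_border = r in (0, GRID - 1) or c in (0, GRID - 1)
            if on_border and cells[r][c] != 0:
                return None
    value = 0
    for r in range(1, GRID - 1):
        for c in range(1, GRID - 1):
            value = (value << 1) | (cells[r][c] & 1)
    return value


def payload_to_grid(value: int) -> List[List[int]]:
    """Inverse of grid_to_payload, used here to build test markers."""
    cells = [[0] * GRID for _ in range(GRID)]
    for i in range(BITS):
        bit = (value >> (BITS - 1 - i)) & 1
        cells[1 + i // PAYLOAD][1 + i % PAYLOAD] = bit
    return cells


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def nearest_symbol(payload: int, codes: Sequence[int], max_errors: int) -> Optional[int]:
    """Index of the closest code, or None if it is more than max_errors bits away."""
    best_id, best_dist = None, BITS + 1
    for idx, code in enumerate(codes):
        dist = hamming(payload, code)
        if dist < best_dist:
            best_id, best_dist = idx, dist
    return best_id if best_dist <= max_errors else None


# Hypothetical toy dictionary with pairwise Hamming distance of at least 12.
TOY_CODES = [0x000000000, 0x000000FFF, 0x000FFF000]

# Symbol 1 with three payload bits flipped is still matched to symbol 1.
noisy = grid_to_payload(payload_to_grid(TOY_CODES[1] ^ 0b10101))
print(nearest_symbol(noisy, TOY_CODES, max_errors=5))  # prints 1
```

Payloads that are farther than `max_errors` bits from every symbol, or grids with a damaged border, are rejected by returning `None` rather than being assigned to the nearest symbol.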
Our DOTS swarm robots [1], shown in Fig. 3, are
designed to enable research into swarm intralogistics. They
are low cost, capable of fast agile movement, able to carry
loads, and have a ROS2-based control system running on a
RockPi 4 SBC. 250 mm in diameter, they are (...truncated)