TypeTaxonScript: sugarifying and enhancing data structures in biological systematics and biodiversity research
Biology Methods and Protocols, 2024, bpae017
https://doi.org/10.1093/biomethods/bpae017
Advance Access Publication Date: 14 March 2024
Methods Article
1 Centro Nacional de Conservação da Flora–CNCFlora, Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, 22460-030, Brazil
2 Diretoria de Pesquisa Científica–DIPEQ, Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, 22460-030, Brazil
3 Embrapa Recursos Genéticos e Biotecnologia, Parque Estação Biológica–PqEB, Brasília, 70770-901, Brazil
*Correspondence address. Centro Nacional de Conservação da Flora–CNCFlora, Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, 22460-030, Brazil. E-mail:
Abstract
Object-oriented programming (OOP) embodies a software development paradigm grounded in representing real-world entities as objects,
facilitating a more efficient and structured modelling approach. In this article, we explore the synergy between OOP principles and the
TypeScript (TS) programming language to create a JSON-formatted database designed for storing arrays of biological features. This fusion
of technologies fosters a controlled and modular code script, streamlining the integration, manipulation, expansion, and analysis of
biological data, all while enhancing syntax for improved human readability, such as through the use of dot notation. We advocate for
biologists to embrace Git technology, akin to the practices of programmers and coders, for initiating versioned and collaborative projects.
Leveraging the widely accessible and acclaimed IDE, Visual Studio Code, provides an additional advantage. Not only does it support
running a Node.js environment, which is essential for running TS, but it also efficiently manages GitHub versioning. We provide a use
case involving taxonomic data structure, focusing on angiosperm legume plants. This method is characterized by its simplicity, as the
tools employed are both fully accessible and free of charge, and it is widely adopted by communities of professional programmers.
Moreover, we are dedicated to facilitating practical implementation and comprehension through a comprehensive tutorial, a readily
available pre-built database at GitHub, and a new package at npm.
Keywords: JavaScript; TypeScript; JSON; Mimosa; Node.js; taxonomy; morphology; Leguminosae; Fabaceae; Visual Studio Code; plant
Introduction
The endeavour to describe and catalogue organisms spans generations,
contributing significantly to the foundations of biological
knowledge and classification. Rooted in historical scientific
literature, the practice of representing organisms through textual
descriptions acts as a bridge connecting past and present scientific
communities [1, 2]. As the digital age dawns, traditional
methods merge with contemporary technology [2, 3].
In the present day, taxonomists and systematists often resort to
familiar text editors, like Microsoft (MS) Word, to meticulously craft
their descriptions. While some practitioners venture into
spreadsheets for structured data [4], rapid technological advancements
unveil new avenues for documentation and data organization.
Amidst this evolving landscape, untapped potential arises
through cutting-edge methodologies. While digital tools have
significantly streamlined numerous research tasks, a notable gap
persists between these contemporary solutions and their
widespread acceptance within the scientific community. In this
context, our exploration delves into the symbiotic relationship
between object-oriented programming (OOP) and document-oriented
databases (DOD). Through this lens, we foresee a
paradigm shift propelling biodiversity research into an era of
efficiency, collaboration, and innovation.
TypeScript (TS), an extension of JavaScript (JS), is a robust
choice for intricate and organized systems [5]. Combining OOP
principles with TS creates a powerful development ecosystem,
facilitating the building of a JS Object Notation (JSON) database.
Moreover, it allows multiple layers of data validation to be
incorporated, establishing a highly reliable database.
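As a minimal sketch of what "multiple layers of data validation" could look like in TS, the example below combines a compile-time layer (a closed union of allowed character states) with a run-time layer (a setter that rejects impossible measurements) before serializing to JSON. The class names `Leaf` and `Taxon`, the character states, and the values are hypothetical illustrations, not the data model of the article's package.

```typescript
// Hypothetical sketch: Leaf, Taxon, the allowed states and all values are
// illustrative assumptions, not taken from the article's package.

// Layer 1 (compile time): only these states are accepted by the type checker.
type LeafArrangement = "alternate" | "opposite" | "whorled";

class Leaf {
  arrangement: LeafArrangement | undefined = undefined;
  private _lengthCm: number | undefined = undefined;

  get lengthCm(): number | undefined {
    return this._lengthCm;
  }

  // Layer 2 (run time): reject values that cannot be real measurements.
  set lengthCm(value: number | undefined) {
    if (value !== undefined && (!isFinite(value) || value <= 0)) {
      throw new RangeError(`Leaf length must be a positive number, got ${value}`);
    }
    this._lengthCm = value;
  }

  // JSON.stringify calls toJSON, so the private field is exported under a clean key.
  toJSON() {
    return { arrangement: this.arrangement, lengthCm: this._lengthCm };
  }
}

class Taxon {
  readonly name: string;
  readonly leaf = new Leaf();

  constructor(name: string) {
    this.name = name;
  }
}

// Dot notation keeps the description readable; values are placeholders.
const species = new Taxon("Mimosa sp.");
species.leaf.arrangement = "alternate";
species.leaf.lengthCm = 4.5;

// The "database" is simply an array of described taxa serialized as JSON.
const json = JSON.stringify([species], null, 2);
```

In this sketch, a misspelled state such as `species.leaf.arrangement = "alternat"` would be rejected by the compiler, whereas a negative length would only be caught when the script runs.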
The JSON format emerges as crucial within DOD, standing out
for data structuring [6]. Diverging from spreadsheets, JSON's
versatility and hierarchy accommodate varied data types, ideal for
housing diverse biological data and annotations. This aligns with
complex domains like systematics, chemistry, ecology, reproduction,
genomics, and proteomics, often better represented through
nested hierarchies.
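To illustrate the contrast with spreadsheets, the sketch below stores one taxon as a nested JSON record and then flattens it into the column-style keys a spreadsheet would need. The interface, field names, and values are hypothetical placeholders chosen only to show the structure.

```typescript
// Hypothetical sketch: field names and values are illustrative placeholders.
interface TaxonRecord {
  name: string;
  flower: {
    calyx: { shape: string; lobes: number };
    corolla: { shape: string; colour: string[] };
  };
  annotations: string[];
}

const record: TaxonRecord = {
  name: "Mimosa sp.",
  flower: {
    calyx: { shape: "campanulate", lobes: 4 },
    corolla: { shape: "tubular", colour: ["pink", "white"] },
  },
  annotations: ["placeholder note"],
};

// Round trip through JSON text: the hierarchy and the data types survive.
const text = JSON.stringify(record, null, 2);
const restored = JSON.parse(text) as TaxonRecord;
const calyxShape = restored.flower.calyx.shape; // nested access via dot notation

// Spreadsheet-style view: every nested path becomes one column name and
// every value becomes text, so lists and numbers lose their types.
function flatten(
  value: unknown,
  prefix = "",
  out: Record<string, string> = {}
): Record<string, string> {
  if (Array.isArray(value)) {
    out[prefix] = value.join("; ");
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    for (const key of Object.keys(obj)) {
      flatten(obj[key], prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = String(value);
  }
  return out;
}

const flatRow = flatten(record);
```

The flattened row needs one column per path (for example `flower.corolla.colour`), while the nested record keeps related characters grouped under the organ they describe.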
In parallel, another approach for biological data management
involves ontologies like Gene Ontology [7, 8] and Plant Ontology
[9, 10]. These structured vocabularies connect biological concepts
intricately. Yet both their complexity and their representation
demand specialized expertise, which makes JSON's simplicity
appealing to a wider community.
Received: 28 December 2023. Revised: 19 February 2024. Editorial decision: 26 February 2024. Accepted: 12 March 2024
© The Author(s) 2024. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/),
which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial reuse, please contact
Lucas Sá Barreto Jordão1,*, Marli Pires Morim2, José Fernando A. Baumgratz2, Marcelo Fragomeni Simon3, André L. C. Eppinghaus1 and Vicente A. Calfo1
Background
JS and Node.js environment
JS [12] serves as the foundation for a wide range of modern
programming endeavours, including the management of biological
data [13–15]. Its versatility and widespread adoption have
catalysed the development of tools and platforms that harness its
capabilities.
Node.js, a runtime environment built on Chrome's V8 JS engine,
extends the potential of JS beyond the confines of web browsers
[16]. It enables the execution of JS code outside of browsers,
facilitating server-side scripting. This is particularly advantageous for
tasks involving data processing, handling API requests, and
managing databases [15]. Moreover, Node.js offers access to a wide array
of libraries and packages, expediting the development of databases
while enhancing their overall functionality.
JS and Node.js are powerful tools in the field of biological data
management. Their capabilities contribute to the development of
efficient, dynamic, and scalable databases, facilitating
advancements in biodiversity research.
OOP and TS
In the ever-evolving realm of biological data management, the
fusion of OOP principles with TS, a powerful programming
language extension, marks a significant leap forward. OOP, a
paradigm in software development, revol (...truncated)