The Journal of Supercomputing

http://link.springer.com/journal/11227

List of Papers (Total 134)

Validation of virtualization platforms for I-IoT purposes

Virtualization deployment in the I-IoT domain is associated with many potential benefits. However, to achieve benefits from virtualization, which is nowadays a well-known technology, it is necessary to verify the usability of virtualization platforms in the context of I-IoT. In this article, we present a quantitative comparison of two leading open-source hypervisors, XEN and KVM, focusing on...

Real-time tsunami inundation forecast system for tsunami disaster prevention and mitigation

The tsunami disasters that occurred in Indonesia, Chile, and Japan have inflicted serious casualties and damaged social infrastructures. Tsunami forecasting systems are thus urgently required worldwide. We have developed a real-time tsunami inundation forecast system that can complete a tsunami inundation and damage forecast for coastal cities at the level of 10-m grid size in...

Improving all-reduce collective operations for imbalanced process arrival patterns

Two new algorithms for the all-reduce operation optimized for imbalanced process arrival patterns (PAPs) are presented: (1) sorted linear tree and (2) pre-reduced ring. A new method of online PAP detection, including process arrival time estimation and the distribution of these estimates between cooperating processes, is also introduced. The idea, pseudo-code, implementation details, benchmark...

Incentive-aware virtual machine scheduling in cloud computing

As cloud computing is a market-oriented utility, optimal virtual machine (VM) scheduling in cloud computing should take into account the incentives for both cloud users and the cloud provider. However, most existing studies on VM scheduling consider the incentive for only one party, i.e., either the cloud users or the cloud provider. Very few related studies consider the...

Security threats to critical infrastructure: the human factor

In the twenty-first century, globalisation has made corporate boundaries invisible and difficult to manage. The macroeconomic transformation caused by globalisation has introduced new challenges for critical infrastructure management. Having replaced manual tasks with automated decision making and sophisticated technology, we no doubt feel much more secure than we did half a century ago. As...

Scalability of a multi-physics system for forest fire spread prediction in multi-core platforms

Advances in high-performance computing have led to an improvement in modeling multi-physics systems because of the capacity to solve complex numerical systems in a reasonable time. WRF–SFIRE is a multi-physics system that couples the atmospheric model WRF and the forest fire spread model called SFIRE with the objective of considering the atmosphere–fire interactions. In systems...

Parallelization of stochastic bounds for Markov chains on multicore and manycore platforms

The author demonstrates a methodology for parallelizing the computation of stochastic bounds for Markov chains on multicore and manycore platforms. The stochastic bounds algorithm for Markov chains with sparse matrices is investigated; it requires a lot of irregular memory access. Its parallel implementations should scale across multiple threads and be characterized by a high...

A taxonomy of task-based parallel programming technologies for high-performance computing

Task-based programming models for shared memory—such as Cilk Plus and OpenMP 3—are well established and documented. However, with the increase in parallel, many-core, and heterogeneous systems, a number of research-driven projects have developed more diversified task-based support, employing various programming and runtime features. Unfortunately, despite the fact that dozens of...

Language-based vectorization and parallelization using intrinsics, OpenMP, TBB and Cilk Plus

The aim of this paper is to evaluate OpenMP, TBB and Cilk Plus as basic language-based tools for simple and efficient parallelization of recursively defined computational problems and other problems that need both task and data parallelization techniques. We show how to use these models of parallel programming to transform the source code of Adaptive Simpson’s Integration to...

Actor model of Anemone functional language

This paper describes the actor system of a new functional language called Anemone and compares it with the actor systems of Scala and Erlang. Implementation details of the actor system are described. A performance evaluation is provided for sequential and concurrent programs.

Strategy for data-flow synchronizations in stencil parallel computations on multi-/manycore systems

In this paper, an innovative strategy for data-flow synchronization in shared-memory systems is proposed. The strategy synchronizes only interdependent threads, instead of using the barrier approach, which synchronizes all threads. We demonstrate the adaptation of the data-flow synchronization strategy to two complex scientific applications...

The influence of datacenter usage on symmetry in datacenter network design

We undertake the first formal analysis of the role of symmetry, interpreted broadly, in the design of server-centric datacenter networks. Although symmetry has been mentioned by other researchers, we explicitly relate it to various specific, structural, graph-theoretic properties of datacenter networks. Our analysis of symmetry is motivated by the need to ascertain the usefulness...

Design of an accurate and high-speed binocular pupil tracking system based on GPGPUs

An efficient and robust pupil tracking system is an important tool in visual optics and ophthalmology. It is also central to techniques for gaze tracking, of use in psychological and medical research, marketing, human–computer interaction, virtual reality and other areas. A typical setup for pupil tracking includes a camera linked to infrared LED illumination. In this work, we...

Vectorized algorithm for multidimensional Monte Carlo integration on modern GPU, CPU and MIC architectures

The aim of this paper is to show that multidimensional Monte Carlo integration can be efficiently implemented on computers with modern multicore CPUs and manycore accelerators, including Intel MIC and GPU architectures, using a new vectorized version of the LCG pseudorandom number generator, which requires a limited amount of memory. We introduce two new implementations of the...

Parallelization of large vector similarity computations in a hybrid CPU+GPU environment

The paper presents the design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computing the similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on the currently widespread hybrid CPU+GPU systems targeted in the paper. The following...

On energy consumption of switch-centric data center networks

The data center network (DCN) is the core of cloud computing and accounts for 40% of energy spend when compared to the cooling system, power distribution and conversion of the whole data center (DC) facility. It is essential to reduce the energy consumption of the DCN to ensure that an energy-efficient (green) data center can be achieved. An analysis of DC performance and efficiency emphasizing the...