# Behavior Research Methods

## List of Papers (Total 5,402)

#### Gaze tracking accuracy in humans: One eye is sometimes better than two

Most modern video eye trackers deliver binocular data. Many researchers take the average of the left and right eye signals (the version signal) to decrease the variable error (precision) by up to a factor of $$\sqrt{2}$$. What happens to the systematic error (accuracy) if the left and right eye signals are averaged? To determine the systematic error, we conducted a calibration...
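The $$\sqrt{2}$$ bound mentioned above can be checked with a short derivation. This is a sketch under assumptions that the abstract does not state: the left and right eye errors $$e_L$$ and $$e_R$$ are taken to have the same standard deviation $$\sigma$$ and correlation $$\rho$$ (all three symbols are introduced here for illustration only).

$$
\operatorname{Var}\!\left(\frac{e_L + e_R}{2}\right) = \frac{\sigma^2 + \sigma^2 + 2\rho\sigma^2}{4} = \frac{\sigma^2\,(1 + \rho)}{2}, \qquad \operatorname{SD}\!\left(\frac{e_L + e_R}{2}\right) = \sigma\sqrt{\frac{1 + \rho}{2}}
$$

For uncorrelated errors ($$\rho = 0$$) the averaged signal has standard deviation $$\sigma / \sqrt{2}$$, which is the full factor of $$\sqrt{2}$$. For positively correlated errors the gain is smaller, which is consistent with the phrase "up to".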

#### Moderation analysis in two-instance repeated measures designs: Probing methods and multiple moderator models

Moderation hypotheses appear in every area of psychological science, but the methods for testing and probing moderation in two-instance repeated measures designs are incomplete. This article begins with a short overview of testing and probing interactions in between-participant designs. Next I review the methods outlined in Judd, McClelland, and Smith (Psychological Methods 1...

#### Fast and slow errors: Logistic regression to identify patterns in accuracy–response time relationships

Understanding error and response time patterns is essential for making inferences in several domains of cognitive psychology. Crucial insights on cognitive performance and typical behavioral patterns are disclosed by using distributional analyses such as conditional accuracy functions (CAFs) instead of mean statistics. Several common behavioral error patterns revealed by CAFs are...
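As an illustration of the distributional analysis mentioned above, a conditional accuracy function can be sketched by sorting trials by response time, splitting them into equal-count bins, and computing accuracy within each bin. This is a minimal sketch, not the authors' method: the function name `conditional_accuracy_function`, the bin count, and the example data are all hypothetical.

```python
from statistics import mean


def conditional_accuracy_function(rts, correct, n_bins=4):
    """Return a list of (mean RT, accuracy) pairs, one per RT-ordered bin.

    Hypothetical helper for illustration: trials are sorted by response
    time and split into n_bins bins of (nearly) equal trial counts.
    """
    if len(rts) != len(correct):
        raise ValueError("rts and correct must have the same length")
    if not 1 <= n_bins <= len(rts):
        raise ValueError("n_bins must be between 1 and the number of trials")

    trials = sorted(zip(rts, correct))  # fastest trials first
    n = len(trials)
    caf = []
    for b in range(n_bins):
        chunk = trials[b * n // n_bins:(b + 1) * n // n_bins]
        mean_rt = mean(float(rt) for rt, _ in chunk)
        accuracy = mean(float(c) for _, c in chunk)
        caf.append((mean_rt, accuracy))
    return caf


# Made-up data in which the fastest responses are the least accurate.
example_rts = [300, 350, 400, 450, 500, 550, 600, 650]
example_correct = [0, 0, 1, 0, 1, 1, 1, 1]
example_caf = conditional_accuracy_function(example_rts, example_correct)
```

In this made-up data set, accuracy rises from the fastest bin to the slowest bin, the kind of fast-error pattern that mean accuracy alone would hide.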

#### Toward the markerless and automatic analysis of kinematic features: A toolkit for gesture and movement research

Action, gesture, and sign represent unique aspects of human communication that use form and movement to convey meaning. Researchers typically use manual coding of video data to characterize naturalistic, meaningful movements at various levels of description, but the availability of markerless motion-tracking technology allows for quantification of the kinematic features of...

#### Replication Bayes factors from evidence updating

We describe a general method that allows experimenters to quantify the evidence from the data of a direct replication attempt given data already acquired from an original study. These so-called replication Bayes factors are a reconceptualization of the ones introduced by Verhagen and Wagenmakers (Journal of Experimental Psychology: General, 143(4), 1457–1475, 2014) for the common...

#### Sit still and pay attention: Using the Wii Balance-Board to detect lapses in concentration in children during psychophysical testing

During psychophysical testing, a loss of concentration can cause observers to answer incorrectly, even when the stimulus is clearly perceptible. Such lapses limit the accuracy and speed of many psychophysical measurements. This study evaluates an automated technique for detecting lapses based on body movement (postural instability). Thirty-five children (8–11 years of age) and 34...

#### The EU-Emotion Voice Database

In this study, we report the validation results of the EU-Emotion Voice Database, an emotional voice database available for scientific use, containing a total of 2,159 validated emotional voice stimuli. The EU-Emotion voice stimuli consist of audio-recordings of 54 actors, each uttering sentences with the intention of conveying 20 different emotional states (plus neutral). The...

#### The Visual Analogue Scale for Rating, Ranking and Paired-Comparison (VAS-RRP): A new technique for psychological measurement

Traditionally, the visual analogue scale (VAS) has been proposed to overcome the limitations of ordinal measures from Likert-type scales. However, the function of VASs to overcome the limitations of response styles to Likert-type scales has not yet been addressed. Previous research using ranking and paired comparisons to compensate for the response styles of Likert-type scales...

#### Picture perfect: A stimulus set of 225 pairs of matched clipart and photographic images normed by Mechanical Turk and laboratory participants

The present study provides normative measures for a new stimulus set of images consisting of 225 everyday objects, each depicted both as a photograph and a matched clipart image generated directly from the photograph (450 images total). The clipart images preserve the same scale, shape, orientation, and general color features as the corresponding photographs. Various norms (modal...

#### Development and validation of a high-speed stereoscopic eyetracker

Traditional video-based eyetrackers require participants to perform an individual calibration procedure, which involves the fixation of multiple points on a screen. However, certain participants (e.g., people with oculomotor and/or visual problems or infants) are unable to perform this task reliably. Previous work has shown that with two cameras one can estimate the orientation...

#### Modeling competence development in the presence of selection bias

A major challenge for representative longitudinal studies is panel attrition, because some respondents refuse to continue participating across all measurement waves. Depending on the nature of this selection process, statistical inferences based on the observed sample can be biased. Therefore, statistical analyses need to consider a missing-data mechanism. Because each missing...

#### Safe and sensible preprocessing and baseline correction of pupil-size data

Measurement of pupil size (pupillometry) has recently gained renewed interest from psychologists, but there is little agreement on how pupil-size data is best analyzed. Here we focus on one aspect of pupillometric analyses: baseline correction, i.e., analyzing changes in pupil size relative to a baseline period. Baseline correction is useful in experiments that investigate the...
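Baseline correction as described above can be sketched in a few lines. The example shows two commonly used variants, subtracting the baseline or dividing by it; the truncated abstract does not say which variant the authors favor, so the function name `baseline_correct`, the use of the median, and the sample values are assumptions made purely for illustration.

```python
from statistics import median


def baseline_correct(trace, baseline_samples, method="subtractive"):
    """Express a pupil-size trace relative to its baseline period.

    Hypothetical helper: the baseline is the median of the first
    `baseline_samples` values of the trace (an assumption, not a
    recommendation taken from the source).
    """
    if not 0 < baseline_samples <= len(trace):
        raise ValueError("baseline_samples must be within the trace length")

    baseline = median(trace[:baseline_samples])
    if method == "subtractive":
        # Change in pupil size, in the original units.
        return [x - baseline for x in trace]
    if method == "divisive":
        if baseline == 0:
            raise ValueError("divisive correction needs a nonzero baseline")
        # Pupil size as a proportion of the baseline.
        return [x / baseline for x in trace]
    raise ValueError("method must be 'subtractive' or 'divisive'")


# Made-up trace: three baseline samples followed by dilation.
example_trace = [4.0, 4.0, 4.0, 5.0, 6.0]
subtractive = baseline_correct(example_trace, 3)
divisive = baseline_correct(example_trace, 3, method="divisive")
```

The divisive branch rejects a zero baseline outright; very small baselines (for example, during a blink) would still inflate the corrected values, so real data would need additional checks.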

#### General mixture item response models with different item response structures: Exposition with an application to Likert scales

This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample...

#### Correction to: Estimating effect size when there is clustering in one treatment group

Equation (26) is formatted incorrectly in the pdf version. It should appear as follows.

#### All for one or some for all? Evaluating informative hypotheses using multiple N = 1 studies

Analyses are mostly executed at the population level, whereas in many applications the interest is on the individual level instead of the population level. In this paper, multiple N = 1 experiments are considered, where participants perform multiple trials with a dichotomous outcome in various conditions. Expectations with respect to the performance of participants can be...

#### PredPsych: A toolbox for predictive machine learning-based approach in experimental psychology research

Recent years have seen an increased interest in machine learning-based predictive methods for analyzing quantitative behavioral data in experimental psychology. While these methods can achieve relatively greater sensitivity compared to conventional univariate techniques, they still lack an established and accessible implementation. The aim of the current work was to build an open...

#### Semantic ambiguity effects on traditional Chinese character naming: A corpus-based approach

Words are considered semantically ambiguous if they have more than one meaning and can be used in multiple contexts. A number of recent studies have provided objective ambiguity measures by using a corpus-based approach and have demonstrated ambiguity advantages in both naming and lexical decision tasks. Although the predictive power of objective ambiguity measures has been...

#### The Bangor Voice Matching Test: A standardized test for the assessment of voice perception ability

Recognising the identity of conspecifics is an important yet highly variable skill. Approximately 2% of the population suffers from a socially debilitating deficit in face recognition. More recently the existence of a similar deficit in voice perception has emerged (phonagnosia). Face perception tests have been readily available for years, advancing our understanding of...

#### Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees

Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are...

#### Is human classification by experienced untrained observers a gold standard in fixation detection?

Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold standard approach, deviations from the human classifications are considered to be due to mistakes of the algorithm. However, little is known about human...

#### Examining reproducibility in psychology: A hybrid method for combining a statistically significant original study and a replication

The unrealistically high rate of positive results within psychology has increased the attention to replication research. However, researchers who conduct a replication and want to statistically combine the results of their replication with a statistically significant original study encounter problems when using traditional meta-analysis techniques. The original study’s effect...

#### The Box Task: A tool to design experiments for assessing visuospatial working memory

The present paper describes the Box Task, a paradigm for the computerized assessment of visuospatial working memory. In this task, hidden objects have to be searched by opening closed boxes that are shown at different locations on the computer screen. The set size (i.e., number of boxes that must be searched) can be varied and different error scores can be computed that measure...

#### GestuRe and ACtion Exemplar (GRACE) video database: stimuli for research on manners of human locomotion and iconic gestures

Human locomotion is a fundamental class of events, and manners of locomotion (e.g., how the limbs are used to achieve a change of location) are commonly encoded in language and gesture. To our knowledge, there is no openly accessible database containing normed human locomotion stimuli. Therefore, we introduce the GestuRe and ACtion Exemplar (GRACE) video database, which contains...

#### The cognitive reflection test is robust to multiple exposures

The cognitive reflection test (CRT) is a widely used measure of the propensity to engage in analytic or deliberative reasoning in lieu of gut feelings or intuitions. CRT problems are unique because they reliably cue intuitive but incorrect responses and, therefore, appear simple among those who do poorly. By virtue of being composed of so-called “trick problems” that, in theory...