Modifiable Temporal Unit Problem (MTUP) and Its Effect on Space-Time Cluster Detection
Citation: Cheng T, Adepeju M (
Tao Cheng 0
Monsuru Adepeju 0
Tobias Preis, University of Warwick, United Kingdom
0 SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, University College London, Gower Street, WC1E 6BT London, the United Kingdom
Background: When analytical techniques are used to understand and analyse geographical events, adjustments to the datasets (e.g. aggregation, zoning, segmentation etc.) in both the spatial and temporal dimensions are often carried out for various reasons. The 'Modifiable Areal Unit Problem' (MAUP), which is a consequence of adjustments in the spatial dimension, has been widely researched. However, its temporal counterpart is generally ignored, especially in space-time analysis. Methods: In analogy to MAUP, the Modifiable Temporal Unit Problem (MTUP) is defined as consisting of three temporal effects (aggregation, segmentation and boundary). The effects of MTUP on the detection of space-time clusters of crime datasets of Central London are examined using Space-Time Scan Statistics (STSS). Results and Conclusion: The case study reveals that MTUP has significant effects on the space-time clusters detected. The attributes of the clusters, i.e. temporal duration, spatial extent (size) and significance value (p-value), vary as the aggregation, segmentation and boundaries of the datasets change. Aggregation could be used to find the significant clusters much more quickly than at lower scales; segmentation could be used to understand the cyclic patterns of crime types. The consistencies of the clusters appearing at different temporal scales could help in identifying strong or 'true' clusters.
In recent years, the advancement in geographical data
collection techniques (e.g. Computer Aided Dispatch Systems
(CAD), portable sensors etc.) has brought about exponential
growth in the availability of geographic data at small space and
time scales. This trend of data availability is now observed in many
application domains including criminology, epidemiology, and
transport, to mention but a few. The time stamp in these datasets
provides opportunities to mine intrinsic properties of spatial events
in relation to time. Hence, attention is shifting from purely spatial
analysis to space-time analysis. Research efforts are now focussing
on developing techniques to mine the space-time complexities
within the datasets in order to further understand the dynamics
underlying geographic events [1,2].
Observations of discrete geographic data are usually made at
point locations, but are often aggregated into areal units for
various reasons, such as confidentiality of individual records, data
summary or to fit into an existing zoning system (e.g. districts,
service areas, police beats etc.). Spatial aggregation, however,
requires consideration of problems such as the Modifiable Areal
Unit Problem (MAUP) and the ecological fallacy, which have been
widely discussed in the literature [3–6]. Recently, the term MTUP
(Modifiable Temporal Unit Problem) has been mentioned in a
number of studies in analogy to MAUP [7,8], with major focus on
temporal aggregation (scales) and its effects on statistical inference
[9–13]. However, other issues relating to the temporal dimension,
such as the manner in which the temporal dimension is divided
(segmentation) or adjustments to the temporal extent (boundary) of
a time series, have received less attention.
Analogous to the zonation effect in the spatial dimension [14],
temporal segmentation may be viewed as the situation whereby
the analyst is open to a number of choices as to how the temporal
dimension can be discretised into temporal units. Commonly used
implementations of segmentation in large databases were
examined in [15], and found to often produce disparate results. One
important factor affecting the frequency distribution of a
segmented dataset is the selection of the starting phase of temporal
segmentation. It was further demonstrated that the selection of the
starting phase of temporal segmentation influences the estimation
of regression model parameters [8]. In discrete data segmentation,
for example, midnight or midday may be considered as the
starting point of daily observations, while weekly aggregation may
start from Sunday or Monday. In any case, the basic statistical
estimates such as mean, variance and so on are bound to change
[16].
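The effect of the starting phase can be illustrated with a minimal sketch. The daily counts below are hypothetical values invented for illustration (they are not taken from the crime datasets used in this study), and the helper name weekly_totals is likewise an assumption. The same series is grouped into 7-day units twice, once starting on day 0 (taken here to be a Sunday) and once starting on day 1 (Monday), and the basic statistics are compared.

```python
from statistics import mean, pvariance


def weekly_totals(daily_counts, start_offset):
    """Sum daily counts into consecutive 7-day units.

    Segmentation begins at index `start_offset`; days before the offset
    and any incomplete trailing week are discarded.
    """
    usable = daily_counts[start_offset:]
    n_weeks = len(usable) // 7
    return [sum(usable[i * 7:(i + 1) * 7]) for i in range(n_weeks)]


# Hypothetical daily event counts for 29 days; day 0 is assumed to be a Sunday.
daily = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9,
         3, 2, 3, 8, 4, 6, 2, 6, 4, 3, 3, 8, 3, 2]

sunday_weeks = weekly_totals(daily, start_offset=0)  # weeks run Sunday to Saturday
monday_weeks = weekly_totals(daily, start_offset=1)  # weeks run Monday to Sunday

for label, weeks in (("Sunday start", sunday_weeks), ("Monday start", monday_weeks)):
    print(label, weeks, "mean:", mean(weeks), "variance:", pvariance(weeks))
```

Although the underlying events are identical, shifting the start of the week by a single day changes the weekly totals, and therefore their mean and variance. Any analysis built on such units inherits this sensitivity.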
Figure 1. Modifiable Temporal Unit Problem (MTUP): (a) temporal aggregation, (b) temporal segmentation, (c) temporal boundary.
doi:10.1371/journal.pone.0100465.g001
The boundary problem is a concept mostly associated with the
spatial dimension [17]. However, it was argued that the boundary
problem occurs not only in horizontal boundaries but also in
vertically drawn boundaries such as time, depth and temperature
[18]. In temporal data, the boundary is the temporal frame within
which observations of a proce (...truncated)