KC-GEE: knowledge-based conditioning for generative event extraction
World Wide Web
https://doi.org/10.1007/s11280-023-01216-5
Tongtong Wu1,2 · Fatemeh Shiri2 · Jingqi Kang1 · Guilin Qi1 · Gholamreza Haffari2 ·
Yuan-Fang Li2
Received: 21 October 2022 / Revised: 4 September 2023 / Accepted: 1 October 2023
© The Author(s) 2023
Abstract
Event extraction is an important but challenging task. Many existing techniques decompose
it into event and argument detection/classification subtasks, which are complex structured
prediction problems. Generation-based extraction techniques lessen the complexity of the
problem formulation and are able to leverage the reasoning capabilities of large pretrained
language models. However, they still suffer from poor zero-shot generalizability and are ineffective in handling long contexts such as documents. We propose a generative event extraction
model, KC-GEE, that addresses these limitations. A key contribution of KC-GEE is a novel
knowledge-based conditioning technique that injects the schema of candidate event types
as the prefix into each layer of an encoder-decoder language model. This enables effective
zero-shot learning and improves supervised learning. Our experiments on two benchmark
datasets demonstrate the strong performance of our KC-GEE model. It achieves particularly
strong results in the challenging document-level extraction task and in the zero-shot learning
setting, outperforming state-of-the-art models by up to 5.4 absolute F1 points.
Keywords Event extraction · Information extraction · Zero-shot learning · Document-level
event extraction
1 Introduction
Event extraction [1] aims at extracting structured event records from unstructured text. For
example, as shown in Figure 1, the goal of event extraction is to map the document “Two
homemade pressure-cooker bombs are detonated remotely by the Tsarnaevs near the finish
line of the Boston Marathon, killing three and injuring some 260 others. Seventeen people
lost limbs.” to four event records of predefined event types (highlighted in celeste), such as <event type:
Attack, trigger word: detonated, role:Attacker: Tsarnaevs, . . . , role:ExplosiveDevice: bombs,
role:Place: Boston Marathon>, as well as other events that are triggered by the words killing
and injuring.
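As a rough illustration, the Attack record in this example can be held in a small data structure. The following is a minimal Python sketch, not part of the proposed model: the class name `EventRecord` and its field names are assumptions made here for illustration, and only the roles spelled out in the example are included (the roles elided by ". . ." are left out).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventRecord:
    """Hypothetical container for one extracted event (names are illustrative)."""
    event_type: str
    trigger: str
    # (role, argument text) pairs; a role may occur more than once.
    arguments: List[Tuple[str, str]] = field(default_factory=list)

    def roles(self) -> List[str]:
        """Return the role names in the order they were recorded."""
        return [role for role, _ in self.arguments]


# The Attack event from the example sentence, with the roles named in the text.
attack = EventRecord(
    event_type="Attack",
    trigger="detonated",
    arguments=[
        ("Attacker", "Tsarnaevs"),
        ("ExplosiveDevice", "bombs"),
        ("Place", "Boston Marathon"),
    ],
)
```

A document-level extractor would return a list of such records, one per event mentioned in the document.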
Figure 1 The event extraction task. Each event schema lists the event type along with its associated
roles. For instance, the "Attack" event schema includes roles such as "Attacker," "ExplosiveDevice,"
and "Place"
Event extraction is challenging due to the diversity of natural language expressions and
the complexity of event structures. These challenges are amplified in document-level event
extraction where the text is a full document and typically contains more events. Currently,
most event extraction methods employ a decomposition-based approach [2], which involves
breaking down the structured prediction problem of a complex event into classifications of
substructures like trigger detection, entity recognition, and argument classification. Many of
these methods tackle the subproblems separately, which necessitates additional annotations
for each stage [3].
Natural language generation techniques have been successfully applied to a number of
NLP tasks [4–6]. These techniques have inspired the use of controlled event generation to
tackle event extraction. Such approaches use manually designed templates to wrap input sentences and train a model for cloze-style filling. The study by [7] proposes generating linearised
event records via a pretrained encoder-decoder architecture combined with a constrained
decoding mechanism that alleviates the complexity associated with template combination
when extracting multiple events. The advantage of the extraction-as-generation approach is
that it removes the need for fine-grained token-level annotations, which are typically used in
previous event extraction approaches [8], making it more practical to apply.
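To make the idea of a linearised event record concrete, the sketch below flattens structured events into a single bracketed string that a sequence-to-sequence model could be trained to generate. The bracket format, the function name `linearise`, and the dictionary keys are assumptions chosen for illustration; they are not claimed to match the exact target format of [7] or of KC-GEE.

```python
def linearise(events):
    """Flatten a list of event dicts into one bracketed target string.

    Each event becomes "(type trigger (role argument) ...)", and the whole
    list is wrapped in one outer pair of brackets.
    """
    parts = []
    for event in events:
        head = f"{event['type']} {event['trigger']}"
        args = " ".join(f"({role} {text})" for role, text in event["arguments"])
        parts.append(f"({head} {args})" if args else f"({head})")
    return "(" + " ".join(parts) + ")"


# Illustrative input: the Attack event from the running example.
events = [
    {
        "type": "Attack",
        "trigger": "detonated",
        "arguments": [
            ("Attacker", "Tsarnaevs"),
            ("ExplosiveDevice", "bombs"),
            ("Place", "Boston Marathon"),
        ],
    }
]
target = linearise(events)
```

Because the target is an ordinary string, training needs only document and record pairs rather than token-level labels, which is the practical benefit described above.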
Although good generalizability has been achieved for other tasks, we have observed a
significant decrease in performance when it comes to generation-based event extraction over
documents or unseen event types. Structured prediction tasks, such as event extraction, often
rely on an external schema to format the output, whereas natural language generation tasks do
not. To bridge this gap, we introduce a novel technique called knowledge-based conditioning.
This approach involves injecting event type information as prefixes on different layers of the
underlying pretrained language model. By incorporating this information, we aim to improve
the performance of event extraction tasks. Additionally, to address the challenge of adapting
to new scenarios, we consider event extraction from the perspective of zero-shot learning [9,
10]. Our model, KC-GEE, is capable of document-level event extraction and is generalizable
to the zero-shot setting.
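The general idea of conditioning a layer on prefix vectors can be sketched as follows. This is a generic, simplified illustration in pure Python under stated assumptions, not the KC-GEE mechanism itself: the prefixes here are random placeholders standing in for learned, schema-derived vectors, the attention is single-head self-attention with no residual connections or feed-forward layers, and all function and variable names are hypothetical.

```python
import math
import random


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    width = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]
    return output, weights


def attend_with_prefix(query, keys, values, prefix_keys, prefix_values):
    """Prepend prefix vectors so the query can also attend to schema slots."""
    return attend(query, prefix_keys + keys, prefix_values + values)


def random_vectors(rng, count, dim):
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


def run_layers(states, layer_prefixes):
    """Apply the prefixed attention once per layer, each with its own prefix."""
    for prefix_keys, prefix_values in layer_prefixes:
        states = [
            attend_with_prefix(q, states, states, prefix_keys, prefix_values)[0]
            for q in states
        ]
    return states


rng = random.Random(0)
DIM, NUM_LAYERS = 4, 2
# One prefix slot per schema element (event type plus its roles).
schema = ["Attack", "Attacker", "ExplosiveDevice", "Place"]
# Placeholder prefixes: random stand-ins for learned, schema-derived vectors.
layer_prefixes = [
    (random_vectors(rng, len(schema), DIM), random_vectors(rng, len(schema), DIM))
    for _ in range(NUM_LAYERS)
]
token_states = random_vectors(rng, 3, DIM)  # three toy document tokens
final_states = run_layers(token_states, layer_prefixes)
_, weights = attend_with_prefix(
    token_states[0], token_states, token_states, *layer_prefixes[0]
)
```

In this toy setup every token attends over the schema slots as well as the document tokens at every layer, so swapping in the prefix for a different event schema changes what the model is conditioned on without changing the input text.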
Our main contributions are as follows.
• We propose a novel knowledge-based conditioning technique that injects event type
information into the model, enabling zero-shot learning capability.
• We carefully design a prefix-based injection mechanism that incorporates cross-attention
to improve document-level event extraction.
• We conduct extensive experiments on two benchmark datasets, in both fully supervised
and zero-shot settings. Our evaluation shows consistently strong performance across
all settings. In particular, our model performs substantially better in the challenging
settings of document-level event extraction and zero-shot transfer, outperforming state-of-the-art models by up to 5.4 absolute F1 points.
2 Related work
Document-level event extraction Event extraction is a task that extracts structured event
records from unstructured text [5]. Many approaches have been proposed for sentence-level event extraction [11, 12], ranging from hand-designed features [13] to neural-learned
features [14, 15]. Yet, many real-world applications require document-level event extraction [14–18], in which the information of an event may be mentioned in multiple
sentences [19, 20]. Moreover, most work adopts decomposition strategies in event extraction [2], which employ trigger detection [13], entity recognition [21, 22], and argument
classification [23]. These decomposition strategies have shown high performance but require more detailed annotation for model training [5, 7].
Zero-shot event extraction Several previous supervised event extraction methods have
relied on features derived from manual annotations, limiting their applicability to new event
types without additional annotation effort [9, 24, 25]. These methods often struggle to effectively generalize to new label taxonomies and domains. In contrast, [26] proposes a zero-shot
event extraction approach. They first utilize existing tools, such as (...truncated)