Although risk is pervasive in personal life as well as in public life
and commerce, little research has been conducted on how it is
linguistically expressed. Knowing the various forms in which risk is
expressed can help build computational models that extract risk mentions
automatically, with important applications such as risk intelligence
and decision support systems, and can help reduce the number and
impact of risks that materialize and harm society.
In this paper, we present a corpus-based study of the different
ways in which risk exposure is articulated in news stories. Using a
machine-learning-derived taxonomy of approximately 4,000 keywords or
phrases that indicate risk mentions, we semi-automatically extract a
random sample of N=5,000 passages discussing risk exposure of
individuals or companies, then manually review and categorize them
with respect to how risk is linguistically expressed (for example, in
phrasal form or in sentential form; in nominal or in verbal form).
We conclude by reporting the relative frequency
distribution of all forms observed in our corpus and by suggesting
directions for future work. We are not aware of previous studies investigating
the linguistic structure of how risks are expressed in news.
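
The pipeline summarized above (keyword-based candidate extraction, random sampling, and a relative frequency distribution over manually assigned form categories) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the four entries in `RISK_TERMS` are hypothetical stand-ins for the taxonomy, the function names are invented for this example, and the category labels passed to `relative_frequencies` are assumed to come from manual annotation.

```python
import random
import re
from collections import Counter

# Hypothetical stand-ins for the taxonomy of risk-indicating keywords or phrases.
RISK_TERMS = ["risk of", "threat", "exposure to", "could face"]


def find_candidates(passages, terms):
    """Return passages containing at least one term (case-insensitive, whole words)."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [p for p in passages if pattern.search(p)]


def sample_passages(candidates, n, seed=0):
    """Draw a reproducible random sample of at most n candidate passages."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))


def relative_frequencies(labels):
    """Map each category label to its share of all labels (empty input gives {})."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

In such a setup, the sampled passages would be handed to annotators, and the resulting labels (for example "nominal", "verbal", or "sentential") would be passed to `relative_frequencies` to obtain the distribution.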