Complex networks and public funding: the case of the 2007-2013 Italian program
Nicotri et al. EPJ Data Science (2015) 4:8
DOI 10.1140/epjds/s13688-015-0047-z
REGULAR ARTICLE
Open Access
Stefano Nicotri1*, Eufemia Tinelli2,3, Nicola Amoroso1,2, Elena Garuccio4 and Roberto Bellotti1,2
*Correspondence: 1 Istituto Nazionale di Fisica Nucleare - Sezione di Bari, via Orabona 4, Bari, I-70125, Italy
Full list of author information is available at the end of the article
Abstract
In this paper we apply techniques of complex network analysis to data sources
representing public funding programs and discuss the importance of the considered
indicators for program evaluation. Starting from the Open Data repository of the
2007-2013 Italian Program Programma Operativo Nazionale ‘Ricerca e Competitività’
(PON R&C), we build a set of data models and perform network analysis over them.
We discuss the experimental results obtained, outlining interesting new perspectives
that emerge from applying the proposed methods to the socio-economic
evaluation of funded programs.
Keywords: public funding; open data; complex networks; program evaluation
1 Introduction
Since the last years of the past century, the practice of basing policies on evidence, data,
and analysis has spread quickly all over the world. The Evidence-Based Policy movement
[–] has grown enormously, and nearly all public administrations now focus on
maximising utility and take a pragmatic, problem-solving approach to socio-economic
issues []. In this respect, the evaluation of public funding programs is a field of great
interest for policymakers and economists. Politicians and technicians need to estimate the
impact that funding has on life and society in order to shape future programs and to
revise their decisions. Many standard and advanced statistical methods are commonly
used for this purpose, such as linear and nonlinear regression, Bayesian inference, machine
learning, and data mining. In this paper we suggest new indicators, derived from
network analysis, that can help to highlight quantitatively important effects that are
usually not considered because they lie outside the domain of investigation of standard
statistical tools. This certainly does not mean that program evaluation cannot be performed
without network analysis, but rather that such techniques can hopefully yield valuable
insight into public funding programs and help to increase the objectivity of the
extracted results. Recently, a growing interest in complex network
analysis applied to evaluation can be seen both in the literature [–] and in institutional
reports []. The indicators we suggest can be used by experts in program evaluation in
their analyses, giving them the opportunity to consider and quantitatively measure
important features of funding programs, such as the relations between the actors involved
in them. Social network analysis is a particularly suitable tool to extract information about
© 2015 Nicotri et al. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and
indicate if changes were made.
relations among the different components of a system. Investigating the relations between
the actors participating in a program can be of interest, since it can, for example, reveal
structural contradictions in the organisation of the different levels involved []. Considering
the set of projects, research institutions and enterprises that participate in a funding
program as a complex dynamical system, it is possible to identify underlying network
structures simply by defining the edges according to relations among the components that
are of interest to the evaluator. Once the network is constructed, global and local
properties can be evaluated and discussed.
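As a purely illustrative sketch of this construction (not the data model used in the paper), the snippet below links two actors whenever they take part in the same project, then computes one local property (node degree) and one global property (edge density). The participation records, the actor names and the choice of "shared project" as the edge-defining relation are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical participation records: (project_id, actor_name).
# These are made-up values, not entries of the PON R&C repository.
PARTICIPATION = [
    ("P1", "Univ A"),
    ("P1", "Firm X"),
    ("P1", "Firm Y"),
    ("P2", "Univ A"),
    ("P2", "Firm Z"),
    ("P3", "Firm Z"),
]


def build_actor_network(records):
    """Return an adjacency map linking actors that share at least one project."""
    members = {}
    for project, actor in records:
        members.setdefault(project, set()).add(actor)

    adjacency = {}
    for actors in members.values():
        # Register every actor, so that isolated actors remain as nodes.
        for actor in actors:
            adjacency.setdefault(actor, set())
        # Assumed relation: co-participation in the same project.
        for a, b in combinations(sorted(actors), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree(adjacency):
    """Local property: number of distinct partners of each actor."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def density(adjacency):
    """Global property: fraction of possible undirected edges that exist."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    edges = sum(len(neighbours) for neighbours in adjacency.values()) / 2
    return edges / (n * (n - 1) / 2)


network = build_actor_network(PARTICIPATION)
print(degree(network))
print(density(network))
```

A different relation of interest (for instance, a shared funding line or region) would only change how the edges are added; the local and global measures would then be computed in the same way.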
From a data collection perspective, the proposed analysis can profit from currently emerging
technologies and from the precise guidelines of European governmental institutions supporting
initiatives such as Smart Cities & Communities^a and their related action goals (Urban
and Citizen App, e-Government, e-Democracy and so on). All these initiatives have produced
a large number of freely available datasets containing information collected by national
governments, which third parties are encouraged to use for their own purposes, analyse and
republish as they wish, without copyright restrictions. Recently, Open Government
Data (OGD) has been emerging as a major movement in knowledge sharing. It promotes
transparency and accountability, enables collaboration among stakeholders, and encourages novel
socio-economic activities and the growth of the so-called network economy. Starting from
the idea that without sharing information it is not possible to establish a culture of
collaboration and participation among the relevant stakeholders, the Linked Open Data (LOD)
[] movement, which provides existing data in a machine-readable format, has gained
great importance over the last years. From this perspective, LOD facilitates innovation
and knowledge creation from interlinked data, but it also introduces a level of complexity
for information management and integration. Seeking a good trade-off between data
expressiveness and the computational cost of data analysis, we have selected only an Open Data
repository without linked data and RDF^b triples. Although the main aim of this movement
is to reach the largest possible share of users, our investigation has shown that such
datasets are usually of heterogeneous quality and size, and that their analysis requires
effort in a pre-processing phase composed of typical ETL (Extract-Transform-Load) []
and data cleaning procedures. It is worth mentioning that some problems are commonly
encountered when using network analysis for evaluation, such as the concern about the anonymity
of non-aggregated data (and its possible anonymisation), or the fact that making results
public usually interferes with the structure of the network itself []. These kinds of problems
are mitigated by using Open Data, since they are public ‘by construction’.
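To make the pre-processing step more concrete, the following minimal sketch shows what a small extract-and-transform pass with data cleaning might look like. It is not the pipeline used in the paper: the semicolon-separated layout, the column names (`project_id`, `beneficiary`, `funding_eur`), the Italian-style number format and the sample rows are all assumptions introduced for illustration.

```python
import csv
import io

# Hypothetical raw export; layout, column names and values are assumptions.
RAW = """project_id;beneficiary;funding_eur
P1; Univ A ;1.000,50
P1;UNIV A;1.000,50
P2;Firm Z;250,00
P3;;100,00
"""


def clean_name(name):
    """Collapse whitespace and normalise case, so spelling variants match."""
    return " ".join(name.split()).upper()


def parse_amount(text):
    """Parse a number written with '.' for thousands and ',' for decimals."""
    return float(text.strip().replace(".", "").replace(",", "."))


def extract_transform(raw):
    """Read the raw text, drop incomplete and duplicate rows, normalise fields."""
    reader = csv.DictReader(io.StringIO(raw), delimiter=";")
    seen = set()
    rows = []
    for record in reader:
        name = clean_name(record["beneficiary"])
        if not name:
            continue  # no beneficiary: the row cannot become a network node
        key = (record["project_id"].strip(), name)
        if key in seen:
            continue  # same project/beneficiary pair already kept
        seen.add(key)
        rows.append(
            {
                "project_id": key[0],
                "beneficiary": name,
                "funding_eur": parse_amount(record["funding_eur"]),
            }
        )
    return rows


cleaned = extract_transform(RAW)
for row in cleaned:
    print(row)
```

The load step is omitted here; in practice the cleaned rows would be written to whatever store the subsequent network analysis reads from.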
The paper is organised as follows: in Section we introduce the steps composing the
schema of the overall analysis process. In Section we describe the structure of the
open data repository of the 2007-2013 Italian Program Programma Operativo Nazionale
‘Ricerca e Competitività’ (PON R&C), in order to keep the paper self-contained, and introduce the data model for network analysis; in Section we present features and properties
of the analysed network. In order to better discuss the experimental results, we distinguish
among local properties, global pr (...truncated)