REPORT OF THE IAG SSG 4.190:
'NON-PROBABILISTIC ASSESSMENT IN GEODETIC DATA ANALYSIS'
IAG SCIENTIFIC ASSEMBLY 2001, BUDAPEST
Hansjörg Kutterer
DGFI Munich
Marstallplatz 8
D-80539 Munich
E-Mail: kutterer@dgfi.badw.de
1. Introduction
Geometrical and physical models can only be approximations of reality. Hence
the difference between the chosen model and the data remains
uncertain. In Geodesy, these differences are, after some pre-processing,
exclusively considered as random. Mathematically they are treated by
means of stochastics. As a consequence, this procedure is normative,
since the use of stochastic methods in turn restricts the considered
type of uncertainty to random variability of the data. Contrary to the
classical approach, there are cases in which stochastics is not the
adequate theoretical basis for handling all problem-immanent
uncertainties. Two examples may give an idea. In applications such as
Real-Time Kinematic Differential GPS, imprecision due to unknown
systematic effects is the most relevant type of uncertainty. Besides,
the common empirics-based formulation of the stochastic model in
adjustment calculus implies a source of non-random uncertainty. Thus,
it is not recommended to consider only random-type uncertainties.
To establish a general methodology for the comprehensive assessment of
uncertainty in geodetic data analysis it is necessary to identify and
to classify the uncertainties occurring in typical geodetic
applications (qualification of uncertainty in observation, modelling,
and inference). In addition, the elaboration of a proper terminology
and the compilation of a bibliography are required. Within the work of
the IAG SSG 4.190 (SSG) at least three fields of application are
considered: GPS data processing, deformation analysis, and GIS. The
relevant uncertainties have to be quantified regarding the respective
application. The main points of interest are the data handling in the
acquisition and preprocessing steps and the corresponding setup of
models. One example is the uncertainty of GPS results introduced by
different operators and different software packages.
Furthermore, it is necessary to collect and to characterize different non-standard
approaches to deal with uncertainty and to infer under uncertainty,
such as robust statistics, fuzzy theory, possibility theory, evidential
reasoning, etc., in addition to the well-known concepts of
approximation theory and stochastics. The applicability of the
different approaches to the data analysis in the mentioned fields of
geodetic interest needs to be discussed. Looking at the possible
scientific interpretations of the quantities resulting from the data
analysis it is essential to assess the corresponding (types of)
uncertainty qualitatively and numerically.
Undoubtedly, in several cases the different approaches compete with
each other. In other cases with a clear distinction between the
immanent uncertainties it is worthwhile to study the combination of
the mathematical approaches for a more adequate use in geodetic
practice. Statistics for data which are both random and imprecise is
one example.
2. Organizational notes
Up to now (April 2001) two working meetings of the SSG have been held.
The first meeting took place on April 7, 2000 in Karlsruhe, Germany.
Eleven SSG members participated with oral presentations of their SSG-related
work and discussions. The participation of E. A. Shyllon was funded by
the IAG. This is gratefully acknowledged. On this occasion it was
decided to organize an international symposium on the main topics of
the SSG's work, i.e. robust estimation and fuzzy techniques. This
symposium took place in Zurich, Switzerland, from March 12 to March
16, 2001. A proceedings volume has been edited by Carosio and Kutterer
(2001). A second SSG working meeting was held during this symposium.
Further working meetings will take place on a semi-annual or annual
basis.
3. SSG website and mailing list
The SSG maintains the website www.dgfi.badw.de/ssg4.190 which is updated
regularly. The site contains formal details (terms of reference,
objectives, list of members), information on the work of the SSG
(notes, papers, minutes of the working meetings, Zurich symposium
report, bibliography) and an SSG mailing list. Feedback and criticism
concerning the web presentation of the SSG and the contents of the
website are highly appreciated.
4. Membership structure
5. Classification of uncertainty
It is well-known that the complete procedure of (geodetic) data
management consists of data acquisition, data pre-processing
(reduction of the 'raw' data to fit the geodetic observables which
serve as an interface to the scientific models), and inference
(estimation and prediction of model parameters and derived
quantities). Finally, regarding the general objectives of geodetic
work, the obtained results are interpreted in a scientific framework.
For a general starting point of uncertainty assessment and management
in the complete procedure, several types of uncertainty have to be
distinguished. In the following, uncertainty is used as a generic
expression. For more details see Kutterer (2001).
The modelling part of data analysis has to be separated into the set-up of
the measurement or observation model (e.g., application of atmospheric
corrections) and into the set-up of the model of main scientific
interest (e.g., plate-kinematic model). A global distinction is
between uncertainties of the model (or of the concept), uncertainties
of the data (measurements, observations) and uncertainties introduced
by the estimation or inference procedures.
The classical uncertainty concept in Geodesy is based on three classes of
errors: gross errors, systematic errors, and random errors. Gross
errors have to be avoided or detected by control methods, whereas
systematic errors have to be eliminated by the observation set-up and
correction methods. The remaining errors are considered as random.
Thus, the distinction between randomness and systematics is based on
the observation frame: Only those systematic errors are eliminated
that can be modelled mathematically, whereas the others are neglected.
The decision about an observation value being biased by a gross error is
usually based on human experience, machine threshold values, or
critical values of statistical tests. Therefore, there is some
imprecision or fuzziness in the concept of gross errors. It should be
noted that the uncertainty of models or concepts is not considered
in classical Geodesy. Nevertheless, there are uncertainties of the
model because of the incomplete (human) knowledge (modelling of the
'state of the art'), necessary simplifications due to the complexity
of the real world (naming of and restriction to the relevant
characteristics), modelling of a substitute situation (discretization
of continuous objects and processes), fuzziness or imprecision of
linguistic expressions or descriptions ('gross error', 'high
temperature'), imprecision or inaccuracy of some 'known' model
parameters, ambiguity (non-uniqueness in a crisp sense), or vagueness
(non-uniqueness in a fuzzy sense, non-specificity).
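The fuzziness of the linguistic expression 'gross error' can be illustrated by comparing a crisp threshold decision with a graded membership function. The following is only a minimal sketch: the use of a normalized residual as test quantity, the function names, and the numerical bounds (3.0, 2.0, 4.0) are assumptions chosen for illustration and are not taken from the work of the SSG.

```python
def crisp_gross_error(normalized_residual, threshold=3.0):
    """Crisp decision: gross error if |w| exceeds a fixed threshold.

    The threshold 3.0 is an illustrative assumption.
    """
    return abs(normalized_residual) > threshold


def fuzzy_gross_error(normalized_residual, lower=2.0, upper=4.0):
    """Degree of membership in the fuzzy set 'gross error'.

    Membership is 0 up to `lower`, 1 from `upper` on, and linear in between.
    Both bounds are illustrative assumptions.
    """
    w = abs(normalized_residual)
    if w <= lower:
        return 0.0
    if w >= upper:
        return 1.0
    return (w - lower) / (upper - lower)


if __name__ == "__main__":
    for w in (1.5, 2.9, 3.1, 4.5):
        print(w, crisp_gross_error(w), round(fuzzy_gross_error(w), 2))
```

The residuals 2.9 and 3.1 are nearly identical, yet the crisp rule classifies them differently, whereas the membership degrees differ only slightly.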
Uncertainties of the data are due to the random selection of the data, the random
variability of the data (central limit theorems), imprecision of the
observation procedure and instruments (round-off errors, recording of
correction data), lacking reliability of the data, reduced credibility
of the data (data are recorded reliably, but their adequacy for the
modelled situation is questionable), data gaps, or lacking consistency
of data coming from different sources.
Uncertainties of the estimation or inference procedures result from simplifications
for (convenient) mathematical treatment (e.g., linearized models),
(ambiguous) choice of the optimum principle of parameter estimation,
or decisions based on discrete alternatives and on threshold values.
As a pragmatic matter, the uncertainty of the uncertainties
(uncertainty modelling) can additionally be taken into account. This
comprises the uncertainty model for the observed values, the
uncertainty model for the introduced prior information and the
uncertainty model for the scientific (geodetic) model.
6. Mathematical theories for the assessment of uncertainty
Mathematical theories which are adequate for (at least) some parts of uncertainty
modelling and handling can be separated into theories which are more
or less based on the theory of probability and theories which are
not. Approximation theory is the most fundamental approach since
uncertainty is considered in terms of approximation errors which are
minimized by minimizing a suitable measure of the distance between
model and data. Probabilistic theories are the theory of stochastics
with uncertainty modelled by means of random variables, the Bayes
theory allowing the use of stochastic (sometimes subjective) prior
knowledge (Koch, 1990), and the evidence theory (Shafer, 1976) or
theory of hints (Kohlas and Monney, 1995), respectively. These last two
theories are more or less identical. They can be understood as a
generalization of the Bayes theory; uncertain prior knowledge is
modelled and assessed using credibility and plausibility measures.
Finally, robust statistics is situated between pure
approximation theory and stochastics.
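As an illustration of how a robust estimator differs from a plain least-squares fit, the following sketch compares the arithmetic mean (the least-squares estimate of a repeatedly observed quantity) with a Huber M-estimate computed by iteratively reweighted least squares. It is a minimal sketch under assumptions: the observation values are invented, the function name `huber_location` is hypothetical, and the tuning constant k = 1.5 and the MAD-based scale are common textbook choices rather than recommendations of the SSG.

```python
import statistics


def huber_location(values, k=1.5, tol=1e-10, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted least squares.

    k is the tuning constant in units of the robust scale (assumed value).
    """
    med = statistics.median(values)
    # Robust scale: normalized median absolute deviation (MAD).
    scale = 1.4826 * statistics.median([abs(v - med) for v in values])
    if scale == 0.0:
        return med
    estimate = med
    for _ in range(max_iter):
        weights = []
        for v in values:
            u = abs(v - estimate) / scale
            # Full weight inside the band, down-weighting outside of it.
            weights.append(1.0 if u <= k else k / u)
        new_estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


# Invented repeated observations of one distance (metres); the last is an outlier.
observations = [10.02, 9.98, 10.01, 9.99, 10.00, 12.50]
mean_estimate = statistics.fmean(observations)
robust_estimate = huber_location(observations)

if __name__ == "__main__":
    print("least-squares mean:", mean_estimate)
    print("Huber estimate:    ", robust_estimate)
```

The mean is pulled noticeably towards the outlying value, while the Huber estimate stays close to the bulk of the observations because the outlier receives a small weight.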
Non-probabilistic theories are interval mathematics (Alefeld and Herzberger, 1983),
fuzzy theory (Dubois and Prade, 1980), possibility theory (Dubois and
Prade, 1988), the theory of rough sets, and artificial neural networks.
Interval mathematics allows imprecise data to be considered, whereas fuzzy
theory comprises both fuzziness (or imprecision) of the model and of
the data. The main branches of fuzzy theory are fuzzy logic and fuzzy
data analysis. The latter can be understood as a generalization of
interval mathematics. As a perspective, there are approaches which
combine probabilistic and non-probabilistic theories, e.g., by
Viertl (1996), who develops statistics for imprecise data with
extensions to Bayesian statistics.
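The relation between interval mathematics and fuzzy data analysis can be sketched in a few lines. In the following illustration an imprecise observation is represented either by an interval or by a triangular fuzzy number whose alpha-cuts are nested intervals, so that interval arithmetic can be applied level by level. The class names, the restriction to addition and subtraction, and the numerical values are assumptions made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] describing an imprecise value."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # The lower bound uses the other's upper bound and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Fuzzy number with membership 1 at `peak` and 0 outside (left, right)."""
    left: float
    peak: float
    right: float

    def alpha_cut(self, alpha):
        """Interval of all values with membership >= alpha (0 <= alpha <= 1)."""
        return Interval(self.left + alpha * (self.peak - self.left),
                        self.right - alpha * (self.right - self.peak))


# Two imprecise distances (metres, invented values).
a = Interval(99.5, 100.5)
b = Interval(49.75, 50.25)
total = a + b          # Interval(lo=149.25, hi=150.75)
difference = a - b     # Interval(lo=49.25, hi=50.75)

# The same observation as a fuzzy number: the alpha-cut at level 0
# reproduces the interval, higher levels give narrower intervals.
fuzzy_a = TriangularFuzzyNumber(99.5, 100.0, 100.5)
support = fuzzy_a.alpha_cut(0.0)
half_level = fuzzy_a.alpha_cut(0.5)
```

In this representation an interval is the special case of a fuzzy number for which all alpha-cuts coincide.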
The above-mentioned mathematical theories are (partly) different in the way of modelling
and assessing the specific uncertainty. For example, there is no
difference between approximation theory and stochastics or robust
statistics, respectively, if only a best fit is needed. But there is a
big difference if an inference-based decision (e.g., outlier
rejection) is required because a criterion has to be specified. Thus,
there is a need in geodetic data analysis for the selection of the
adequate kind of mathematics, for the definition of particular
measures of uncertainty, and for the combination of the most suitable
mathematical theories if several types of uncertainty occur in the
applications. For further information and for an extended list of
references see the SSG website. Within the SSG the main focus is on
robust statistics and on geodetic applications of both fuzzy logic and
fuzzy data analysis to handle classical model-data deviations in
general and to consider (non-random) data and model imprecision.
7. Registration of uncertainty
Aiming at the assessment and management of uncertainty in typical geodetic data
analysis it is indispensable to register, to characterize, and to
categorize the essential components and steps. The set-up of a
corresponding questionnaire is the key to the assessment of
uncertainty. It can serve as a basis for the improvement of particular
procedures in use and for the comparison of procedures.
The main steps of each geodetic data analysis are data acquisition, data
pre-processing, and inference. Besides the analysis, a general
description is needed as a frame for the questionnaire to identify the
specific application and to make the results comparable with others.
Finally, conclusions have to be drawn on the consistency of the data
processing and analysis, on the adequate treatment of the existing
types of uncertainty, and on the assessment of the data acquisition
and analysis procedure in use. Usually, the results of a data analysis
are interpreted scientifically. Thus, their genesis has to be
understood thoroughly. This means particularly the sources for and the
propagation of the immanent and the introduced types of uncertainty.
A proposed questionnaire can be found on the SSG webpages. Such a questionnaire
is recommended as a basis for the assessment of routine data analysis
such as that in the IAG data services. This could help to get a deeper
understanding of the data products to be used or interpreted.
8. Status quo and future work
Information concerning the first two items of the SSG objectives is now available:
The relevant types of uncertainty are characterized; a variety of
mathematical methods exists which are more or less elaborated for use
in geodetic data analysis. The 'First Symposium on Robust Statistics
and Fuzzy Techniques' in March 2001 in Zurich, which was organized by
the SSG, showed improvements of robust estimation techniques mainly for
geodetic networks but also for the analysis of real-time GPS phase
data. Applications of fuzzy theory to deformation analysis, to GPS
ambiguity resolution and to modelling in GIS were presented; see the
proceedings for details. Within the SSG there will be further
application-directed studies on robust statistics, Bayes theory,
interval mathematics, fuzzy theory, and artificial neural networks. A
prominent task of the SSG for the period from 2001 to 2003 is the
comparison of the applicability of different mathematical theories for
uncertainty assessment to particular data-analytical problems such as
temporal or spatial prediction.
References
Alefeld G.; Herzberger J. (1983): Introduction to Interval Computations.
Academic Press, New York.
Carosio A.; Kutterer H. (Eds.) (2001): Proceedings of the First International
Symposium on Robust Statistics and Fuzzy Techniques in Geodesy and GIS. Swiss
Federal Institute of Technology Zurich, Institute of Geodesy and
Photogrammetry - Report No. 295.
Dubois D.; Prade H. (1980): Fuzzy Sets and Systems. Academic Press, New York.
Dubois D.; Prade H. (1988): Possibility Theory. Plenum Press, New York.
Koch K. R. (1990): Bayesian Inference with Geodetic Applications. Springer,
Berlin Heidelberg New York.
Kohlas J.; Monney P.-A. (1995): A Mathematical Theory of Hints. Springer,
Berlin Heidelberg.
Kutterer H. (2001): Uncertainty assessment in geodetic data analysis. In:
Carosio A.; Kutterer H. (Eds.) (2001).
Shafer G. (1976): A Mathematical Theory of Evidence. Princeton University
Press, Princeton.
Viertl R. (1996): Statistical Methods for Non-Precise Data. CRC Press,
Boca Raton New York London Tokyo.