## Probabilistic algorithms (including Monte Carlo)

Under the Solvency II regime, life insurance companies are required to hold capital against economically adverse developments. This ensures that they remain continuously able to meet their payment obligations towards policyholders. Under an internal model approach, an insurer's solvency capital requirement is defined as the 99.5% value-at-risk of its full loss probability distribution over the coming year. In the introductory part of this thesis, we provide the actuarial modeling tools and risk aggregation methods with which companies can derive these forecasts. Since the industry still lacks the computational capacity to fully simulate these distributions, insurers have to resort to suitable approximation techniques such as the least-squares Monte Carlo (LSMC) method. The key idea of LSMC is to run only a few carefully selected simulations and to process their output further to obtain a risk-dependent proxy function of the loss.

We dedicate the first part of this thesis to establishing a theoretical framework for the LSMC method. We start by showing how LSMC for calculating capital requirements relates to its original use in American option pricing. We then decompose LSMC into four steps. In the first step, the Monte Carlo simulation setting is defined. The second and third steps serve the calibration and validation of the proxy function, and the fourth step yields the loss distribution forecast by evaluating the proxy model. While guiding through the steps, we address practical challenges and propose an adaptive calibration algorithm. We conclude with a slightly disguised real-world application.

The second part builds upon the first by taking up the LSMC framework and examining its calibration step in more depth. After a literature review and a basic recapitulation, various adaptive machine learning approaches relying on least-squares regression and model selection criteria are presented as solutions to the proxy modeling task. The studied approaches range from ordinary and generalized least-squares regression variants through GLM and GAM methods to MARS and kernel regression routines. We justify the combinability of the regression ingredients mathematically and compare their approximation quality in slightly altered real-world experiments. In doing so, we perform sensitivity analyses, discuss numerical stability and run comprehensive out-of-sample tests. The analyzed regression variants are also applicable to other high-dimensional variable selection problems.

Life insurance contracts with early exercise features can also be priced by LSMC because of their analogies to American options. In the third part of this thesis, equity-linked contracts with American-style surrender options and minimum interest rate guarantees payable upon contract termination are valued. We allow randomness and jumps in the movements of the interest rate, stochastic volatility, stock market and mortality. For the simultaneous valuation of numerous insurance contracts, a hybrid probability measure and an additional regression function are introduced. Furthermore, an efficient seed-related simulation procedure accounting for the forward discretization bias and a validation concept are proposed. An extensive numerical example rounds off the last part.
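
The four LSMC steps named in the abstract (simulation setting, calibration, validation, evaluation of the proxy) can be illustrated with a deliberately small toy sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: a single standard normal risk factor, the hypothetical quadratic `true_loss`, the noise level, the quadratic polynomial basis, and plain ordinary least squares solved through the normal equations. A real application would involve many risk factors, a cash-flow projection model, and the adaptive calibration the thesis proposes.

```python
import random

def true_loss(x):
    # Hypothetical toy loss as a function of one risk factor (assumption).
    return 1.0 - 2.0 * x + 0.5 * x * x

def noisy_loss(x, rng, n_inner=2):
    # Step 1: crude Monte Carlo estimate from only a few inner simulations.
    return sum(true_loss(x) + rng.gauss(0.0, 1.0) for _ in range(n_inner)) / n_inner

def basis(x):
    # Quadratic polynomial basis (assumed model choice).
    return [1.0, x, x * x]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_proxy(xs, ys):
    # Step 2: calibrate the proxy by ordinary least squares (normal equations).
    rows = [basis(x) for x in xs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)

def proxy(beta, x):
    return sum(b * phi for b, phi in zip(beta, basis(x)))

def lsmc_scr(n_fit=2000, n_real=20000, seed=0):
    rng = random.Random(seed)
    # Step 1: fitting scenarios spread over the risk-factor range.
    xs = [rng.uniform(-4.0, 4.0) for _ in range(n_fit)]
    ys = [noisy_loss(x, rng) for x in xs]
    beta = fit_proxy(xs, ys)
    # Step 3: validate at a few points; the toy knows the exact loss, whereas
    # in practice these values would come from more accurate simulations.
    max_err = max(abs(proxy(beta, x) - true_loss(x)) for x in (-3.0, 0.0, 3.0))
    # Step 4: evaluate the proxy on real-world scenarios and read off the
    # empirical 99.5% quantile of the loss distribution.
    losses = sorted(proxy(beta, rng.gauss(0.0, 1.0)) for _ in range(n_real))
    var_995 = losses[int(0.995 * n_real)]
    return beta, max_err, var_995

beta, max_err, var_995 = lsmc_scr()
print("proxy coefficients:", beta)
print("max validation error:", max_err)
print("99.5% value-at-risk:", var_995)
```

The point of the sketch is the division of labour: each fitting scenario is valued very roughly, the regression averages the noise out across scenarios, and the expensive simulation model is never called in step 4.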

In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). In particular, the introduction of Deep Learning and end-to-end learning, the availability of large datasets and the necessary computational power in the form of specialised hardware have allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as that provided by the Semantic Web. Autonomous incorporation of new knowledge is, however, one of the preconditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in the form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of more than 63% and an MRR (and MAP) of more than 43%, outperforming all baselines. With a Recall@1 of more than 34%, it even reaches the average human performance at predicting the top response. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
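
For readers unfamiliar with the reported metrics, the following minimal sketch shows how Recall@k and MRR are commonly computed from ranked target predictions. It assumes exactly one ground-truth target per source entity; under that assumption average precision reduces to the reciprocal rank, which may be why MRR and MAP are reported together. The function names and the example entities are hypothetical and are not taken from the thesis or its dataset.

```python
def recall_at_k(ranked_lists, targets, k):
    # Fraction of sources whose true target appears among the top k predictions.
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)

def mean_reciprocal_rank(ranked_lists, targets):
    # Mean of 1/rank of the true target; a target that is never predicted
    # contributes 0.
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        if t in ranked:
            total += 1.0 / (ranked.index(t) + 1)
    return total / len(targets)

# Hypothetical predictions for three source entities (illustrative only).
predictions = [
    ["dbr:Cat", "dbr:Bone", "dbr:Leash"],   # true target ranked first
    ["dbr:Moon", "dbr:Star", "dbr:Day"],    # true target ranked third
    ["dbr:Salt", "dbr:Spice"],              # true target not predicted
]
truth = ["dbr:Cat", "dbr:Day", "dbr:Pepper"]

print(recall_at_k(predictions, truth, 1))
print(recall_at_k(predictions, truth, 3))
print(mean_reciprocal_rank(predictions, truth))
```

Recall@1 is the strictest of these measures, since it only counts a prediction as correct when the true target is ranked first.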