People

Latest Research Publications:
When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.
@article{484, author = {Arthur Venter, Marthinus Theunissen, Marelie Davel}, title = {Pre-interpolation loss behaviour in neural networks}, abstract = {When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.}, year = {2020}, journal = {Communications in Computer and Information Science}, volume = {1342}, pages = {296-309}, publisher = {Southern African Conference for Artificial Intelligence Research}, address = {South Africa}, isbn = {978-3-030-66151-9}, doi = {https://doi.org/10.1007/978-3-030-66151-9_19}, }
The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.
@article{394, author = {Marthinus Theunissen, Marelie Davel, Etienne Barnard}, title = {Benign interpolation of noise in deep learning}, abstract = {The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.}, year = {2020}, journal = {South African Computer Journal}, volume = {32}, pages = {80-101}, issue = {2}, publisher = {South African Institute of Computer Scientists and Information Technologists}, isbn = {ISSN: 1015-7999; E:2313-7835}, doi = {https://doi.org/10.18489/sacj.v32i2.833}, }
A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.
@{236, author = {Marelie Davel, Marthinus Theunissen, Arnold Pretorius, Etienne Barnard}, title = {DNNs as layers of cooperating classifiers}, abstract = {A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.}, year = {2020}, journal = {The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)}, pages = {3725 - 3732}, month = {07/02-12/02/2020}, address = {New York}, }
The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.
@{284, author = {Marthinus Theunissen, Marelie Davel, Etienne Barnard}, title = {Insights regarding overfitting on noise in deep learning}, abstract = {The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {49-63}, address = {Cape Town, South Africa}, }
TALKS:
1) 'Toward a Coherent Account of Moral Agency' (FAIR 2019);
2) 'Functional Moral Agency' (4IR: Philosophical, Ethical and Legal Perspectives 2019).
PUBLICATIONS:
1) Tollon, Fabio. 2020. The Artificial View: toward a non-anthropocentric account of moral patiency. Ethics and Information Technology. https://doi.org/10.1007/s10676-020-09540-4;
2) Tollon, Fabio. 2019. Moral Agents or Mindless Machines? A critical appraisal of agency in artificial systems. Hungarian Philosophical Review 63(4), pp. 9-23;
3) Tollon, Fabio. 2019. Toward a Coherent Account of Moral Agency. Proceedings of the South African Forum for Artificial Intelligence Research. Vol 2540. http://ceur-ws.org/Vol-2540/.
Latest Research Publications:
Artificial Intelligence (AI) systems are ubiquitous. From social media timelines, video recommendations on YouTube, and the kinds of adverts we see online, AI, in a very real sense, filters the world we see. More than that, AI is being embedded in agent-like systems, which might prompt certain reactions from users. Specifically, we might find ourselves feeling frustrated if these systems do not meet our expectations. In normal situations, this might be fine, but with the ever increasing sophistication of AI-systems, this might become a problem. While it seems unproblematic to realize that being angry at your car for breaking down is unfitting, can the same be said for AI-systems? In this paper, therefore, I will investigate the so-called “reactive attitudes”, and their important link to our responsibility practices. I then show how within this framework there exist exemption and excuse conditions, and test whether our adopting the “objective attitude” toward agential AI is justified. I argue that such an attitude is appropriate in the context of three distinct senses of responsibility (answerability, attributability, and accountability), and that, therefore, AI-systems do not undermine our responsibility ascriptions.
@article{487, author = {Fabio Tollon}, title = {Responsibility gaps and the reactive attitudes}, abstract = {Artificial Intelligence (AI) systems are ubiquitous. From social media timelines, video recommendations on YouTube, and the kinds of adverts we see online, AI, in a very real sense, filters the world we see. More than that, AI is being embedded in agent-like systems, which might prompt certain reactions from users. Specifically, we might find ourselves feeling frustrated if these systems do not meet our expectations. In normal situations, this might be fine, but with the ever increasing sophistication of AI-systems, this might become a problem. While it seems unproblematic to realize that being angry at your car for breaking down is unfitting, can the same be said for AI-systems? In this paper, therefore, I will investigate the so-called “reactive attitudes”, and their important link to our responsibility practices. I then show how within this framework there exist exemption and excuse conditions, and test whether our adopting the “objective attitude” toward agential AI is justified. I argue that such an attitude is appropriate in the context of three distinct senses of responsibility (answerability, attributability, and accountability), and that, therefore, AI-systems do not undermine our responsibility ascriptions.}, year = {2022}, journal = {AI and Ethics}, publisher = {Springer}, url = {https://link.springer.com/article/10.1007/s43681-022-00172-6}, doi = {https://doi.org/10.1007/s43681-022-00172-6}, }
Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.
@article{415, author = {Fabio Tollon}, title = {Designed to Seduce: Epistemically Retrograde Ideation and YouTube's Recommender System}, abstract = {Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.}, year = {2021}, journal = {International Journal of Technoethics (IJT)}, volume = {12}, issue = {2}, publisher = {IGI Global}, isbn = {9781799861492}, url = {https://www.igi-global.com/gateway/article/281077}, doi = {10.4018/IJT.2021070105}, }
In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.
@article{386, author = {Fabio Tollon}, title = {Artifacts and affordances: from designed properties to possibilities for action}, abstract = {In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.}, year = {2021}, journal = {AI & SOCIETY Journal of Knowledge, Culture and Communication}, volume = {36}, issue = {1}, publisher = {Springer}, url = {https://link.springer.com/article/10.1007%2Fs00146-021-01155-7}, doi = {https://doi.org/10.1007/s00146-021-01155-7}, }
In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.
@article{387, author = {Fabio Tollon}, title = {The artificial view: toward a non-anthropocentric account of moral patiency}, abstract = {In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.}, year = {2020}, journal = {Ethics and Information Technology}, volume = {22}, issue = {4}, publisher = {Springer}, url = {https://link.springer.com/article/10.1007%2Fs10676-020-09540-4}, doi = {https://doi.org/10.1007/s10676-020-09540-4}, }
Latest Research Publications:
The output size problem, for a string-to-tree transducer, is to determine the asymptotic behavior of the function describing the maximum size of output trees, with respect to the length of input strings. We show that the problem to determine, for a given regular expression, the worst-case matching time of a backtracking regular expression matcher, can be reduced to the output size problem. The latter can, in turn, be solved by determining the degree of ambiguity of a non-deterministic finite automaton.
Keywords: string-to-tree transducers, output size, backtracking regular expression matchers, NFA ambiguity
@article{201, author = {Martin Berglund, F. Drewes, Brink van der Merwe}, title = {The Output Size Problem for String-to-Tree Transducers}, abstract = {The output size problem, for a string-to-tree transducer, is to determine the asymptotic behavior of the function describing the maximum size of output trees, with respect to the length of input strings. We show that the problem to determine, for a given regular expression, the worst-case matching time of a backtracking regular expression matcher, can be reduced to the output size problem. The latter can, in turn, be solved by determining the degree of ambiguity of a non-deterministic finite automaton. Keywords: string-to-tree transducers, output size, backtracking regular expression matchers, NFA ambiguity}, year = {2018}, journal = {Journal of Automata, Languages and Combinatorics}, volume = {23}, pages = {19-38}, issue = {1}, publisher = {Institut für Informatik, Justus-Liebig-Universität Giessen}, address = {Germany}, isbn = {2567-3785}, url = {https://www.jalc.de/issues/2018/issue_23_1-3/jalc-2018-019-038.php}, }
Modern regular expression matching software features many extensions, some general while some are very narrowly specified. Here we consider the generalization of adding a class of operators which can be described by, e.g. finite-state transducers. Combined with backreferences they enable new classes of languages to be matched. The addition of finite-state transducers is shown to make membership testing undecidable. Following this result, we study the complexity of membership testing for various restricted cases of the model.
@{199, author = {Martin Berglund, F. Drewes, Brink van der Merwe}, title = {On Regular Expressions with Backreferences and Transducers}, abstract = {Modern regular expression matching software features many extensions, some general while some are very narrowly specified. Here we consider the generalization of adding a class of operators which can be described by, e.g. finite-state transducers. Combined with backreferences they enable new classes of languages to be matched. The addition of finite-state transducers is shown to make membership testing undecidable. Following this result, we study the complexity of membership testing for various restricted cases of the model.}, year = {2018}, journal = {10th Workshop on Non-Classical Models of Automata and Applications (NCMA 2018)}, pages = {1-19}, month = {21/08-22/08}, }
Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the POSIX standard are prescribed leftmost-longest semantics. However, the POSIX standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various POSIX matchers. The Boost library has an interesting take on the POSIX standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.
@{196, author = {Brink van der Merwe, Martin Berglund, Willem Bester}, title = {Formalising Boost POSIX Regular Expression Matching}, abstract = {Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the POSIX standard are prescribed leftmost-longest semantics. However, the POSIX standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various POSIX matchers. The Boost library has an interesting take on the POSIX standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.}, year = {2018}, journal = {International Colloquium on Theoretical Aspects of Computing}, pages = {99-115}, month = {17/02}, publisher = {Springer}, isbn = {978-3-030-02508-3}, url = {https://link.springer.com/chapter/10.1007/978-3-030-02508-3_6}, }
No Abstract
@{178, author = {Brink van der Merwe, N. Weideman, Martin Berglund}, title = {Turning evil regexes harmless}, abstract = {No Abstract}, year = {2017}, journal = {Conference of South African Institute of Computer Scientists and Information Technologists (SAICSIT'17)}, month = {26/09-28/09}, publisher = {ACM}, url = {https://dl.acm.org/citation.cfm?id=3129416}, }
Most modern regular expression matching libraries (one of the rare exceptions being Google’s RE2) allow backreferences, operations which bind a substring to a variable allowing it to be matched again verbatim. However, different implementations not only vary in the syntax permitted when using backreferences, but both implementations and definitions in the literature offer up a number of different variants on how backreferences match. Our aim is to compare the various flavors by considering the formal languages that each can describe, resulting in the establishment of a hierarchy of language classes. Beyond the hierarchy itself, some complexity results are given, and as part of the effort on comparing language classes new pumping lemmas are established, and old ones extended to new classes.
@{176, author = {Martin Berglund, Brink van der Merwe}, title = {Regular Expressions with Backreferences Re-examined}, abstract = {Most modern regular expression matching libraries (one of the rare exceptions being Google’s RE2) allow backreferences, operations which bind a substring to a variable allowing it to be matched again verbatim. However, different implementations not only vary in the syntax permitted when using backreferences, but both implementations and definitions in the literature offer up a number of different variants on how backreferences match. Our aim is to compare the various flavors by considering the formal languages that each can describe, resulting in the establishment of a hierarchy of language classes. Beyond the hierarchy itself, some complexity results are given, and as part of the effort on comparing language classes new pumping lemmas are established, and old ones extended to new classes.}, year = {2017}, journal = {The Prague Stringology Conference (PSC 2017)}, pages = {30-41}, month = {28/08-30/08}, address = {Czech Technical University in Prague,}, isbn = {ISBN 978-80-01-06193-0}, }

Ivan’s main research interest area is logic-based knowledge representation and reasoning in artificial intelligence, with focus on modal and description logics and their applications in non-monotonic reasoning, reasoning about actions and change, and the semantic web.
Latest Research Publications:
We extend the expressivity of classical conditional reasoning by introducing context as a new parameter. The enriched conditional logic generalises the defeasible conditional setting in the style of Kraus, Lehmann, and Magidor, and allows for a refined semantics that is able to distinguish, for example, between expectations and counterfactuals. In this paper we introduce the language for the enriched logic and define an appropriate semantic framework for it. We analyse which properties generally associated with conditional reasoning are still satisfied by the new semantic framework, provide a suitable representation result, and define an entailment relation based on Lehmann and Magidor’s generally-accepted notion of Rational Closure.
@{430, author = {Giovanni Casini, Tommie Meyer, Ivan Varzinczak}, title = {Contextual Conditional Reasoning}, abstract = {We extend the expressivity of classical conditional reasoning by introducing context as a new parameter. The enriched conditional logic generalises the defeasible conditional setting in the style of Kraus, Lehmann, and Magidor, and allows for a refined semantics that is able to distinguish, for example, between expectations and counterfactuals. In this paper we introduce the language for the enriched logic and define an appropriate semantic framework for it. We analyse which properties generally associated with conditional reasoning are still satisfied by the new semantic framework, provide a suitable representation result, and define an entailment relation based on Lehmann and Magidor’s generally-accepted notion of Rational Closure.}, year = {2021}, journal = {35th AAAI Conference on Artificial Intelligence}, pages = {6254-6261}, month = {02/02/2021-09/02/2021}, publisher = {AAAI Press}, address = {Online}, }
The past 25 years have seen many attempts to introduce defeasible-reasoning capabilities into a description logic setting. Many, if not most, of these attempts are based on preferential extensions of description logics, with a significant number of these, in turn, following the so-called KLM approach to defeasible reasoning initially advocated for propositional logic by Kraus, Lehmann, and Magidor. Each of these attempts has its own aim of investigating particular constructions and variants of the (KLM-style) preferential approach. Here our aim is to provide a comprehensive study of the formal foundations of preferential defeasible reasoning for description logics in the KLM tradition. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann, and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and we investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. Indeed, we also analyse the problem of non-monotonic reasoning in description logics at the level of entailment and present an algorithm for the computation of rational closure of a defeasible knowledge base. Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible knowledge bases is no worse than that of reasoning in the underlying classical DL ALC.
@article{433, author = {Katarina Britz, Giovanni Casini, Tommie Meyer, Kody Moodley, Uli Sattler, Ivan Varzinczak}, title = {Principles of KLM-style Defeasible Description Logics}, abstract = {The past 25 years have seen many attempts to introduce defeasible-reasoning capabilities into a description logic setting. Many, if not most, of these attempts are based on preferential extensions of description logics, with a significant number of these, in turn, following the so-called KLM approach to defeasible reasoning initially advocated for propositional logic by Kraus, Lehmann, and Magidor. Each of these attempts has its own aim of investigating particular constructions and variants of the (KLM-style) preferential approach. Here our aim is to provide a comprehensive study of the formal foundations of preferential defeasible reasoning for description logics in the KLM tradition. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann, and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and we investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. Indeed, we also analyse the problem of non-monotonic reasoning in description logics at the level of entailment and present an algorithm for the computation of rational closure of a defeasible knowledge base. 
Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible knowledge bases is no worse than that of reasoning in the underlying classical DL ALC.}, year = {2020}, journal = {Transactions on Computational Logic}, volume = {22 (1)}, pages = {1-46}, publisher = {ACM}, url = {https://dl-acm-org.ezproxy.uct.ac.za/doi/abs/10.1145/3420258}, doi = {10.1145/3420258}, }
We present a formal framework for modelling belief change within a non-monotonic reasoning system. Belief change and non-monotonic reasoning are two areas that are formally closely related, with recent attention being paid towards the analysis of belief change within a non-monotonic environment. In this paper we consider the classical AGM belief change operators, contraction and revision, applied to a defeasible setting in the style of Kraus, Lehmann, and Magidor. The investigation leads us to the formal characterisation of a number of classes of defeasible belief change operators. For the most interesting classes we need to consider the problem of iterated belief change, generalising the classical work of Darwiche and Pearl in the process. Our work involves belief change operators aimed at ensuring logical consistency, as well as the characterisation of analogous operators aimed at obtaining coherence, an important notion within the field of logic-based ontologies.
@{382, author = {Giovanni Casini, Tommie Meyer, Ivan Varzinczak}, title = {Rational Defeasible Belief Change}, abstract = {We present a formal framework for modelling belief change within a non-monotonic reasoning system. Belief change and non-monotonic reasoning are two areas that are formally closely related, with recent attention being paid towards the analysis of belief change within a non-monotonic environment. In this paper we consider the classical AGM belief change operators, contraction and revision, applied to a defeasible setting in the style of Kraus, Lehmann, and Magidor. The investigation leads us to the formal characterisation of a number of classes of defeasible belief change operators. For the most interesting classes we need to consider the problem of iterated belief change, generalising the classical work of Darwiche and Pearl in the process. Our work involves belief change operators aimed at ensuring logical consistency, as well as the characterisation of analogous operators aimed at obtaining coherence, an important notion within the field of logic-based ontologies.}, year = {2020}, journal = {17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020)}, pages = {213-222}, month = {12/09/2020}, publisher = {IJCAI}, address = {Virtual}, url = {https://library.confdna.com/kr/2020/}, doi = {10.24963/kr.2020/22}, }
In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.
@{247, author = {Katarina Britz, Ivan Varzinczak}, title = {Preferential tableaux for contextual defeasible ALC}, abstract = {In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.}, year = {2019}, journal = {28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX)}, pages = {39-57}, month = {03/09-05/09}, publisher = {Springer LNAI no. 11714}, isbn = {ISBN 978-3-030-29026-9}, url = {https://www.springer.com/gp/book/9783030290252}, }
Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.
@article{246, author = {Katarina Britz, Ivan Varzinczak}, title = {Contextual rational closure for defeasible ALC}, abstract = {Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.}, year = {2019}, journal = {Annals of Mathematics and Artificial Intelligence}, volume = {87}, pages = {83-108}, issue = {1-2}, isbn = {ISSN: 1012-2443}, url = {https://link.springer.com/article/10.1007/s10472-019-09658-2}, doi = {10.1007/s10472-019-09658-2}, }

Latest Research Publications:
When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.
@article{484, author = {Arthur Venter, Marthinus Theunissen, Marelie Davel}, title = {Pre-interpolation loss behaviour in neural networks}, abstract = {When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.}, year = {2020}, journal = {Communications in Computer and Information Science}, volume = {1342}, pages = {296-309}, publisher = {Southern African Conference for Artificial Intelligence Research}, address = {South Africa}, isbn = {978-3-030-66151-9}, doi = {https://doi.org/10.1007/978-3-030-66151-9_19}, }

Latest Research Publications:
We explore how machine learning (ML) and Bayesian networks (BNs) can be combined in a personal health agent (PHA) for the detection and interpretation of electrocardiogram (ECG) characteristics. We propose a PHA that uses ECG data from wearables to monitor heart activity, and interprets and explains the observed readings. We focus on atrial fibrillation (AF), the commonest type of arrhythmia. The absence of a P-wave in an ECG is the hallmark indication of AF. Four ML models are trained to classify an ECG signal based on the presence or absence of the P-wave: multilayer perceptron (MLP), logistic regression, support vector machine, and random forest. The MLP is the best performing model with an accuracy of 89.61% and an F1 score of 88.68%. A BN representing AF risk factors is developed based on expert knowledge from the literature and evaluated using Pitchforth and Mengersen’s validation framework. The P-wave presence or absence as determined by the ML model is input into the BN. The PHA is evaluated using sample use cases to illustrate how the BN can explain the occurrence of AF using diagnostic reasoning. This gives the most likely AF risk factors for the individual.
@inbook{478, author = {Tezira Wanyana, Mbithe Nzomo, C. Sue Price, Deshen Moodley}, title = {Combining Machine Learning and Bayesian Networks for ECG Interpretation and Explanation}, abstract = {We explore how machine learning (ML) and Bayesian networks (BNs) can be combined in a personal health agent (PHA) for the detection and interpretation of electrocardiogram (ECG) characteristics. We propose a PHA that uses ECG data from wearables to monitor heart activity, and interprets and explains the observed readings. We focus on atrial fibrillation (AF), the commonest type of arrhythmia. The absence of a P-wave in an ECG is the hallmark indication of AF. Four ML models are trained to classify an ECG signal based on the presence or absence of the P-wave: multilayer perceptron (MLP), logistic regression, support vector machine, and random forest. The MLP is the best performing model with an accuracy of 89.61% and an F1 score of 88.68%. A BN representing AF risk factors is developed based on expert knowledge from the literature and evaluated using Pitchforth and Mengersen’s validation framework. The P-wave presence or absence as determined by the ML model is input into the BN. The PHA is evaluated using sample use cases to illustrate how the BN can explain the occurrence of AF using diagnostic reasoning. This gives the most likely AF risk factors for the individual.}, year = {2022}, journal = {Proceedings of the 8th International Conference on Information and Communication Technologies for Ageing Well and e-Health - ICT4AWE}, pages = {81-92}, publisher = {SciTePress}, address = {INSTICC}, isbn = {978-989-758-566-1}, doi = {https://doi.org/10.5220/0011046100003188}, }
The abductive theory of method (ATOM) was recently proposed to describe the process that scientists use for knowledge discovery. In this paper we propose an agent architecture for knowledge discovery and evolution (KDE) based on ATOM. The agent incorporates a combination of ontologies, rules and Bayesian networks for representing different aspects of its internal knowledge. The agent uses an external AI service to detect unexpected situations from incoming observations. It then uses rules to analyse the current situation and a Bayesian network for finding plausible explanations for unexpected situations. The architecture is evaluated and analysed on a use case application for monitoring daily household electricity consumption patterns.
@inbook{425, author = {Tezira Wanyana, Deshen Moodley}, title = {An Agent Architecture for Knowledge Discovery and Evolution}, abstract = {The abductive theory of method (ATOM) was recently proposed to describe the process that scientists use for knowledge discovery. In this paper we propose an agent architecture for knowledge discovery and evolution (KDE) based on ATOM. The agent incorporates a combination of ontologies, rules and Bayesian networks for representing different aspects of its internal knowledge. The agent uses an external AI service to detect unexpected situations from incoming observations. It then uses rules to analyse the current situation and a Bayesian network for finding plausible explanations for unexpected situations. The architecture is evaluated and analysed on a use case application for monitoring daily household electricity consumption patterns.}, year = {2021}, journal = {KI 2021: Advances in Artificial Intelligence}, edition = {volume 12873}, pages = {241-256}, publisher = {Springer International Publishing}, address = {Cham}, isbn = {978-3-030-87626-5}, doi = {https://doi.org/10.1007/978-3-030-87626-5_18}, }
Knowledge Discovery and Evolution (KDE) is of interest to a broad array of researchers from both Philosophy of Science (PoS) and Artificial Intelligence (AI), in particular, Knowledge Representation and Reasoning (KR), Machine Learning and Data Mining (ML-DM) and the Agent Based Systems (ABS) communities. In PoS, Haig recently proposed a so-called broad theory of scientific method that uses abduction for generating theories to explain phenomena. He refers to this method of scientific inquiry as the Abductive Theory of Method (ATOM). In this paper, we analyse ATOM, align it with KR and ML-DM perspectives and propose an algorithm and an ontology for supporting agent based knowledge discovery and evolution based on ATOM. We illustrate the use of the algorithm and the ontology on a use case application for electricity consumption behaviour in residential households.
@{405, author = {Tezira Wanyana, Deshen Moodley, Tommie Meyer}, title = {An Ontology for Supporting Knowledge Discovery and Evolution}, abstract = {Knowledge Discovery and Evolution (KDE) is of interest to a broad array of researchers from both Philosophy of Science (PoS) and Artificial Intelligence (AI), in particular, Knowledge Representation and Reasoning (KR), Machine Learning and Data Mining (ML-DM) and the Agent Based Systems (ABS) communities. In PoS, Haig recently proposed a so-called broad theory of scientific method that uses abduction for generating theories to explain phenomena. He refers to this method of scientific inquiry as the Abductive Theory of Method (ATOM). In this paper, we analyse ATOM, align it with KR and ML-DM perspectives and propose an algorithm and an ontology for supporting agent based knowledge discovery and evolution based on ATOM. We illustrate the use of the algorithm and the ontology on a use case application for electricity consumption behaviour in residential households.}, year = {2020}, journal = {First Southern African Conference for Artificial Intelligence Research}, pages = {206-221}, month = {22/02/2021}, publisher = {SACAIR2020}, address = {Virtual}, isbn = {978-0-620-89373-2}, url = {https://2020.sacair.org.za/wp-content/uploads/2021/02/SACAIR_Proceedings-MainBook_Finv4_compressed.pdf?_ga=2.116601743.849395099.1621802506-572599210.1621419278}, }
2019-Current PhD (Humanities): 'A Critical Inquiry into the Metaphysics for Mind Uploading'.
Latest Research Publications: