Research Publications
2019
Information Systems (IS) as a discipline is still young and is continuously involved in building its own research knowledge base. Design Science Research (DSR) in IS is a research strategy for design that has emerged in the last 16 years. IS researchers, especially young researchers, are often lost when they start a DSR project. We identified a need for a set of guidelines with supporting reference literature that can assist such novice adopters of DSR. We identified major themes relevant to DSR and proposed a set of six guidelines for the novice researcher, supported with reference summaries of seminal works from the IS DSR literature. We believe that someone new to the field can use these guidelines to prepare themselves to embark on a DSR study.
@inproceedings{261, author = {Alta van der Merwe and Aurona Gerber and Hanlie Smuts}, title = {Guidelines for Conducting Design Science Research in Information Systems}, abstract = {Information Systems (IS) as a discipline is still young and is continuously involved in building its own research knowledge base. Design Science Research (DSR) in IS is a research strategy for design that has emerged in the last 16 years. IS researchers, especially young researchers, are often lost when they start a DSR project. We identified a need for a set of guidelines with supporting reference literature that can assist such novice adopters of DSR. We identified major themes relevant to DSR and proposed a set of six guidelines for the novice researcher, supported with reference summaries of seminal works from the IS DSR literature. We believe that someone new to the field can use these guidelines to prepare themselves to embark on a DSR study.}, year = {2019}, journal = {SACLA}, month = {15/07 - 17/07}, publisher = {Springer}, isbn = {978-3-030-35628-6}, doi = {10.1007/978-3-030-35629-3_11}, }
Digital disruption is the phenomenon whereby established businesses succumb to new business models that exploit emerging technologies. Futurists often make dire predictions when discussing the impact of digital disruption, for instance that 40% of the Fortune 500 companies will disappear within the next decade. The digital disruption phenomenon was already studied two decades ago, when Clayton Christensen developed his Theory of Disruptive Innovation, a popular theory for describing and explaining disruption due to technology developments that had occurred in the past. However, it is still problematic to understand what is necessary to avoid disruption, especially within the context of a sustainable society in the 21st century. A key aspect we identified is the behavior of non-mainstream customers of an emerging technology, which is difficult to predict, especially when an organization is operating in an existing solution space. In this position paper we propose complementing the Theory of Disruptive Innovation with design thinking in order to identify the performance attributes that encourage the unpredictable and unforeseen customer behavior that is a cause for disruption. We employ case-based scenario analysis of higher education as an evaluation mechanism for our extended disruptive innovation theory. Our position is that a better understanding of the implicit and unpredictable customer behavior that causes disruption due to additional performance attributes (using design thinking) could assist organizations to pre-empt digital disruption and adapt to support the additional functionality.
@inproceedings{259, author = {Aurona Gerber and Machdel Matthee}, title = {Design Thinking for Pre-empting Digital Disruption}, abstract = {Digital disruption is the phenomenon whereby established businesses succumb to new business models that exploit emerging technologies. Futurists often make dire predictions when discussing the impact of digital disruption, for instance that 40% of the Fortune 500 companies will disappear within the next decade. The digital disruption phenomenon was already studied two decades ago, when Clayton Christensen developed his Theory of Disruptive Innovation, a popular theory for describing and explaining disruption due to technology developments that had occurred in the past. However, it is still problematic to understand what is necessary to avoid disruption, especially within the context of a sustainable society in the 21st century. A key aspect we identified is the behavior of non-mainstream customers of an emerging technology, which is difficult to predict, especially when an organization is operating in an existing solution space. In this position paper we propose complementing the Theory of Disruptive Innovation with design thinking in order to identify the performance attributes that encourage the unpredictable and unforeseen customer behavior that is a cause for disruption. We employ case-based scenario analysis of higher education as an evaluation mechanism for our extended disruptive innovation theory. Our position is that a better understanding of the implicit and unpredictable customer behavior that causes disruption due to additional performance attributes (using design thinking) could assist organizations to pre-empt digital disruption and adapt to support the additional functionality.}, year = {2019}, journal = {Conference on e-Business, e-Services and e-Society}, pages = {759 - 770}, month = {18/09 - 20/09}, publisher = {Springer}, isbn = {978-3-030-29373-4}, doi = {10.1007/978-3-030-29374-1_62}, }
Advanced modeling is a challenging endeavor and good tool support is of paramount importance to ensure that the modeling objectives are met through the efficient execution of tasks. Tools for advanced modeling should not just support basic task modeling functionality such as easy-to-use interfaces for model creation, but also advanced task functionality such as consistency checks and analysis queries. Enterprise Architecture (EA) is concerned with the alignment of all aspects of an organization. Modeling plays a crucial role in EA and the matching of the correct tool to enable task execution is vital for enterprises engaged with EA. Enterprise Architecture Management (EAM) reflects recent trends that elevate EA toward a strategic management function within organizations. Tool support for EAM would necessarily include the execution of additional and often implicit advanced modeling tasks that support EAM capabilities. In this paper we report on a study that used the Task-Technology Fit (TTF) theory to investigate the extent to which basic and advanced task execution for EAM is supported by technology. We found that four of the six TTF factors fully supported and one partially supported EAM task execution. One factor was inconclusive. This study provided insight into investigating tool support for EAM-related task execution to achieve strategic EAM goals.
@inbook{258, author = {Sunet Eybers and Aurona Gerber and Dominik Bork and Dimitris Karagiannis}, title = {Matching Technology with Enterprise Architecture and Enterprise Architecture Management Tasks Using Task Technology Fit}, abstract = {Advanced modeling is a challenging endeavor and good tool support is of paramount importance to ensure that the modeling objectives are met through the efficient execution of tasks. Tools for advanced modeling should not just support basic task modeling functionality such as easy-to-use interfaces for model creation, but also advanced task functionality such as consistency checks and analysis queries. Enterprise Architecture (EA) is concerned with the alignment of all aspects of an organization. Modeling plays a crucial role in EA and the matching of the correct tool to enable task execution is vital for enterprises engaged with EA. Enterprise Architecture Management (EAM) reflects recent trends that elevate EA toward a strategic management function within organizations. Tool support for EAM would necessarily include the execution of additional and often implicit advanced modeling tasks that support EAM capabilities. In this paper we report on a study that used the Task-Technology Fit (TTF) theory to investigate the extent to which basic and advanced task execution for EAM is supported by technology. We found that four of the six TTF factors fully supported and one partially supported EAM task execution. One factor was inconclusive. This study provided insight into investigating tool support for EAM-related task execution to achieve strategic EAM goals.}, year = {2019}, journal = {Lecture Notes in Business Information Processing}, pages = {245 - 260}, publisher = {Springer}, isbn = {978-3-030-20617-8}, doi = {10.1007/978-3-030-20618-5_17}, }
Visual languages make use of spatial arrangements of graphical and textual elements to represent information. Domain specific diagrams, including flowcharts and music sheets, are examples of visual languages. An established area of research is the study of languages which can be used to create declarative specifications of visual languages. In this paper, the result of a review of research on visual language specification languages is presented. Specifically, a structured literature review is conducted to establish research themes by analysing what has been studied in the context of specification languages. The result of the literature review is used to develop a conceptual framework that consists of six research themes with related topics. Additionally, discussions on how the conceptual framework can be used as a basis to guide research in the field of specification languages, to perform feature based characterisations and to create lists of criteria to evaluate and compare specification languages are included in this paper.
@inproceedings{255, author = {Anitta Thomas and Aurona Gerber and Alta van der Merwe}, title = {A Conceptual Framework of Research on Visual Language Specification Languages}, abstract = {Visual languages make use of spatial arrangements of graphical and textual elements to represent information. Domain specific diagrams, including flowcharts and music sheets, are examples of visual languages. An established area of research is the study of languages which can be used to create declarative specifications of visual languages. In this paper, the result of a review of research on visual language specification languages is presented. Specifically, a structured literature review is conducted to establish research themes by analysing what has been studied in the context of specification languages. The result of the literature review is used to develop a conceptual framework that consists of six research themes with related topics. Additionally, discussions on how the conceptual framework can be used as a basis to guide research in the field of specification languages, to perform feature based characterisations and to create lists of criteria to evaluate and compare specification languages are included in this paper.}, year = {2019}, journal = {International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD)}, month = {05/09 - 06/09}, publisher = {IEEE}, address = {Winterton, South Africa}, isbn = {978-1-5386-9236-3}, url = {https://ieeexplore.ieee.org/document/8851003}, doi = {10.1109/ICABCD.2019.8851003}, }
Many posterior distributions take intractable forms and thus require variational inference where analytical solutions cannot be found. Variational Inference (VI) and Markov Chain Monte Carlo (MCMC) are established mechanisms to approximate these intractable values. An alternative approach to sampling and optimisation for approximation is a direct mapping between the data and the posterior distribution. This is made possible by recent advances in deep learning methods. Latent Dirichlet Allocation (LDA) is a model which offers an intractable posterior of this nature. In LDA, latent topics are learnt over unlabelled documents to soft cluster the documents. This paper assesses the viability of learning latent topics leveraging an autoencoder (in the form of Autoencoding Variational Bayes, AEVB) and compares the mimicked posterior distributions to those achieved by VI. After conducting various experiments, the proposed AEVB delivers inadequate performance. Under utopian conditions, which are generally unattainable, comparable conclusions are achieved. Further, model specification becomes increasingly complex and deeply circumstantially dependent; this is in itself not a deterrent but does warrant consideration. In a recent study, these concerns were highlighted and discussed theoretically. We confirm the argument empirically by dissecting the autoencoder’s iterative process. In investigating the autoencoder, we see performance degrade as models grow in dimensionality. Visualization of the autoencoder reveals a bias towards the initial randomised topics.
@inproceedings{254, author = {Zach Wolpe and Alta de Waal}, title = {Autoencoding variational Bayes for latent Dirichlet allocation}, abstract = {Many posterior distributions take intractable forms and thus require variational inference where analytical solutions cannot be found. Variational Inference (VI) and Markov Chain Monte Carlo (MCMC) are established mechanisms to approximate these intractable values. An alternative approach to sampling and optimisation for approximation is a direct mapping between the data and the posterior distribution. This is made possible by recent advances in deep learning methods. Latent Dirichlet Allocation (LDA) is a model which offers an intractable posterior of this nature. In LDA, latent topics are learnt over unlabelled documents to soft cluster the documents. This paper assesses the viability of learning latent topics leveraging an autoencoder (in the form of Autoencoding Variational Bayes, AEVB) and compares the mimicked posterior distributions to those achieved by VI. After conducting various experiments, the proposed AEVB delivers inadequate performance. Under utopian conditions, which are generally unattainable, comparable conclusions are achieved. Further, model specification becomes increasingly complex and deeply circumstantially dependent; this is in itself not a deterrent but does warrant consideration. In a recent study, these concerns were highlighted and discussed theoretically. We confirm the argument empirically by dissecting the autoencoder’s iterative process. In investigating the autoencoder, we see performance degrade as models grow in dimensionality. Visualization of the autoencoder reveals a bias towards the initial randomised topics.}, year = {2019}, journal = {Proceedings of the South African Forum for Artificial Intelligence Research}, pages = {25-36}, month = {12/09}, publisher = {CEUR Workshop Proceedings}, issn = {1613-0073}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_33.pdf}, }
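To ground the method being evaluated, here is a minimal sketch of autoencoding variational Bayes for topic modelling, assuming PyTorch; the architecture, hyperparameters and the Gaussian prior (a common simplification of the Dirichlet, as in ProdLDA-style models) are illustrative choices, not the authors' exact specification.

```python
# Minimal AEVB topic-model sketch (ProdLDA-style); NOT the paper's exact model.
# `bow` is a (batch x vocab_size) bag-of-words tensor.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AEVBTopicModel(nn.Module):
    def __init__(self, vocab_size, num_topics, hidden=100):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(vocab_size, hidden), nn.Softplus())
        self.mu = nn.Linear(hidden, num_topics)        # variational mean
        self.logvar = nn.Linear(hidden, num_topics)    # variational log-variance
        self.beta = nn.Linear(num_topics, vocab_size)  # topic-word decoder

    def forward(self, bow):
        h = self.enc(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample latent topic logits.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        theta = F.softmax(z, dim=-1)                   # document-topic proportions
        log_probs = F.log_softmax(self.beta(theta), dim=-1)
        recon = -(bow * log_probs).sum(-1).mean()      # multinomial reconstruction
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon + kl                              # negative ELBO to minimise
```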
Environmental information is acquired and assessed during the environmental impact assessment process for surface-strip coal mine approval. However, integrating these data and quantifying rehabilitation risk using a holistic multidisciplinary approach is seldom undertaken. We present a rehabilitation risk assessment integrated network (R2AIN™) framework that can be applied using Bayesian networks (BNs) to integrate and quantify such rehabilitation risks. Our framework has 7 steps, ranging from the integration of rehabilitation risk sources and the quantification of undesired rehabilitation risk events to the final application of mitigation. We demonstrate the framework using a soil compaction BN case study in the Witbank Coalfield, South Africa and the Bowen Basin, Australia. Our approach allows a probabilistic assessment of rehabilitation risks associated with multiple disciplines to be integrated and quantified. Using this method, a site's rehabilitation risk profile can be determined before mining activities commence, and the effects of manipulating management actions during later mine phases to reduce risk can be gauged, to aid decision making.
@article{253, author = {Vanessa Weyer and Alta de Waal and Alex Lechner and Corinne Unger and Tim O'Connor and Thomas Baumgartl and Roland Schulze and Wayne Truter}, title = {Quantifying rehabilitation risks for surface-strip coal mines using a soil compaction Bayesian network in South Africa and Australia: To demonstrate the R2AIN Framework}, abstract = {Environmental information is acquired and assessed during the environmental impact assessment process for surface-strip coal mine approval. However, integrating these data and quantifying rehabilitation risk using a holistic multidisciplinary approach is seldom undertaken. We present a rehabilitation risk assessment integrated network (R2AIN™) framework that can be applied using Bayesian networks (BNs) to integrate and quantify such rehabilitation risks. Our framework has 7 steps, ranging from the integration of rehabilitation risk sources and the quantification of undesired rehabilitation risk events to the final application of mitigation. We demonstrate the framework using a soil compaction BN case study in the Witbank Coalfield, South Africa and the Bowen Basin, Australia. Our approach allows a probabilistic assessment of rehabilitation risks associated with multiple disciplines to be integrated and quantified. Using this method, a site's rehabilitation risk profile can be determined before mining activities commence, and the effects of manipulating management actions during later mine phases to reduce risk can be gauged, to aid decision making.}, year = {2019}, journal = {Integrated Environmental Assessment and Management}, volume = {15}, pages = {190-208}, issue = {2}, publisher = {Wiley Online}, doi = {10.1002/ieam.4128}, }
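As an illustration of the kind of discrete Bayesian network the framework builds on, a toy soil-compaction BN using pgmpy; the variables, states and probabilities here are invented for illustration and are not the R2AIN model.

```python
# Toy discrete BN in the spirit of the soil-compaction case study.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("SoilMoisture", "Compaction"), ("Traffic", "Compaction")])
model.add_cpds(
    TabularCPD("SoilMoisture", 2, [[0.7], [0.3]]),   # states: 0=dry, 1=wet
    TabularCPD("Traffic", 2, [[0.6], [0.4]]),        # states: 0=low, 1=high
    TabularCPD("Compaction", 2,                      # states: 0=low, 1=high risk
               [[0.9, 0.6, 0.5, 0.1],
                [0.1, 0.4, 0.5, 0.9]],
               evidence=["SoilMoisture", "Traffic"], evidence_card=[2, 2]),
)
# Query the compaction risk given that machinery traffic is high.
print(VariableElimination(model).query(["Compaction"], evidence={"Traffic": 1}))
```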
This work compares techniques for clustering metered residential energy consumption data to construct representative daily load profiles in South Africa. The input data captures a population with high variability across temporal, geographic, social and economic dimensions. Different algorithms, normalisation and pre-binning techniques are evaluated to determine their effect on producing a good clustering structure. A Combined Index is developed as a relative score to ease the comparison of experiments across different metrics. The study shows that normalisation, specifically unit norm and the zero-one scaler, produce the best clusters. Pre-binning appears to improve clustering structures as a whole, but its effect on individual experiments remains unclear. Like several previous studies, the k-means algorithm produces the best results. To our knowledge this is the first work that rigorously compares state-of-the-art cluster analysis techniques in the residential energy domain in a developing country context.
@inproceedings{249, author = {Wiebke Toussaint and Deshen Moodley}, title = {Comparison of clustering techniques for residential load profiles in South Africa}, abstract = {This work compares techniques for clustering metered residential energy consumption data to construct representative daily load profiles in South Africa. The input data captures a population with high variability across temporal, geographic, social and economic dimensions. Different algorithms, normalisation and pre-binning techniques are evaluated to determine their effect on producing a good clustering structure. A Combined Index is developed as a relative score to ease the comparison of experiments across different metrics. The study shows that normalisation, specifically unit norm and the zero-one scaler, produce the best clusters. Pre-binning appears to improve clustering structures as a whole, but its effect on individual experiments remains unclear. Like several previous studies, the k-means algorithm produces the best results. To our knowledge this is the first work that rigorously compares state-of-the-art cluster analysis techniques in the residential energy domain in a developing country context.}, year = {2019}, journal = {Forum for Artificial Intelligence Research}, pages = {117-132}, month = {03/12 - 06/12}, publisher = {CEUR}, issn = {1613-0073}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_55.pdf}, }
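A minimal sketch of the kind of comparison loop described above, assuming scikit-learn; the two metrics shown stand in for the paper's Combined Index, and the random array is a placeholder for real metered data.

```python
# Compare normalisation variants for k-means on daily load profiles.
# `profiles` is assumed to be an (n_households x 24) array of hourly kWh values.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler, normalize
from sklearn.metrics import davies_bouldin_score, silhouette_score

profiles = np.random.rand(1000, 24)  # placeholder for real metered data

variants = {
    "unit_norm": normalize(profiles),                 # scale each profile to unit norm
    "zero_one": MinMaxScaler().fit_transform(profiles),
    "raw": profiles,
}
for name, X in variants.items():
    labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
    # Lower Davies-Bouldin and higher silhouette indicate better structure.
    print(name, davies_bouldin_score(X, labels), silhouette_score(X, labels))
```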
In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.
@inproceedings{247, author = {Katarina Britz and Ivan Varzinczak}, title = {Preferential tableaux for contextual defeasible ALC}, abstract = {In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.}, year = {2019}, journal = {28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX)}, pages = {39-57}, month = {03/09-05/09}, publisher = {Springer LNAI no. 11714}, isbn = {978-3-030-29026-9}, url = {https://www.springer.com/gp/book/9783030290252}, }
Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.
@article{246, author = {Katarina Britz and Ivan Varzinczak}, title = {Contextual rational closure for defeasible ALC}, abstract = {Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.}, year = {2019}, journal = {Annals of Mathematics and Artificial Intelligence}, volume = {87}, pages = {83-108}, issue = {1-2}, issn = {1012-2443}, url = {https://link.springer.com/article/10.1007/s10472-019-09658-2}, doi = {10.1007/s10472-019-09658-2}, }
A dynamic Bayesian decision network was developed to model the preharvest burning decision-making processes of sugarcane growers in a KwaZulu-Natal sugarcane supply chain; it extends previous work by Price et al. (2018). This model was created using an iterative development approach. This paper recounts the development and validation process of the third version of the model. The model was validated using the framework of Pitchforth and Mengersen (2013) for validating expert-elicited Bayesian networks. During this process, growers and cane supply members assessed the model in a focus group by executing the model and reviewing the results of a prerun scenario. The participants were generally positive about how the model represented their decision-making processes. However, they identified some issues that could be addressed in the next iteration. Dynamic Bayesian decision networks offer a promising approach to modelling adaptive decisions in uncertain conditions. This model can be used to simulate the cognitive mechanism for a grower agent in a simulation of a sugarcane supply chain.
@inproceedings{244, author = {C. Sue Price and Deshen Moodley and Anban Pillay}, title = {Modelling uncertain adaptive decisions: Application to KwaZulu-Natal sugarcane growers}, abstract = {A dynamic Bayesian decision network was developed to model the preharvest burning decision-making processes of sugarcane growers in a KwaZulu-Natal sugarcane supply chain; it extends previous work by Price et al. (2018). This model was created using an iterative development approach. This paper recounts the development and validation process of the third version of the model. The model was validated using the framework of Pitchforth and Mengersen (2013) for validating expert-elicited Bayesian networks. During this process, growers and cane supply members assessed the model in a focus group by executing the model and reviewing the results of a prerun scenario. The participants were generally positive about how the model represented their decision-making processes. However, they identified some issues that could be addressed in the next iteration. Dynamic Bayesian decision networks offer a promising approach to modelling adaptive decisions in uncertain conditions. This model can be used to simulate the cognitive mechanism for a grower agent in a simulation of a sugarcane supply chain.}, year = {2019}, journal = {Forum for Artificial Intelligence Research (FAIR2019)}, pages = {145-160}, month = {4/12-6/12}, publisher = {CEUR}, address = {Cape Town}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_53.pdf}, }
The Cold-Start problem refers to the initial sparsity of data available to Recommender Systems that leads to poor recommendations to users. This research compares a Deep Learning Approach, a Deep Learning Approach that makes use of social information, and Matrix Factorization. The social information was used to form communities of users. The intuition behind this approach is that users within a given community are likely to have similar interests. A community detection algorithm was used to group users. Thereafter a deep learning model was trained on each community. The comparative models were evaluated on the Yelp Round 9 Academic Dataset. The dataset was pruned to consist only of users with at least 1 social link. The evaluation metrics used were Mean Squared Error (MSE) and Mean Absolute Error (MAE). The evaluation was carried out using 5-fold cross-validation. The results showed that the use of social information improved on the results achieved from the Deep Learning Approach, and grouping users into communities was advantageous. However, the Deep Learning Approach that made use of social information did not outperform SVD++, a state-of-the-art approach for recommender systems. Nevertheless, the new approach shows promise for improving Deep Learning models.
@inproceedings{243, author = {Muhammad Ikram and Anban Pillay and Edgar Jembere}, title = {Using social networks to enhance a deep learning approach to solve the cold-start problem in recommender systems}, abstract = {The Cold-Start problem refers to the initial sparsity of data available to Recommender Systems that leads to poor recommendations to users. This research compares a Deep Learning Approach, a Deep Learning Approach that makes use of social information, and Matrix Factorization. The social information was used to form communities of users. The intuition behind this approach is that users within a given community are likely to have similar interests. A community detection algorithm was used to group users. Thereafter a deep learning model was trained on each community. The comparative models were evaluated on the Yelp Round 9 Academic Dataset. The dataset was pruned to consist only of users with at least 1 social link. The evaluation metrics used were Mean Squared Error (MSE) and Mean Absolute Error (MAE). The evaluation was carried out using 5-fold cross-validation. The results showed that the use of social information improved on the results achieved from the Deep Learning Approach, and grouping users into communities was advantageous. However, the Deep Learning Approach that made use of social information did not outperform SVD++, a state-of-the-art approach for recommender systems. Nevertheless, the new approach shows promise for improving Deep Learning models.}, year = {2019}, journal = {Forum for Artificial Intelligence Research (FAIR2019)}, pages = {173-184}, month = {4/12-6/12}, publisher = {CEUR}, address = {Cape Town}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_51.pdf}, }
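A sketch of the community-partitioning step, assuming networkx for the community detection; make_model is a hypothetical factory for whatever rating model (e.g. a deep network) is trained per community, and the exact detection algorithm used in the paper is not specified here.

```python
# Partition the social graph into communities and train one model per community.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def train_per_community(social_edges, ratings, make_model):
    graph = nx.Graph(social_edges)                  # user-user friendship graph
    models = {}
    for i, community in enumerate(greedy_modularity_communities(graph)):
        users = set(community)
        subset = [r for r in ratings if r["user"] in users]
        models[i] = make_model().fit(subset)        # one recommender per community
    return models
```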
Training agents in hard exploration, sparse reward environments is a difficult task since the reward feedback is insufficient for meaningful learning. In this work, we propose a new technique, called Directed Curiosity, that is a hybrid of Curiosity-Driven Exploration and distance-based reward shaping. The technique is evaluated in a custom navigation task where an agent tries to learn the shortest path to a distant target, in environments of varying difficulty. The technique is compared to agents trained with only a shaped reward signal, a curiosity signal, as well as a sparse reward signal. It is shown that directed curiosity is the most successful in hard exploration environments, with the benefits of the approach being highlighted in environments with numerous obstacles and decision points. The limitations of the shaped reward function are also discussed.
@inproceedings{242, author = {Asad Jeewa and Anban Pillay and Edgar Jembere}, title = {Directed curiosity-driven exploration in hard exploration, sparse reward environments}, abstract = {Training agents in hard exploration, sparse reward environments is a difficult task since the reward feedback is insufficient for meaningful learning. In this work, we propose a new technique, called Directed Curiosity, that is a hybrid of Curiosity-Driven Exploration and distance-based reward shaping. The technique is evaluated in a custom navigation task where an agent tries to learn the shortest path to a distant target, in environments of varying difficulty. The technique is compared to agents trained with only a shaped reward signal, a curiosity signal, as well as a sparse reward signal. It is shown that directed curiosity is the most successful in hard exploration environments, with the benefits of the approach being highlighted in environments with numerous obstacles and decision points. The limitations of the shaped reward function are also discussed.}, year = {2019}, journal = {Forum for Artificial Intelligence Research (FAIR)}, pages = {12-24}, month = {4/12-6/12}, publisher = {CEUR}, address = {Cape Town}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_42.pdf}, }
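A minimal sketch of how such a hybrid reward signal could be composed; the weighting scheme and the forward-model prediction-error bonus are illustrative assumptions, not the paper's exact formulation.

```python
# Combine distance-based shaping with a curiosity bonus from forward-model error.
import numpy as np

def directed_curiosity_reward(env_reward, prev_dist, curr_dist,
                              pred_next_state, next_state,
                              shaping_w=0.5, curiosity_w=0.5):
    shaping = prev_dist - curr_dist  # positive when the agent moves towards the target
    curiosity = np.sum((pred_next_state - next_state) ** 2)  # forward-model error
    return env_reward + shaping_w * shaping + curiosity_w * curiosity
```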
In this paper we present an approach to defeasible reasoning for the description logic ALC. The results discussed here are based on work done by Kraus, Lehmann and Magidor (KLM) on defeasible conditionals in the propositional case. We consider versions of a preferential semantics for two forms of defeasible subsumption, and link these semantic constructions formally to KLM-style syntactic properties via representation results. In addition to showing that the semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. With the semantics of the defeasible version of ALC in place, we turn to the investigation of an appropriate form of defeasible entailment for this enriched version of ALC. This investigation includes an algorithm for the computation of a form of defeasible entailment known as rational closure in the propositional case. Importantly, the algorithm relies completely on classical entailment checks and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of the underlying classical ALC. Before concluding, we take a brief tour of some existing work on defeasible extensions of ALC that go beyond defeasible subsumption.
@inbook{240, author = {Katarina Britz and Giovanni Casini and Tommie Meyer and Ivan Varzinczak}, title = {A KLM Perspective on Defeasible Reasoning for Description Logics}, abstract = {In this paper we present an approach to defeasible reasoning for the description logic ALC. The results discussed here are based on work done by Kraus, Lehmann and Magidor (KLM) on defeasible conditionals in the propositional case. We consider versions of a preferential semantics for two forms of defeasible subsumption, and link these semantic constructions formally to KLM-style syntactic properties via representation results. In addition to showing that the semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. With the semantics of the defeasible version of ALC in place, we turn to the investigation of an appropriate form of defeasible entailment for this enriched version of ALC. This investigation includes an algorithm for the computation of a form of defeasible entailment known as rational closure in the propositional case. Importantly, the algorithm relies completely on classical entailment checks and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of the underlying classical ALC. Before concluding, we take a brief tour of some existing work on defeasible extensions of ALC that go beyond defeasible subsumption.}, year = {2019}, journal = {Description Logic, Theory Combination, and All That}, pages = {147–173}, publisher = {Springer}, address = {Switzerland}, isbn = {978-3-030-22101-0}, url = {https://link.springer.com/book/10.1007%2F978-3-030-22102-7}, doi = {10.1007/978-3-030-22102-7_7}, }
We present a systematic approach for extending the KLM framework for defeasible entailment. We first present a class of basic defeasible entailment relations, characterise it in three distinct ways and provide a high-level algorithm for computing it. This framework is then refined, with the refined version being characterised in a similar manner. We show that the two well-known forms of defeasible entailment, rational closure and lexicographic closure, fall within our refined framework, that rational closure is the most conservative of the defeasible entailment relations within the framework (with respect to subset inclusion), but that there are forms of defeasible entailment within our framework that are more “adventurous” than lexicographic closure.
@inproceedings{238, author = {Giovanni Casini and Tommie Meyer and Ivan Varzinczak}, title = {Taking Defeasible Entailment Beyond Rational Closure}, abstract = {We present a systematic approach for extending the KLM framework for defeasible entailment. We first present a class of basic defeasible entailment relations, characterise it in three distinct ways and provide a high-level algorithm for computing it. This framework is then refined, with the refined version being characterised in a similar manner. We show that the two well-known forms of defeasible entailment, rational closure and lexicographic closure, fall within our refined framework, that rational closure is the most conservative of the defeasible entailment relations within the framework (with respect to subset inclusion), but that there are forms of defeasible entailment within our framework that are more “adventurous” than lexicographic closure.}, year = {2019}, journal = {European Conference on Logics in Artificial Intelligence}, pages = {182 - 197}, month = {07/05 - 11/05}, publisher = {Springer}, address = {Switzerland}, isbn = {978-3-030-19569-4}, url = {https://link.springer.com/chapter/10.1007%2F978-3-030-19570-0_12}, doi = {10.1007/978-3-030-19570-0_12}, }
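For orientation, a high-level sketch of the standard rational-closure ranking construction that this line of work builds on; it relies only on classical entailment checks. Here entails, Implies and Not are stubs for any classical propositional reasoner, and classical (non-defeasible) statements are omitted for brevity.

```python
# Rank defeasible rules (a ~> c) by exceptionality, rational-closure style.
def rank_defeasible_rules(defeasible, entails, Implies, Not):
    ranks, current = [], list(defeasible)
    while current:
        materialised = [Implies(a, c) for a, c in current]
        # A rule (a ~> c) is exceptional if the materialised KB entails Not(a).
        exceptional = [(a, c) for a, c in current if entails(materialised, Not(a))]
        if len(exceptional) == len(current):
            break                          # remaining rules all share a rank
        ranks.append([r for r in current if r not in exceptional])
        current = exceptional
    return ranks, current                  # `current`: infinite-rank (classical) rules
```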
Description logics (DLs) are well-known knowledge representation formalisms focused on the representation of terminological knowledge. A probabilistic extension of a light-weight DL was recently proposed for dealing with certain knowledge occurring in uncertain contexts. In this paper, we continue that line of research by introducing the Bayesian extension BALC of the DL ALC. We present a tableau based procedure for deciding consistency, and adapt it to solve other probabilistic, contextual, and general inferences in this logic. We also show that all these problems remain ExpTime-complete, the same as reasoning in the underlying classical ALC.
@inproceedings{237, author = {Leonard Botha and Tommie Meyer and Rafael Peñaloza}, title = {A Bayesian Extension of the Description Logic ALC}, abstract = {Description logics (DLs) are well-known knowledge representation formalisms focused on the representation of terminological knowledge. A probabilistic extension of a light-weight DL was recently proposed for dealing with certain knowledge occurring in uncertain contexts. In this paper, we continue that line of research by introducing the Bayesian extension BALC of the DL ALC. We present a tableau based procedure for deciding consistency, and adapt it to solve other probabilistic, contextual, and general inferences in this logic. We also show that all these problems remain ExpTime-complete, the same as reasoning in the underlying classical ALC.}, year = {2019}, journal = {European Conference on Logics in Artificial Intelligence}, pages = {339 - 354}, month = {07/05 - 11/05}, publisher = {Springer}, address = {Switzerland}, isbn = {978-3-030-19569-4}, url = {https://link.springer.com/chapter/10.1007%2F978-3-030-19570-0_22}, doi = {10.1007/978-3-030-19570-0_22}, }
The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.
@inproceedings{284, author = {Marthinus Theunissen and Marelie Davel and Etienne Barnard}, title = {Insights regarding overfitting on noise in deep learning}, abstract = {The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {49-63}, address = {Cape Town, South Africa}, }
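A minimal sketch of the kind of controlled-label-noise experiment described; the dataset, noise level and architecture are illustrative choices made here for brevity, assuming scikit-learn.

```python
# Zhang-et-al.-style memorisation probe: corrupt a fraction of training labels
# and observe an overparameterised MLP still fit the noisy training set.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
noisy = y.copy()
idx = rng.choice(len(y), size=len(y) // 5, replace=False)  # corrupt 20% of labels
noisy[idx] = rng.integers(0, 10, size=len(idx))

mlp = MLPClassifier(hidden_layer_sizes=(512, 512), max_iter=2000).fit(X, noisy)
print("train acc on noisy labels:", mlp.score(X, noisy))   # memorisation
print("acc against clean labels:", mlp.score(X, y))        # what still generalises
```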
The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.
@inproceedings{279, author = {Arnold Pretorius and Etienne Barnard and Marelie Davel}, title = {ReLU and sigmoidal activation functions}, abstract = {The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {37-48}, month = {04/12-07/12}, publisher = {CEUR Workshop Proceedings}, address = {Cape Town, South Africa}, }
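One of the factors mentioned, network sparsity, can be probed in a few lines; a sketch assuming PyTorch, with an untrained toy network standing in for the trained models studied in the paper.

```python
# Measure per-layer ReLU sparsity (fraction of zero activations) on an MLP.
import torch
import torch.nn as nn

def relu_sparsity(model, inputs):
    stats, h = [], inputs
    for layer in model:
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            stats.append((h == 0).float().mean().item())  # dead fraction per layer
    return stats

net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
print(relu_sparsity(net, torch.randn(128, 20)))
```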
Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Numerous efforts have been undertaken to utilise in-situ measurements of the solar wind plasma to predict perturbations to the geomagnetic field measured on the ground. Typically, solar wind measurements are used as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit this problem, with two important twists: (i) An adapted feedforward neural network topology is designed to enable the pairwise analysis of input parameter weights. This enables the ranking of input parameters in terms of importance to output accuracy, without the need to train numerous models. (ii) Geomagnetic storm phase information is incorporated as model inputs and shown to increase performance. This is motivated by the fact that different physical phenomena are at play during different phases of a geomagnetic storm.
@inproceedings{283, author = {Stefan Lotz and Jacques Beukes and Marelie Davel}, title = {Input parameter ranking for neural networks in a space weather regression problem}, abstract = {Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Numerous efforts have been undertaken to utilise in-situ measurements of the solar wind plasma to predict perturbations to the geomagnetic field measured on the ground. Typically, solar wind measurements are used as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit this problem, with two important twists: (i) An adapted feedforward neural network topology is designed to enable the pairwise analysis of input parameter weights. This enables the ranking of input parameters in terms of importance to output accuracy, without the need to train numerous models. (ii) Geomagnetic storm phase information is incorporated as model inputs and shown to increase performance. This is motivated by the fact that different physical phenomena are at play during different phases of a geomagnetic storm.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {133-144}, publisher = {CEUR workshop proceedings}, address = {Cape Town, South Africa}, }
Sequences are typically modelled with recurrent architectures, but growing research is finding convolutional architectures to also work well for sequence modelling [1]. We explore the performance of Temporal Convolutional Networks (TCNs) when applied to an important sequence modelling task: solar flare prediction. We take this approach, as our future goal is to apply techniques developed for probing and interpreting general convolutional neural networks (CNNs) to solar flare prediction.
@inproceedings{282, author = {Dewald Krynauw and Marelie Davel and Stefan Lotz}, title = {Solar flare prediction with temporal convolutional networks (Work in progress)}, abstract = {Sequences are typically modelled with recurrent architectures, but growing research is finding convolutional architectures to also work well for sequence modelling [1]. We explore the performance of Temporal Convolutional Networks (TCNs) when applied to an important sequence modelling task: solar flare prediction. We take this approach, as our future goal is to apply techniques developed for probing and interpreting general convolutional neural networks (CNNs) to solar flare prediction.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {Work in progress}, publisher = {CEUR workshop proceedings}, issn = {1613-0073}, }
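A minimal sketch of the causal dilated convolution that is the building block of a TCN, assuming PyTorch; channel counts and input shapes are illustrative, not the paper's configuration.

```python
# Causal dilated 1-D convolution: left-padding prevents leakage from the future.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvBlock(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation        # pad on the left only
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                              # x: (batch, channels, time)
        return F.relu(self.conv(F.pad(x, (self.pad, 0))))

x = torch.randn(4, 8, 100)                             # e.g. 8 solar-wind features
block = CausalConvBlock(8, dilation=2)
print(block(x).shape)                                  # time dimension preserved
```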
No framework exists that can explain and predict the generalisation ability of DNNs in general circumstances. In fact, this question has not been addressed for some of the least complicated of neural network architectures: fully-connected feedforward networks with ReLU activations and a limited number of hidden layers. Building on recent work [2] that demonstrates the ability of individual nodes in a hidden layer to draw class-specific activation distributions apart, we show how a simplified network architecture can be analysed in terms of these activation distributions, and more specifically, the sample distances or activation gaps each node produces. We provide a theoretical perspective on the utility of viewing nodes as activation gap generators, and define the gap conditions that are guaranteed to result in perfect classification of a set of samples. We support these conclusions with empirical results.
@inproceedings{230, author = {Marelie Davel}, title = {Activation gap generators in neural networks}, abstract = {No framework exists that can explain and predict the generalisation ability of DNNs in general circumstances. In fact, this question has not been addressed for some of the least complicated of neural network architectures: fully-connected feedforward networks with ReLU activations and a limited number of hidden layers. Building on recent work [2] that demonstrates the ability of individual nodes in a hidden layer to draw class-specific activation distributions apart, we show how a simplified network architecture can be analysed in terms of these activation distributions, and more specifically, the sample distances or activation gaps each node produces. We provide a theoretical perspective on the utility of viewing nodes as activation gap generators, and define the gap conditions that are guaranteed to result in perfect classification of a set of samples. We support these conclusions with empirical results.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {64-76}, month = {04/12-06/12/2019}, publisher = {CEUR workshop proceedings}, address = {Cape Town, South Africa}, }
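A sketch of the per-node view the paper takes: compare one node's class-conditional activation distributions and score how far apart they are drawn. The scoring formula below is an illustrative separation measure, not the paper's definition of an activation gap.

```python
# Score the separation between a node's activations for two classes.
import numpy as np

def activation_gap(node_activations, labels, class_a, class_b):
    a = node_activations[labels == class_a]
    b = node_activations[labels == class_b]
    # Gap between class-conditional means, scaled by pooled spread.
    return abs(a.mean() - b.mean()) / (a.std() + b.std() + 1e-9)

acts = np.random.randn(1000)             # one node's activations (placeholder)
labels = np.random.randint(0, 2, size=1000)
print(activation_gap(acts, labels, 0, 1))
```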
ConceptCloud is a flexible interactive tool for exploring, visualising, and analysing semi-structured data sets. It uses a combination of an intuitive tag cloud visualisation with an underlying concept lattice to provide a formal structure for navigation through a data set. ConceptCloud 2.0 extends the tool with an integrated map view to exploit the geolocation aspect of data. The tool’s implementation of exploratory search does not require prior knowledge of the structure of the data or compromise on scalability, and provides seamless navigation through the tag cloud and the map viewer.
@misc{227, author = {Tiaan Du Toit and Joshua Berndt and Katarina Britz and Bernd Fischer}, title = {ConceptCloud 2.0: Visualisation and exploration of geolocation-rich semi-structured data sets}, abstract = {ConceptCloud is a flexible interactive tool for exploring, visualising, and analysing semi-structured data sets. It uses a combination of an intuitive tag cloud visualisation with an underlying concept lattice to provide a formal structure for navigation through a data set. ConceptCloud 2.0 extends the tool with an integrated map view to exploit the geolocation aspect of data. The tool’s implementation of exploratory search does not require prior knowledge of the structure of the data or compromise on scalability, and provides seamless navigation through the tag cloud and the map viewer.}, year = {2019}, journal = {ICFCA 2019 Conference and Workshops}, month = {06/2019}, publisher = {CEUR-WS}, issn = {1613-0073}, url = {http://ceur-ws.org/Vol-2378/}, }
In this paper we introduce and investigate a very basic semantics for conditionals that can be used to define a broad class of conditional reasoning systems. We show that it encompasses the most popular kinds of conditional reasoning developed in logic-based KR. It turns out that the semantics we propose is appropriate for a structural analysis of those conditionals that do not satisfy the property of Right Weakening. We show that it can be used for the further development of an analysis of the notion of relevance in conditional reasoning.
@inproceedings{226, author = {Giovanni Casini and Tommie Meyer and Ivan Varzinczak}, title = {Simple Conditionals with Constrained Right Weakening}, abstract = {In this paper we introduce and investigate a very basic semantics for conditionals that can be used to define a broad class of conditional reasoning systems. We show that it encompasses the most popular kinds of conditional reasoning developed in logic-based KR. It turns out that the semantics we propose is appropriate for a structural analysis of those conditionals that do not satisfy the property of Right Weakening. We show that it can be used for the further development of an analysis of the notion of relevance in conditional reasoning.}, year = {2019}, journal = {International Joint Conference on Artificial Intelligence}, pages = {1632-1638}, month = {10/08-16/08}, publisher = {International Joint Conferences on Artificial Intelligence}, isbn = {978-0-9992411-4-1}, url = {https://www.ijcai.org/Proceedings/2019/0226.pdf}, doi = {10.24963/ijcai.2019/226}, }
Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach to extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure.
@inproceedings{225, author = {Matthew Morris and Tala Ross and Tommie Meyer}, title = {Defeasible disjunctive datalog}, abstract = {Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach to extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure.}, year = {2019}, journal = {Forum for Artificial Intelligence Research}, pages = {208-219}, month = {03/12-06/12}, publisher = {CEUR}, issn = {1613-0073}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_38.pdf}, }
Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot represent or reason about contradictory rules, though. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in datalog and show that it satisfies the KLM requirements for a rational consequence relation. Finally, we introduce an implementation of this approach in the form of a defeasible datalog reasoning tool and evaluate the performance of this tool.
@inproceedings{224, author = {Michael Harrison and Tommie Meyer}, title = {Rational preferential reasoning for datalog}, abstract = {Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot represent or reason about contradictory rules, though. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in datalog and show that it satisfies the KLM requirements for a rational consequence relation. Finally, we introduce an implementation of this approach in the form of a defeasible datalog reasoning tool and evaluate the performance of this tool.}, year = {2019}, journal = {Forum for Artificial Intelligence Research}, pages = {232-243}, month = {03/12-06/12}, publisher = {CEUR}, issn = {1613-0073}, url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_67.pdf}, }