Background and related work

3A-LLM — An Alternative Axiomatic Algebraic LLM

Research on semantic representation spans a wide range of disciplinary traditions, each offering a distinct perspective on how verbal expressions encode meaning.

Formal semantics, rooted in the work of Frege [Frege1892], Russell [Russell1905], Tarski [Tarski1944], Wittgenstein [Wittgenstein1953], van Benthem & ter Meulen [vanBenthem1997], Guarino [Guarino1998], and many others, conceptualizes meaning through logical structures designed to preserve truth-conditional interpretation. Although this tradition has provided a rigorous framework for modeling inference, entailment, and compositionality, it faces challenges when accounting for conceptual nuances, cross-lingual variations, and the generative creativity of natural languages [ChierchiaMcConnellGinet2000].

Besides formal semantics, lexicographic research has emphasized the value of controlled defining vocabularies such as the Longman Defining Vocabulary (LDV), cf. [Fox2014]. These vocabularies are designed to maintain definitional consistency by using a restricted set of terms to define new lexical items [OgdenRichards1923], [GoddardWierzbicka1994]. The Longman tradition, and more broadly the lexicographic minimalism found in learner dictionaries, has demonstrated that a small, well-chosen set of defining primitives is sufficient to express a vast semantic space [Bloomfield1933].

In psycholinguistics, the view of semantics changed during the 1990s. For example, in his well-known book Speaking, Levelt [Levelt1989], following Kempen & Huijbers [Kempen1983], proposed that a lexical item consists of two parts: a lemma covering semantics and syntax, and a word form covering morphology and phonology. Ten years later [Levelt1999], the lemma represents only syntax, while the semantic aspect is covered by the lexical entry’s links into the conceptual space. This also means that a language’s lexical items are linked to so-called “lexical concepts”, which are in turn linked to other concept nodes [Roelofs2018]. We follow this idea insofar as we assume that meaning is mainly language independent, and we also adopt the assumption that concepts are linked to lexical items via lexical concepts.
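The linking assumption adopted here can be pictured as a small data structure. The following Python sketch is purely illustrative: the class name `LexicalItem`, the tables `CONCEPT_LINKS` and `LEXICON`, the helper functions, and the example entries are all hypothetical and are taken neither from the cited psycholinguistic models nor from 3A-LLM itself. It only shows how lexical items of different languages might share one language-independent lexical concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    language: str         # language the item belongs to
    word_form: str        # stands in for morphology and phonology
    lemma: str            # syntactic information only (the later view)
    lexical_concept: str  # link into the conceptual space


# Hypothetical concept nodes and their links (language independent).
CONCEPT_LINKS = {
    "DOG": {"ANIMAL"},
    "ANIMAL": set(),
}

# Hypothetical lexicon entries, invented for illustration.
LEXICON = [
    LexicalItem("en", "dog", "noun", "DOG"),
    LexicalItem("de", "Hund", "noun, masculine", "DOG"),
    LexicalItem("en", "animal", "noun", "ANIMAL"),
]


def items_for_concept(concept):
    """Return sorted (language, word form) pairs linked to `concept`."""
    return sorted(
        (item.language, item.word_form)
        for item in LEXICON
        if item.lexical_concept == concept
    )


def linked_concepts(item):
    """Follow an item's lexical concept to the concept nodes it links to."""
    return CONCEPT_LINKS.get(item.lexical_concept, set())
```

In this sketch, `items_for_concept("DOG")` returns the English and the German entry, because both point to the same lexical concept, while the syntactic details stay in each item's `lemma` field.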

Many more disciplines provide insights into the relation between verbal expressions and their meaning, and several of them have influenced the development of 3A-LLM. Cognitive semantics approaches meaning through the lens of human conceptualization, grounding linguistic expressions in embodied experience, prototype structure, and culturally and individually situated conceptual categories [Lakoff1987], [Rosch1975]. Cognitive semantics makes clear that semantics differs from text statistics, since embodied experience lies beyond such statistics. However, it proposes an individualistic view of semantics. As 3A-LLM is meant to support automatic language processing applications, it needs not only to abstract from shades of individual meaning but also to be formal and computational. It thus relies on insights from ontology engineering [Arp2015], [Baader2010], [Sowa2000] and from the development of lexical-semantic resources such as WordNet [Fellbaum1998], FrameNet [Fillmore2003], VerbNet [Kipper2008], and EuroWordNet [Vossen2004]. The Collaborative Interlingual Index (CILI) [Bond2016] aligns synsets across WordNets of different languages and thus supports cross-lingual conceptual comparison; 3A-LLM pursues a complementary strategy by grounding concepts in a single, language-independent primitive set (the LDV) and linking lexical items of arbitrary languages to that set. How this is done is presented in the following.

Semantic search and retrieval benefit from explicit conceptual structure. Research on semantic representation spans formal semantics [ChierchiaMcConnellGinet2000], lexicographic traditions such as the Longman Defining Vocabulary (LDV) [Fox2014], [OgdenRichards1923], and lexical-semantic resources such as WordNet [Fellbaum1998], FrameNet [Fillmore2003], VerbNet [Kipper2008], and EuroWordNet [Vossen2004]. Formal semantics conceptualises meaning through logical structures and truth conditions [Frege1892]; controlled defining vocabularies maintain definitional consistency [OgdenRichards1923], [GoddardWierzbicka1994], [Bloomfield1933]. In psycholinguistics, lexical items are linked to a conceptual space via “lexical concepts” [Levelt1989], [Roelofs2018]; cognitive semantics grounds expressions in conceptualisation and prototype structure [Lakoff1987], [Rosch1975]. The Collaborative Interlingual Index (CILI) [Bond2016] aligns synsets across WordNets; 3A-LLM pursues a complementary strategy by grounding concepts in a single, language-independent primitive set (the LDV) and linking lexical items of any language to that set.

WordNet and similar resources provide structured relations among lexical items [Fellbaum1998], [Miller1995], [Vossen2004] but lack a uniform generative calculus for constructing new concepts. Ontology engineering and description logics offer constraint checking and subsumption [Baader2010], [Guarino1998] but usually presuppose a fixed conceptual inventory. 3A-LLM [Bense2024] grounds concepts in the LDV, applies vertical operators (e.g. noun, verb, adjective, hypernym, instrument) and horizontal operators (e.g. opposite, orthogonal), and composes concepts as unary functions a(b). The result is a directed, typed graph with definitional, vertical, and horizontal edges, supporting semantic radius, proximity, and full traceability. Unlike neural language models whose semantics are implicit in embeddings [Devlin2019], 3A-LLM provides explicit construction histories and deterministic expansion. For search, 3A-LLM can expand a query concept to related concepts (same FoC, compounds, instruments) and predict salient implicit concepts from short phrases, providing a semantic backbone for hybrid KG–LLM pipelines [Steels2008].
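The following Python sketch illustrates, under stated assumptions, how composition as unary functions a(b) could yield a directed, typed graph with traceable construction histories. It is not the 3A-LLM implementation: the class `ConceptGraph`, its methods, the function `history`, and the reading of "semantic radius" as a number of edges are all hypothetical choices made for illustration, and the primitive `cut` is a placeholder rather than a verified LDV entry. Definitional edges are omitted to keep the sketch short.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical edge kinds, named after two of the edge types in the text.
VERTICAL, HORIZONTAL = "vertical", "horizontal"


@dataclass
class ConceptGraph:
    """Directed, typed graph: edges[source] = [(kind, operator, target)]."""
    edges: dict = field(default_factory=dict)

    def add_primitive(self, name):
        # A primitive stands in for an entry of the defining vocabulary.
        self.edges.setdefault(name, [])
        return name

    def apply(self, operator, operand, kind):
        # Compose a concept as a unary function a(b), e.g. instrument(cut).
        concept = f"{operator}({operand})"
        self.edges.setdefault(concept, [])
        edge = (kind, operator, concept)
        if edge not in self.edges.setdefault(operand, []):
            self.edges[operand].append(edge)
        return concept

    def expand(self, start, radius):
        # Breadth-first search: concepts reachable within `radius` edges,
        # mapped to their distance from `start` (an assumed notion of radius).
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if seen[node] == radius:
                continue
            for _kind, _operator, target in self.edges.get(node, []):
                if target not in seen:
                    seen[target] = seen[node] + 1
                    queue.append(target)
        return seen


def history(concept):
    # Unwrap nested a(b) terms into (operators, outermost first; primitive).
    steps = []
    while concept.endswith(")") and "(" in concept:
        operator, concept = concept[:-1].split("(", 1)
        steps.append(operator)
    return steps, concept


graph = ConceptGraph()
cut = graph.add_primitive("cut")
tool = graph.apply("instrument", cut, VERTICAL)
reverse = graph.apply("opposite", cut, HORIZONTAL)
tool_noun = graph.apply("noun", tool, VERTICAL)
```

Because every concept name records the operators that built it, `history(tool_noun)` recovers the operator chain and the underlying primitive without any lookup, and `graph.expand(cut, 1)` returns only the concepts one edge away from `cut`. The expansion is deterministic because edges are stored and visited in insertion order.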

Extension: deriver.app

This chapter consolidates material from the allm LaTeX sources (main40.tex, main50.tex, main97.tex). In Deriver documentation, triples, rules, and the Workbench align with the explicit conceptual structure described here.

Source text: parallel project allm/ (LaTeX); HTML generated via taoke/tools/build-3allm-from-tex.php.