Building ontologies with BFO

The discussion in this section is based on the book by R. Arp, B. Smith, and A. Spear [ArSm2015].

Principles of Terminology

In this section, the authors discuss 25 different principles, which we contrast with our own principles. In the following, the numbered boldface sentences and the text passages in double quotes are direct citations. The object property is_a corresponds to our object properties ◊subClassOf or ◊is.

1. Include in the terminology terms used by influential groups of scientists for the most important types of entities in the domain to be represented (Page 60). One can draw on standards such as ISO / IEC 80000, on directories of scientific disciplines such as ICC, on top and upper ontologies, and on domain ontologies such as Enargus provided by other scientists.

2. Strive to ensure maximal consensus with the terminological usage of scientists in the relevant discipline. This may well improve working with domain experts, for instance in negotiating terminological compromises (Page 60). In practice, this often turns out to be a very time-consuming process in which individual terms are wrestled with for hours. This is particularly the case when a knowledge domain is newly developed and no comparable projects have yet been carried out. However, in this treatise we show that there are a number of recommendations and restrictions that make it easier to define the terminology by excluding many alternatives.

3. Identify areas of disciplinary overlap where terminological usage is not consistent. Look for and keep track of synonyms for terms already in the terminology list from the areas. (Page 60) For similar concepts, the authors suggest defining mapping rules. This could be done, for example, via the object properties ◊sameAs or ◊EQ. Inconsistent terminology could be uncovered, for example, by deriving contradictions from inferences. Likewise, the domain experts would have to check the textual definitions for their semantic content. According to the state of the art, this cannot be fully automated.

4. Don't reinvent the wheel (Page 61). "In term selection, stay as close as possible to the usage of actual domain experts. In terminology construction and ontology design, make use of as many existing resources (terminologies and ontologies) as possible". In the Controlled Vocabulary of Concepts section, we show how one can define complex concepts starting from a Basic Vocabulary.

5. Use singular nouns (Page 61). If one were to deviate from this recommendation, it would be very easy to make statements that are semantically incorrect, such as (communism, is_a, political systems).

6. Use lowercase italic format for common nouns (Page 62). This is a very syntactic aspect of naming. Different development environments for ontologies give different recommendations. We would therefore not want to make the recommendation so restrictive. In addition, the recommendation speaks only of nouns and does not say whether they are intended to denote classes, attributes, or general concepts. In our opinion, the examples of proper names such as Tom, Seattle and Jupiter are not an argument for supporting the recommendation. We go into more detail on this in the Naming Entities section.

7. Avoid Acronyms (Page 62). Examples such as DNA and AIDS are certainly understandable as clearly usable acronyms and thus as exceptions to the recommendation. However, we use acronyms in connection with naming knowledge subjects to incorporate additional knowledge about the knowledge subject.

8. Associate each term in the ontology with a unique alphanumeric identifier (Page 63). In section Concept Numbering System we show how each term can be automatically assigned an identifier which, in the case of compound terms, even contains the identifiers of all sub-terms. In addition, assignments between the designations of the term in different languages can be made.

9. Ensure univocity of terms (Page 63). ... "The reason for insisting upon univocity in the context of ontology design is quite straight-forward. If the same term is used in different ways in different contexts, then the humans involved in ontology building are more likely to make errors". We show how the ambiguity of terms in the ontology can be avoided with different methods. This includes the use of special characters (qualtors) as term prefixes and the use of suffixes for the unambiguous characterization of terms, in a similar way as is done in Wikipedia, for example. See also OBO Foundry Set Naming Guidelines.

10. Ensure univocity of relational expressions (Page 65). In this case, the authors address the multiple meanings of the is-a relation as a negative example. In the Fundamental Object Properties section, we define the sets of the most common object properties and their synonyms. There it is made clear that the subclass relationship ◊is is clearly distinguished from the instance relationship ◊iof or ◊isInstanceOf.

11. Avoid mass nouns (Page 65). Count nouns are those whose particulars can be counted, such as cat, person, atom, etc. In contrast, mass nouns are those such as blood, water, flesh, or chemical substance. The latter usually refer to an indefinite amount of material. It is perfectly fine to ask how much water there is in a container, but there is little point in asking how many waters there are. Basically, you shouldn't use mass nouns alone, but only in connection with prefixes like a lot of or a portion of.

12. Distinguish the general from the particular (Page 67). As an example, the authors mention the two sentences the teapot is a device for pouring tea and John's teapot has been stolen. In the first sentence, teapot of course means a class, and in the second, an individual instance. In our approach, the distinction between the two would simply be based on the naming. The ^Teapot class has the ^ prefix, while a unique identifier like >Johns_Teapot would have to be created for the individual instance. In this respect, this problem is solved when converting the natural-language sentences into assignments of the ontology.
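
The naming-based distinction can be illustrated with a minimal sketch. It assumes only the prefixes mentioned in this text (^ for classes, > for individual instances, ◊ for object properties); the function name kind_of and the lookup table are hypothetical and not part of the approach described here.

```python
# Prefix conventions as mentioned in the text; the table itself is an
# illustrative assumption, not a complete list of prefixes.
PREFIX_KINDS = {"^": "class", ">": "instance", "◊": "object property"}


def kind_of(name: str) -> str:
    """Return the kind of entity that a prefixed name denotes."""
    if not name:
        raise ValueError("empty name")
    try:
        return PREFIX_KINDS[name[0]]
    except KeyError:
        raise ValueError(f"unknown prefix in {name!r}") from None


# "John's teapot has been stolen" talks about an instance of the class ^Teapot.
triple = (">Johns_Teapot", "◊isInstanceOf", "^Teapot")
print([kind_of(part) for part in triple])
# ['instance', 'object property', 'class']
```

Because the kind is carried by the name itself, the ambiguity of the word teapot in the two sentences disappears as soon as the sentences are turned into triples.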

13. Provide all nonroot terms with definitions (Page 68). The way the authors understand the definition of terms is essentially the same as the textual definitions and the Word Sense Definitions that we have discussed. As an example of a textual definition they give: "X is a triangle = def. X is a closed figure; X has exactly three sides; each of X's sides is straight; X lies in a plane". In section Pythagoras Theorem we show how definitions in the case of the right triangle can be modeled in ontological detail, including the associated formulas. The authors leave open what they mean by a root term and why this type of definition should not be applied to root terms. We define root classes formally in section Root Classes.

14. Use Aristotelian definitions (Page 69). Primarily, the authors refer to the application of the Genus Differentiae Pattern. "The Aristotelian definitional structure represents a basic format for the formulation of definitions that can be used regardless of ontological domain, and that is inherently directed at representing the position of each defined term within the relevant is_a hierarchy". In section Concept Composition we show how such definitions can be modeled as Concept Binary Trees. These not only represent an is_a hierarchy, but also use the Genus Differentiae Pattern as a construction principle.

15. Use essential features in defining terms (Page 70). ... "The essential features of a thing are those features without which the thing would not be the type of thing that it is". As a method for identifying essential features, the authors propose removing features from a definition and then checking whether the thing can still be considered a typical thing of its kind. For example, the essential feature of a chair is that a person can sit on it, not its color or the type of material it is made of. In the sections Longman Defining Vocabulary and Natural Semantic Metalanguage we show examples of definitions that also contain non-essential features, or where several alternative and independent definitions can be used to capture the meaning of a term. It certainly makes sense to highlight the definitions that manage with essential features only. However, the other definitions can be just as important for modeling the entire intension and extension of a concept, so that questions about concepts can be answered that could otherwise only be answered with additional knowledge from the context.

16. Start with the most general terms in your domain (Page 71). One can certainly take that point of view. In our opinion, however, it is not a must. Basically, the authors in this section also argue that the creation of a class hierarchy is an iterative process. However, it could also turn out during modeling that the most general concepts are not needed at all and rather lead to a lack of clarity or unnecessary complexity of the models. It may therefore be better to start from the use cases that need to be covered and to use only the concepts that are needed in those use cases.

17. Avoid circularity in defining terms (Page 72). ... "Since definitions are intended to explain the meaning of a term to someone who does not already understand it, using the term itself or some very similar expression in its own definitions defeats the purpose of providing a definition in the first place". The definition of document as "The Document class represents those things which are, broadly conceived, 'documents'" clearly exhibits circularity.
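
A check for circular definitions can be sketched as a cycle search over the "is defined in terms of" relation. The following is a minimal sketch under the assumption that each definition has already been reduced to the set of terms it uses; the function name find_cycle and the sample data are hypothetical.

```python
from __future__ import annotations


def find_cycle(definitions: dict[str, set[str]]) -> list[str] | None:
    """Return one chain of terms that leads back to its start, or None."""
    path: list[str] = []    # terms on the current depth-first search path
    done: set[str] = set()  # terms already shown to be free of cycles

    def visit(term: str) -> list[str] | None:
        if term in path:
            # The term is reached again via its own definition chain.
            return path[path.index(term):] + [term]
        if term in done or term not in definitions:
            return None  # already checked, or an undefined basic term
        path.append(term)
        for used in sorted(definitions[term]):
            cycle = visit(used)
            if cycle:
                return cycle
        path.pop()
        done.add(term)
        return None

    for term in sorted(definitions):
        cycle = visit(term)
        if cycle:
            return cycle
    return None


# The Document example from the text: the definition uses the term itself.
print(find_cycle({"document": {"document", "thing"}}))
# ['document', 'document']
print(find_cycle({"teapot": {"device"}, "device": {"thing"}}))
# None
```

The same search also reports indirect circularity, where a term is defined through a second term whose definition leads back to the first.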

18. To ensure the intelligibility of definitions, use simpler terms than the term you are defining (Page 73). ... "The terms used in a definition should be more intelligible ... than the term that is being defined". One of the best examples of this is the Longman Dictionary. All of the approximately 230,000 word definitions are derived from the approximately 2,000 words of the Longman Defining Vocabulary (LDV). In addition, in our approach we have reduced the approx. 2,000 words of the LDV to the approx. 450 words of the Natural Semantic Metalanguage. Of course, technical terms from the various domains, such as sulfuric acid or hydrogen sulfide, can also be modeled using the same construction principle. Thus a kind of Deep Semantic Search (DSS) can be enabled.

19. Do not create terms for universals through logical combination. (Page 74). The authors claim that "Ontology is not analogous to set theory". As an example, they cite that if u and v are universals, it cannot be inferred that u and v or u or v are also universals. This is of course correct, but we think that only a minority of modelers assume that this could be the case. Also, their statement "Avoid postulating complements of classes as entities in an ontology" may work for the example dog and nondog, but for other concepts complements do make sense, for example for Female and non-Female / ¬Female, as we show in section DL Related Work, and for Woman ≡ Person ⊓ Female in section Defining DL Axioms with Reification.

"The recommendation to avoid negative terms thus needs to be applied with care" (Page 75). As an example, the definition of a nonsmoker is given as "nonsmoker = def. a human being who does not smoke." The authors also argue that other putatively negative terms like odorless, colorless, invisible or unfriendly are similarly admissible, since they can be defined in a positive way in terms of lacks. We would, for example, model these as concept binary trees: unfriendly = (not, friendly) or colorless = (without, color), and WSD(water) = (liquid, (without, color)).

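The concept binary trees for the negative terms above can be written down directly as nested pairs. This is only a minimal sketch: representing a tree as a Python tuple, and the helper names leaves and render, are assumptions made for illustration and not the representation used in section Concept Composition.

```python
from typing import Union

# A concept is either a basic word or a pair (left, right) of concepts.
Concept = Union[str, tuple]

unfriendly: Concept = ("not", "friendly")
colorless: Concept = ("without", "color")
water: Concept = ("liquid", colorless)  # WSD(water) as given in the text


def leaves(concept: Concept) -> list:
    """List the basic words of a concept tree from left to right."""
    if isinstance(concept, str):
        return [concept]
    left, right = concept
    return leaves(left) + leaves(right)


def render(concept: Concept) -> str:
    """Write a concept tree in the (left, right) notation of the text."""
    if isinstance(concept, str):
        return concept
    left, right = concept
    return f"({render(left)}, {render(right)})"


print(render(water))   # (liquid, (without, color))
print(leaves(water))   # ['liquid', 'without', 'color']
```

The negative term colorless appears here as an ordinary subtree, so it is defined positively through the lack of color, in line with the authors' argument.
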
21. Structure every ontology around a backbone is_a hierarchy (Page 76). We think that such a backbone hierarchy is inherent in correct modeling. The data properties of the classes can be used as support. The individual instance of a class can have all data properties of the class and all of its superclasses. Part of the modeling method for class hierarchies is then to decide whether further intermediate or subclasses should be introduced. The correctness of the modeling can then usually be easily checked by tracing the form of a data property upwards via the inheritance hierarchy. The definition for the data property must then be found on the path in the class or one of its superclasses.

This also applies in a similar way to the modeling of part-whole relationships. Here, however, care must be taken to use semantically correct object properties such as ◊PartOf or ◊MemberOf. The part-whole hierarchy (partonomy) results from the hierarchy of these relationships.
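
The check of tracing a data property upwards along the is_a backbone can be sketched as follows. The class names, the data properties, and the function defining_class are hypothetical toy data chosen for illustration; the sketch only assumes that each class lists its direct superclasses and its own data properties.

```python
# Hypothetical toy model: direct superclasses and declared data properties.
SUPERCLASSES = {"Teapot": ["Device"], "Device": ["Thing"], "Thing": []}
DATA_PROPERTIES = {
    "Thing": {"name"},
    "Device": {"manufacturer"},
    "Teapot": {"volume"},
}


def defining_class(cls, prop):
    """Walk upwards from cls; return the first class declaring prop, or None."""
    queue = [cls]
    seen = set()
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        if prop in DATA_PROPERTIES.get(current, set()):
            return current
        queue.extend(SUPERCLASSES.get(current, []))
    return None


print(defining_class("Teapot", "manufacturer"))  # Device
print(defining_class("Teapot", "color"))         # None
```

A result of None indicates that the data property is not declared anywhere on the path to the root, which hints at a missing declaration or at a class that sits in the wrong place in the hierarchy.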

22. Ensure is_a completeness. (Page 77). When inserting a new class, check at which point in an is_a hierarchy it is to be inserted. In addition, the authors recommend storing the previously described Genus-Differentiae definitions for each new class. "On the other hand if is_a completeness is satisfied, then the creation of Aristotelian definitions is itself more straightforward". ... Then on page 78 the authors give a couple of examples for: "Bad practice in terminologies often involves the mixture of ontological categories across is_a relations". As a method for detecting such incorrect modeling, one can check whether, if A is a subclass of B, all instances of A are also included in the set of instances of B. See also the definition of subsumption in section Subsumption.
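
The instance-based check can be sketched with plain sets. The class names, the instance identifiers, and the function check_is_a below are hypothetical; the sketch assumes that the instances of every class are known explicitly, which will usually hold only for test data.

```python
# Hypothetical test data: known instances per class and asserted is_a pairs.
INSTANCES = {
    "Animal": {"felix", "rex"},
    "Cat": {"felix"},
    "Pet": {"felix", "robot_dog"},
}
ASSERTED_IS_A = [("Cat", "Animal"), ("Pet", "Animal")]


def check_is_a(instances, asserted):
    """Map each violated (sub, super) pair to the instances that break it."""
    violations = {}
    for sub, sup in asserted:
        missing = instances[sub] - instances[sup]
        if missing:
            violations[(sub, sup)] = missing
    return violations


print(check_is_a(INSTANCES, ASSERTED_IS_A))
# {('Pet', 'Animal'): {'robot_dog'}}
```

An empty result does not prove that the hierarchy is correct. It only shows that the available instances contain no counterexample.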

23. Ensure asserted single inheritance (Page 78). The authors claim: "... the ontology should be built as an asserted monohierarchy, which means: a hierarchy in which each term has at most one parent". The first argument is that one might gain "certain computational performance benefits". In our view, however, this is not something a modeler should worry about; criteria such as reusability or consistency should come first. Second, the authors argue: "Indeed, single inheritance is indispensable if the Aristotelian rule is to be applied successfully, since the rule works only if each (nonroot) term in the ontology has exactly one parent". However, no counterexample is shown, and we assume that this is not the case. What speaks against defining a pickup as a vehicle that is both a limousine / car and a truck? Equally, we consider the authors' example on page 80 (Figure 4.5) to be legitimate modeling: blue vehicles are vehicles that can also be instances of the Blue thing subclass. On the contrary, this type of modeling with Partitioning Classes even opens up some advantages in terms of runtime and memory efficiency. We are convinced that multiple inheritance is in the nature of things and should be adopted precisely in the modeling. Therefore we cannot understand the argument to leave this later to a reasoner: "Computer reasoners can then use the definitions to create a compound ontology, in which single inheritance no longer holds, to address specific application purposes."
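
The pickup example can be illustrated with asserted multiple inheritance. The sketch uses Python's own class system merely as a stand-in for an ontology class hierarchy, which is an assumption made for illustration; the class names follow the example in the text.

```python
class Vehicle:
    """Common superclass of the example."""


class Car(Vehicle):
    pass


class Truck(Vehicle):
    pass


class Pickup(Car, Truck):
    """Asserted with two parents instead of leaving this to a reasoner."""


# Both parents are asserted directly on the class.
print([base.__name__ for base in Pickup.__bases__])
# ['Car', 'Truck']

# The full chain of superclasses remains well defined.
print([cls.__name__ for cls in Pickup.__mro__])
# ['Pickup', 'Car', 'Truck', 'Vehicle', 'object']
```

Every instance of Pickup is then an instance of both Car and Truck, and of Vehicle through either path.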

24. Both developers and users of an ontology should respect the open-world assumption (OWA) (Page 81). The authors state: "The open world assumption implies that no logical consequences follow from the fact that a given term is not included in an ontology." We agree with this simplified view. However, discussions about the Open World Assumption (OWA) and the Closed World Assumption (CWA) often involve assumptions and hypotheses that go beyond this and that we do not consistently share. See the discussion in section DL Discussion.

25. Adhere to the rule of objectivity, which means: describe what exists in reality, not what is known about what exists in reality (Page 82). "Thus an ontology should not contain classes like known allergy, empirically confirmed boson, or unclassified influenza." This corresponds partially to other recommendations, such as the one in the OBO Foundry Set Naming Guidelines to avoid catch-all terms.

BFO Discussion

On page 88 the hierarchy of the BFO continuants is shown graphically. There the classes 1-D, 2-D and 3-D occur both as subclasses of Continuant fiat boundary and of Spatial Region. Here the authors either violate the asserted monohierarchy principle they set themselves, or the naming of the classes is incorrect; neither is apparent from the text.

On page 98 the view is taken that there are relations where it makes no sense to speak of instances. The object properties instance_of and part_of are given as examples. We have a different point of view on this, because the triple (Mary's_heart, part_of, Mary) can be interpreted as an instance of the object property definition (BodyPart, part_of, HumanBeing). In this respect, the instantiation of object properties behaves exactly like the instantiation of classes. This becomes particularly clear in the case of labeled property graphs.
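
Reading a triple as an instance of an object property definition can be sketched as a simple type check on subject and object. The identifier Marys_heart, the type table, and the function is_instance_of are assumptions made for illustration; for brevity the check compares types exactly and does not follow subclass relationships.

```python
from typing import NamedTuple


class PropertyDefinition(NamedTuple):
    """An object property definition (domain class, property name, range class)."""
    domain: str
    name: str
    range: str


# Hypothetical type assignments for the individuals in the example.
TYPES = {"Marys_heart": "BodyPart", "Mary": "HumanBeing"}
PART_OF = PropertyDefinition("BodyPart", "part_of", "HumanBeing")


def is_instance_of(triple, definition, types):
    """True if the triple matches the name, domain and range of the definition."""
    subject, predicate, obj = triple
    return (
        predicate == definition.name
        and types.get(subject) == definition.domain
        and types.get(obj) == definition.range
    )


print(is_instance_of(("Marys_heart", "part_of", "Mary"), PART_OF, TYPES))  # True
print(is_instance_of(("Mary", "part_of", "Marys_heart"), PART_OF, TYPES))  # False
```

The reversed triple is rejected because its subject and object do not match the domain and range of the definition, just as an individual of the wrong kind would not count as an instance of a class.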

Objects

On page 91, a BFO:Object is introduced as "spatially extended in three dimensions", "causally unified, meaning its parts are tied together by relations of connections in such a way that if one part of the object is moved in space then its other parts will likely be moved also" and "maximally self-connected (which means intuitively that the different parts of these objects are tied together in a certain way and anything that is tied to these parts in the same way is itself part of the object)". We are convinced that this definition is good enough to serve as a universal definition of object within ontologies. The counterpart to this is our definition of knowledge subjects, which is a precise technical definition of sets of tuples of the Knowledge Graph. The object is further characterized on page 92 with "An object is an entity that can exist and be what it is regardless of what other objects exist". This definition is largely consistent with the definition of Material Persistants in GFO.

Object Aggregates

The BFO continuants hierarchy contains Object and Object Aggregate as subclasses of Material entity. On page 93 the following definition is then given: "An object aggregate is a material entity that is made up of a collection of objects and whose parts are exactly exhausted by the objects that form the collection". First, the question arises as to why this construct was not simply modeled using the definition of object properties such as PartOf or MemberOf. Second, there would be a problem instantiating Object Aggregate: how should the elements of a set be instantiated without a suitable object property?

Occurrents

Occurrents are defined on page 121 as: "occurrent is, more precisely, either an entity that unfolds itself in time, or it is the instantaneous boundary of such an entity". On page 128 the graphic of the associated class hierarchy is shown. Zero-Dimensional Temporal Regions are points in time (Page 124) while One-Dimensional Temporal Regions are time intervals (Page 125). Examples for process boundaries are the beginnings and the endings of processes they bound (Page 123).

BFO History

The graphic on page 128 and the definition on page 122 represent History as a subclass of Process. For one thing, we find it less intuitive in a linguistic sense to say that a history is a process. For another, the definition on page 122 also contradicts this: "history = def. the sum of all processes taking place in the spatiotemporal region occupied by the material entity or site in question". So History would be both a subclass of Process and a set of processes at the same time. We think that in this case, too, it would have been more appropriate to define an object property SubProcessOf between Process and itself, analogous to the reasoning for Object and Object Aggregate.

Furthermore, it is incomprehensible why Process boundary was not modeled as a subclass of Zero-dimensional temporal region. Hence the formulation "Zero-dimensional temporal regions are the temporal regions that process boundaries are located in" is difficult to understand, because we think the second class is simply a subclass of the first.

We think that it would also be easier for modelers to understand if the terms PointInTime and TimeInterval had been used in the class hierarchy and if there were also an object property definition (PointInTime, PartOf, TimeInterval).

Extension: deriver.app

deriver.app mirrors the text above unabridged; BFO classes and OPs can be compared in the workbench with your triples, OQL rules, and the VM. For DL/OWA, compare the local mirror DL discussion (if available) with the canonical DL Discussion on taoke.de.

Source: taoke.de — Building Ontologies with BFO.

References

[ArSm2015] Robert Arp, Barry Smith, Andrew D. Spear, Building Ontologies with Basic Formal Ontology, The MIT Press, London, England, 2015, ISBN: 978-0-262-52781-1
[ScSe2012] S. Schulz, D. Seddig-Raufie, N. Grews, J. Röhl, D. Schober, M. Boeker, L. Jansen, Guideline on Developing Good Ontologies in the Biomedical Domain with Description Logics, Version 1.0, 2012, https://www.uni-rostock.de/storages/uni-rostock/Alle_PHF/IPH/media/GoodOD/GoodOD-Guideline_v1_2012.pdf, last visit: 09.04.2026