Categories
Math

Formalization of Equivalency of Axiomatizations

Can we formalize the notion of two axiomatic systems being “equivalent,” especially when they are over different languages?

This question came up in Non-Standard Axioms for Various Math Structures 3 and 4 as well as the Topology as an Algebraic Structure series; it seems to be a general and widely applicable concept.

It seems like this kind of thing would already have a robust standard definition. Nevertheless, I’m not sure what that is, so let us proceed.

I’ve just started studying category theory. My thought: is this encompassed exactly by functors between categories of models? Specifically, if we have two languages (assume we’re working in a “superset” logic like first-order) and a set of axioms for each, and we consider the category of models for each, then is equivalence of the axioms exactly the existence of a functor between the categories?

In fact, might this even show up before category theory, as a homomorphism between the two classes of models (which is sorta what a functor actually is)?

Let’s actually write this out. What would a functor between categories of models look like? Say a fully faithful functor? Well, it’d associate every model of one setting to a model of another in a one-to-one way — which we definitely need to have as part of re-axiomatizability. But we want this association to also preserve some structure, or do something that shows “equivalency” beyond just an arbitrary bijection. Is the functor condition equivalent to this? Well, the functor would preserve morphisms between the models: if there is a morphism between two models in one setting, then there is a corresponding morphism between the corresponding models in the other setting.

Does this imply that two models isomorphic in one setting are isomorphic in the other? The isomorphism must be an arrow from one model to another, where there is a corresponding arrow from the second model to the first … actually, this is not enough though, since we could theoretically have a morphism from A to B and a morphism from B to A without A and B being isomorphic. (Especially if there’s no indication that the morphism is an injection or something.) So how are isomorphisms defined in category theory? This is what I’d need to know to answer this question.

Let’s try to write down a definition of reaxiomatizability between two settings, where we define a setting as language plus set of axioms. It’s really about equivalent information — equivalent data. What does that mean, as a formalization? If I for example have the sets R (all real numbers) and (R,0) (the set of all (r,0) where r is a real number) (for example, this is how R can be embedded into C), then a bijection isn’t enough: r in R can be sent to (s, 0) where r isn’t equal to s. So how can we define “equivalency” in such a way that R gets mapped to (R, 0) in an equivalent way?

Is the distinction merely canonical? Is it based on additional structure we impose and then looking at respecting that structure, but where the choice of structure is largely ad-hoc (and thus the notion of “equivalency of axiomatizations” for different languages would actually be informal?)

The idea of equivalency of axiomatizations is tied to the idea of “equivalent data”: that we can have two different representations of the same object that ultimately “encode the same data.” (“Representation” here is used informally, not in the sense of representation theory.) What does “encoding the same data” mean? Well, really, that idea by itself is the same as encryption/decryption: a pair of functions that go between the encodings which are inverses of each other. And everything should be accounted for; these functions must be bijections.

But actually, apart from that, it doesn’t seem like there’s anything additional formally and in general to distinguish an encoding that provides equivalence. If we have any bijection, we can take that to be canonical and then use that to provide an “equivalent representation” of an object, in a way such that we can go back and forth.

So maybe this is really an informal notion in general, which was specific formal definitions for specific contexts (much like how I understand mathematical logic as a whole, with specific formalizations for certain logics like first-order logic.) Given two classes A and B and a bijection b from A to B, whenever we work with a member of A we can just replace it by its counterpart in B, and similarly we can go back.

But actually, this then raises a question in connection to the setup attempted in the Topology as an Algebraic Structure series: if we “do work” with the counterpart, then can that translate back into implications for the original? In other words, we can maybe consider a notion of equivalency that “respects logical implication.”

Incidentally, before we try that: there’s no issue in terms of defining equivalent axiomatizations formally in the case of groups, if we are already given the definitions for division in terms of multiplication and vice versa. In other words, we can actually consider ourselves to be working in a “superset” language that includes inverse, identity, multiplication, and division, and we already have certain first-order axioms relating these (which essentially define some operations in terms of others); within this language and with these axioms already part of the axiomatization, we can then formally define other sets of axioms/statements in the “superset” language being equivalent (where equivalency is with respect to to the set of “natural defining relations.”) Thus, we don’t have an issue of axiomatizations in different languages; we can consider these to all be part of the same language.

We can do this in general too. If we have two languages, we can simply consider the language that is the union of both, possibly impose some defining relations (or even just conditions), and then ask about equivalency of sets of axioms within this “superset” language. Then, equivalency in a specific case would be better defined by this general formal framework and a specific choice of defining conditions.

But back to our question: if we have two sets and a bijection between them, can we construct a definition of the notion of that bijection respecting logical implication?

Well, what does it mean to “respect logical implication?” It means that we can use information about one “data format” to yield information about the other. In other words, if some statement is true about one “format,” then it is true of the other.

But this connects back exactly to isomorphisms! Specifically, any logical statement must be built up of some symbols. Depending on the symbols present, if the bijection preserved those symbols (in other words, if it’s an isomorphism with respect to those symbols), then the encoding will preserve the statement!

Actually, in general this isn’t even isomorphism — it’s elementary equivalence. Specifically, we can say that our condition is: if we have a language L, and sets A and B which are models of L, then we want a bijection b: A —> B such that any statement in L is true of A if and only if the corresponding statement in B with each variable x from A replaced by b(x) is also true.

Given the examples of real-closed fields not isomorphic to real numbers, we suspect that this is a more general (weaker) condition than isomorphism.

Actually, though, is it, if we’re requiring a bijection anyway?

We can even generalize from this to consider non-bijections, in a similar way that we can go from isomorphisms to homomorphisms. (Of course, the bijection would be needed to correspond informally to equivalence.) In fact, we can even name our terms by “isomorphism” and “homomorphism.” We thus have two definitions:

Definition. Let L be a language and A and B be models of L. A logical homomorphism is a function f: A —> B such that for any statement in L, that statement is true of A if and only if the corresponding statement is true in B, with each variable x in A replaced by f(x).

Definition. The previous definition then corresponds to a logical isomorphism: written out, let L be a language and A and B be models of L. A logical isomorphism is a bijection f: A —> B such that for any statement in L, that statement is true of A if and only if the corresponding statement is true in B, with each variable x in A replaced by f(x).

These terms are of course non-standard, but they do align well with the standard idea of a homomorphism being a structure-preserving function (where here the structure is logical truth in a language), and an isomorphism being a structure-preserving bijection. In fact, this analogy works well formally too: we can define a category in the natural way based off of logical homomorphisms.

Note then that logical homomorphisms are strictly more general than isomorphisms of the language symbols — but this is clearly implied by terminology too. Is it true that logical isomorphisms are the same as isomorphisms of language symbols? And what about homomorphisms?

Before we address that, note that for logical isomorphism, if f is an isomorphism from A to B then the inverse of f is an isomorphism from B to A, and also compositions of isomorphisms are isomorphisms (in fact, compositions of homomorphisms are homomorphisms.) This is of course a special case of morphisms in general categories, as well as a special case in the vast theory of many different kinds of isomorphisms. (Although I’m not sure if or how isomorphisms can fit into the category-theoretic framework.)

Let’s go back to the question at hand. In fact, let’s compile a list of similar questions/ideas:

1. Are logical homomorphisms equivalent to homomorphisms of the language?

2. Are logical isomorphisms equivalent to isomorphisms of the language?

3. Is existence of logical homomorphism equivalent to elementary equivalence, or (in general, beyond first order) statements in A being true if and only if they are true in B? (The difference between these two conceptually is in variables — “statements in A being true if and only if they are true in B” is defined for the scope of statements without free variables, whereas logical homomorphisms are defined to preserve truth even for statements with free variables (where the variables are replaced according to the function.) In particular, existence of logical homomorphism clearly implies non-free-variable-statement equivalence. What we are asking here, then, is: is the other direction true as well?)

Incidentally, we will call non-free-variable statement equivalence “elementary equivalence” regardless of the logic. We can also call a homomorphism preserving symbols a “symbolic homomorphism,” and we can define a “symbolic isomorphism” analogously. So conceptually, we can form the following chain of concepts from weaker to stronger:

  • Elementary equivalence
  • Logically homomorphic
    • Symbolically homomorphic
  • Logically isomorphic
    • Symbolically isomorphic

We know the strong/weak implications non-strictly; what we are now asking is: which parts of this chain are strict and which parts are actually equivalent? (Equivalence could possibly be informally surprising.)

Let’s dive into this. First, we attempt to show that a logical homomorphism must preserve symbols too. This would also imply that logical isomorphisms and symbolic isomorphisms are equivalent, since a logical isomorphism would be a bijection that by the result on homomorphisms also preserves symbols.

But this is clear when we write it out! For example, say we have a binary operation of multiplication. We can form the statement

ab = c.

Since A is closed under multiplication, for any values substituted for a and b there will exist a c. Now, f is a logical homomorphism, so we’d have

f(a)f(b) = f(c) = f(ab).

And this can be written for any values of a and b, thus f preserves multiplication! In an exactly similar way, f preserves each operation in the language. Assuming for now we’re working with first-order logic, f will similarly preserve relations: for any relation R (say for the sake of writing it out that R is binary, although the logic works for R of any arity), since f is a logical homomorphism we have that R(a, b) is equivalent to R(f(a), f(b)), thus f preserves R. It follows that f is a symbolic homomorphism. Consequently, the notions of “logical” and “symbolic” as we’ve defined them coincide, so there’s no need to distinguish them further.

OK, then, a follow-up question: what about homomorphisms and elementary equivalence? Clearly, elementary equivalence doesn’t imply isomorphism, as we can see with examples of real-closed fields like the real algebraic numbers. And we see conceptually that elementary equivalence only involves bound variables, while homomorphisms involve free ones too. But does elementary equivalence imply existence of homomorphism?

To answer this, we can provide a counterexample, given by https://nihalduri.com/2022/11/22/field-automorphisms-of-the-real-numbers/. Any function f from R to R that preserves addition, multiplication, and 1 must be the identity, even without assuming f is a bijection. Thus, the only possible field homomorphism from R to R is the identity. But now the field of real algebraic numbers is elementary equivalent to the real numbers, and there can exist no field homomorphism from R to the real algebraic numbers, since such a homomorphism would need to have image within R, hence must be the identity, but there exist real non-algebraic numbers so such a function can’t take values that “stay within” the real algebraic numbers. QED.

So the distinction between statements with just bound variables and all statements is an important one here, and it shows why the concept of elementary equivalence is different from the concept of “logical truth preserving” maps, aka homomorphisms and isomorphisms.

EDIT: rewrote a little bit above for better clarity, and a few additions here.

On the topic of equivalence, we have two notions:

1. Elementary equivalence: the truth of all bound statements is equivalent

2. Isomorphism: the truth of all statements including with free variables is equivalent upon replacement by the isomorphism image

Clearly, the second notion is stronger than the first.

Now, we had discussed homomorphisms before; where do they fit into this? We can recall representation theory and note that homomorphisms can already yield information even without isomorphisms.

Also, while I am sure of my work above with isomorphisms and elementary equivalence, I’m not sure if I made a mistake in implicitly assuming bijectivity when stating results with homomorphisms. Let’s go over that again, in more detail.

Let’s say we have a homomorphism (in the usual sense, so in our terminology a symbolic homomorphism) from A to B. In representation theory, we use this setup to apply information about B to yield information about A. (Typically A is a “less familiar” setting like an arbitrary group, while B is a “more familiar” setting like a group of matrices over a field.) So in other words, given a homomorphism f from A to B, and given a statement in B (which presumably could involve free variables, otherwise the homomorphism mapping is useless), we should be able to derive a corresponding statement in A. But f isn’t invertible, so what would we replace free variables by?

Well, if f is surjective, then for any value in B we can pick some value in A that maps to it, and then use that as a replacement for free variables in B. So let’s assume we have a surjective homomorphism. Is it the case that any statement in B yields a true statement in A when we pick a replacement of free variables? And if the representation isn’t injective (which we’re assuming for now since we’re dealing with the case that we don’t have an isomorphism), then how do we decide which element in A we should pick to correspond to a given element of B? If f is not injective then it seems in general there’d be no way to do this without the Axiom of Choice, so then we’d expect every possible replacement to yield a true statement in A. But that seems suspect … is that even true?

And if f isn’t even surjective, then the replacement definitely can’t happen. Then, this discussion reduces to replacing B by the subset of B that is the image of f. If f is injective then this is an isomorphism, which we’re not considering here, so we assume regardless without loss of generality that f is surjective but not injective. It seems though then that the “reverse truth” implication from B to A wouldn’t hold.

This actually casts doubt for me on representation theory, or at least on the motivation that is given informally in the beginning of the Wikipedia article. If we have an “unfamiliar” set A and a “familiar” set B, wouldn’t we want a representation of A to mean a homomorphism from B to A as opposed to from A to B?

Also, by our definition of a logical homomorphism, if the reverse truth implication here doesn’t hold in general then wouldn’t be the case that any symbolic homomorphism is a logical homomorphism.

Actually, in an attempt to better clarify this, we can expand our definition of logical homomorphism a bit and introduce some more definitions. Our definition above required that a statement in A is true if and only if the corresponding one in B was true — but as we’ve seen, without an isomorphism, the notion of “corresponding” for the “only if” part can be suspect, and though we can technically define it if we restrict to the image of f in B, such a definition probably wouldn’t correspond to typical desired applications.

Instead, we can define:

Definition. A logical homomorphism is a function f from A to B such that, for every true statement in A, the corresponding statement in B — with free variables replaced by their images under f — is true.

With this definition, it’s clear that symbolic and logical homomorphisms are equivalent, using the free-variable statement (and analogous such statements) ab = c.

Thus, this is the definition of “logical homomorphism” — really, an alternative equivalent definition of homomorphism — that makes sense.

Is it the case though that homomorphisms do satisfy a “reverse truth” implication? It seems that in general they wouldn’t, so let’s prove this now — it seems we could quickly find a counterexample.

So assume we have a homomorphism that’s subjective but not injective. A trivial example would be a group homomorphism from a group G to {e}, given by f(x) = e. This is clearly surjective, and in fact every possible element in G could be a replacement for a free variable in B = {e}. We can easily come up with a statement that is true in B, like x1 = x2 (where x1 and x2 are both free), but where the corresponding replacements in A are not at all true — for this statement, they won’t be true unless A is trivial. So this is a clear counterexample.

We can even come up with a less trivial counterexample. Say we have a group homomorphism from Z6 to Z3 (additive groups) by f(x) = x mod 3. Thus, f fixes 0, 1, and 2, but it assigns f(3) = 0, f(4) = 1, f(5) = 2. Now, what’s a statement that could be true in B? Again, x1 = x2 (where x1 and x2 are both free) works, since 6 is greater than 3 (in fact, that statement works for any case where the homomorphism is between finite sets and from a larger to a smaller set — which it would have to be if it’s surjective but not injective.) Another example of a statement that works would be x^3 = 0 (in group multiplication notation, but the actual example groups are additive), since this is true of every element in Z3, but not in Z6. Thus, we have many counterexamples at hand to disprove the reverse truth claim for homomorphisms.

The conclusion of this work can be summarized as the following insights:

1. Equivalence of axiomatic systems is an informal notion, beyond just bijections. “More equivalence” beyond this is an informal notion that corresponds formally to bijections that respect a certain kind of structure, so really this boils down to isomorphisms.

2. Logical truth equivalence is dependent on the symbols making up the logical statements, and hence on the language.

3. There are two notions of logical truth equivalence: elementary equivalence, which just concerns bound variable statements, and isomorphisms, which concerns all statements, including those with free variables. Statements involving free variables need those variables to be replaced to make sense in the new domain, and this replacement is defined via the isomorphism function.

4. We thus have two different ways of understanding the “ultimate” equivalence that is given by isomorphisms (“ultimate” being used informally) — whether the equivalence of data formats where the equivalence respects some structure, or the equivalence of truth of all statements, including those with free variables.

5. Homomorphisms are equivalent to statement implication from one domain to the other, but not necessarily the other way around (unless they are in fact isomorphisms.) (This means all statements, including those with free variables, which again would be replaced by the homomorphism function.)

6. For a surjective homomorphism, it is technically possible to define the notion of a statement from the second domain implying a corresponding statement from the first domain, but surjective homomorphisms don’t actually satisfy this in general (unless of course they are injective and thus isomorphisms.)

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.