8-valued (+raw binary octuple) logic : good idea?
Moderator: Alyrium Denryle
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
8-valued (+raw binary octuple) logic : good idea?
This is the kind of thing I'd usually talk about on a specialist technical forum, but I had a notion that people here might be interested in what I do all day, and this is sufficiently non-AI-specific that the mathematicians here might have good suggestions for me. I'm good at systems architecture, algorithm design and micro-optimisation, but my knowledge of formal logic is limited to what my compsci theory classes covered, plus what I've picked up from assorted books and papers.
Anyway, I recently restructured the underpinnings of the AI system I'm working on, which spends most of its time doing abstract reasoning about software systems. This used to consist of five basic layers;
* The model level. This consists of assorted data structures that model some part of reality, or a hypothetical situation.
* The Boolean inference level. This constructs a 'support graph', which marks parts of the model level as 'proven true' or 'proven false' and keeps track of which inference rules were used to prove what. In modelling external reality this is mainly used to detect and discard incoherent propositions, but in modelling executing code you can prove a lot of things directly true or false. Boolean logic is fast at the proposition level (particularly because with bit-level DNF tables you can trade space for time and avoid branching in medium-complexity expressions) but of limited power and often slow in practice because it's hard to predict which inference paths will produce something useful.
* The defeasible inference level. Defeasible inference basically means using heuristics as if they were Boolean inference rules. Inevitably this produces conflicting results; usually an obvious general case being overridden by a less-obvious special case. The classic (i.e. used in most of the literature) solution to this is to give the rules simple integer priorities; high-priority rules overrule low-priority ones (Boolean-equivalent rules have infinite priority). This renders the logic 'non-monotonic', which means that the sets of true and false statements can shrink as well as grow (in Boolean logic they can only grow as inference proceeds). This complicates inference control considerably. On the plus side very naive inference control approaches work fairly well on defeasible logic.
* The probabilistic inference level. This implements the generalised probability calculus (i.e. Bayesian reasoning; non-naive but obviously non-complete). Most of the actual predictions, save proving things about code, are made via probabilistic inference, but it's relatively slow, complicated and space-hungry. Probabilistic reasoning, as you'd expect, assigns every proposition (usually, a constraint on a model's state history) a probability. Most of the actual work is done by probability distributions over sets of exclusive states. The various so-called 'deficiencies' of Bayesian reasoning are (AFAIK) fully covered by assorted kinds of reflection (meta-probabilistic reasoning).
* The cognitive utility level. This doesn't map as neatly onto the lower levels as the previous three. It isn't relevant to my main point here, so I'll just say that the system keeps track of all resource usage (memory and processing time) and allocates it based on expected utility.
This model is theoretically neat because it is essentially the inferential path the system goes down when attempting to reproduce its own source code from first principles (getting this working properly/fully is a major research goal). However, in practice the defeasible layer wasn't really pulling its weight. It isn't that much less expensive to evaluate in practice than the probabilistic layer; the highly non-monotonic nature of rule overriding makes its results inconsistent under limited time and tough for the cognitive-EU system to predict; and it's hard for the system to automatically generate good rule priorities without doing masses of trial and error, even when working from a probabilistic template (because in practice defeasible reasoning tends to take a rather different path through the inference space compared to probabilistic logic, and doing a naive conversion without a detailed overriding analysis tends to produce worse results than just using probabilistic reasoning in the first place).
The new design folds the Boolean and defeasible layers into one and incorporates one of the more commonly used kinds of Boolean meta-inference (which should keep Gödel happy if nothing else). The basic logic system has eight values;
TRUE - Defined as true or proven true by Boolean inference.
FALSE - Defined as false or proven false by Boolean inference.
UNKNOWN - No information about this proposition yet.
INCONSISTENT - Proven both true and false by Boolean inference (i.e. in the 'fallout plume' of two or more inconsistent axioms).
ASSUMED_TRUE - Implied true by defeasible inference.
ASSUMED_FALSE - Implied false by defeasible inference.
CONFLICTED - Implied both true and false by defeasible inference.
UNPROVABLE - Proven to be unreachable (undeterminable) with the currently available Boolean axioms and inference rules.
Adding decent inconsistency tolerance seems to be a good solution to using defeasible logic only where beneficial, before switching over to probabilistic logic; we can continue to prove some things despite inconsistency, and when that fails we can backtrack to the troublesome axioms and selectively switch to probabilistic in the interesting bits of the inference tree. As a bonus, this works on Boolean logic with broken axioms, as opposed to blowing up and forcing an external 'fix this' routine to kick in.
Each of those gets assigned a bit in a byte, so that I can select any combination of logic values with a simple bitmask while still storing the values in bytes (which booleans took anyway unless custom-packed; the new values take 3 bits packed uncompressed or 2.2 bits packed with bitstream compression). There's a 100% effective cache hit over three-value logic (underspecified Boolean) in packed lookup tables (because it isn't usually practical to pack more than 8 values to the word), but that's tolerable given that most of the marginal utility of defeasible reasoning is now included for free.
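To make that concrete, here's a minimal sketch of the encoding idea; the identifier names and exact bit assignments are purely illustrative, not the actual constants:

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative sketch only: one bit per canonical logic value, so that any
       set of values can be selected with a single bitmask test. */
    enum {
        LV_TRUE          = 1 << 0,
        LV_FALSE         = 1 << 1,
        LV_UNKNOWN       = 1 << 2,
        LV_INCONSISTENT  = 1 << 3,
        LV_ASSUMED_TRUE  = 1 << 4,
        LV_ASSUMED_FALSE = 1 << 5,
        LV_CONFLICTED    = 1 << 6,
        LV_UNPROVABLE    = 1 << 7
    };

    /* e.g. select propositions proven either way by Boolean inference */
    #define LV_PROVEN (LV_TRUE | LV_FALSE)

    int main(void)
    {
        uint8_t value = LV_ASSUMED_TRUE;
        printf("proven? %s\n", (value & LV_PROVEN) ? "yes" : "no");
        return 0;
    }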
Of course this representation would be horribly cumbersome for actual calculations (think 'nasty rat's nest of ifs'). Thus I also have a 'raw representation', which consists of a vector of eight boolean flags representing the information we have about a proposition;
IS_TRUE
IS_TRUE_VIA_INCONSISTENT
IS_FALSE
IS_FALSE_VIA_INCONSISTENT
ASSUMED_TRUE
ASSUMED_TRUE_VIA_CONFLICTED
ASSUMED_FALSE
ASSUMED_FALSE_VIA_CONFLICTED
These are orthogonal in the sense that they can all be set independently and all 256 possible values are legal, but obviously this representation is ambiguous since there are only 8 logically distinct states. The conversion is slightly lossy; firstly there's no raw equivalent of UNPROVABLE (it's equivalent to UNKNOWN: no flags). Secondly the raw representation tracks more information about inconsistency than the canonical representation. Within a single expression, if the result is independently proven true/false via a consistent source and proven true/false via an inconsistent/conflicted source, then it comes out as true/false. If it only has inconsistent sources, or if any kind of true+false combination comes up, it becomes inconsistent. The canonical representation discards information about exactly why a given value is inconsistent; this info could in principle speed up post-inconsistency-resolution inference tree cleanup slightly, but it's trivial to re-derive in most practical cases and the ability to select sets of propositions with a subset of the canonical values by bitmask (gained by assigning them non-overlapping values within a byte) is much more useful.
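For concreteness, a rough sketch of the collapse from raw flags back to a canonical value, following my reading of the rules above (bit positions, names and the exact precedence are illustrative, not the real code):

    #include <stdint.h>
    #include <stdio.h>

    /* Raw flags, one bit each (bit positions illustrative). */
    enum {
        R_IS_TRUE             = 1 << 0,
        R_IS_TRUE_VIA_INC     = 1 << 1,
        R_IS_FALSE            = 1 << 2,
        R_IS_FALSE_VIA_INC    = 1 << 3,
        R_ASSUMED_TRUE        = 1 << 4,
        R_ASSUMED_TRUE_VIA_C  = 1 << 5,
        R_ASSUMED_FALSE       = 1 << 6,
        R_ASSUMED_FALSE_VIA_C = 1 << 7
    };

    typedef enum {
        UNKNOWN, TRUE_, FALSE_, INCONSISTENT,
        ASSUMED_TRUE_, ASSUMED_FALSE_, CONFLICTED
        /* UNPROVABLE cannot be produced from raw flags */
    } canonical;

    /* Collapse raw flags to a canonical value: consistent sources overrule
       via-inconsistent ones, any true+false combination is inconsistent,
       and Boolean-level information takes precedence over defeasible. */
    static canonical collapse(uint8_t r)
    {
        int t = (r & R_IS_TRUE) != 0, f = (r & R_IS_FALSE) != 0;
        if (t && f) return INCONSISTENT;
        if (t) return TRUE_;
        if (f) return FALSE_;
        if (r & (R_IS_TRUE_VIA_INC | R_IS_FALSE_VIA_INC)) return INCONSISTENT;

        int at = (r & R_ASSUMED_TRUE) != 0, af = (r & R_ASSUMED_FALSE) != 0;
        if (at && af) return CONFLICTED;
        if (at) return ASSUMED_TRUE_;
        if (af) return ASSUMED_FALSE_;
        if (r & (R_ASSUMED_TRUE_VIA_C | R_ASSUMED_FALSE_VIA_C)) return CONFLICTED;
        return UNKNOWN;
    }

    int main(void)
    {
        printf("%d\n", collapse(R_IS_TRUE | R_IS_FALSE_VIA_INC)); /* TRUE_: consistent wins */
        printf("%d\n", collapse(R_IS_TRUE | R_IS_FALSE));         /* INCONSISTENT */
        return 0;
    }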
Anyway, the raw representation allows for fast expression processing via simple bitwise operations, because of the way it accumulates information about a proposition during expression evaluation. Preliminary microbenchmarks suggest that this will be less than 50% slower than simple boolean and something like four times as fast as the old defeasible implementation. At some point I'll probably optimise this to use SSE2 bitwise operations, though I doubt the gain will be as significant as using SSE to vectorise the probabilistic logic.
Enough about implementation, does this logic system look sane to theory types?
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Oh, forgot to add: I haven't implemented it yet, but as far as I can see this logic is wholly sufficient for doing tableau analysis of propositions in place without spawning a dedicated Boolean support subgraph, which is a significant win (subject to multithreading issues, but then everything interesting is subject to multithreading issues).
- Darth Wong
- Sith Lord
- Posts: 70028
- Joined: 2002-07-03 12:25am
- Location: Toronto, Canada
- Contact:
I wonder if there are any programmers around here who are serious enough about this type of work to respond. I know I'm certainly nowhere near your level of programming ability and so I wouldn't even attempt to comment on something like this, other than "looks neat".
"It's not evil for God to do it. Or for someone to do it at God's command."- Jonathan Boyd on baby-killing
"you guys are fascinated with the use of those "rules of logic" to the extent that you don't really want to discussus anything."- GC
"I do not believe Russian Roulette is a stupid act" - Embracer of Darkness
"Viagra commercials appear to save lives" - tharkûn on US health care.
http://www.stardestroyer.net/Mike/RantMode/Blurbs.html
"you guys are fascinated with the use of those "rules of logic" to the extent that you don't really want to discussus anything."- GC
"I do not believe Russian Roulette is a stupid act" - Embracer of Darkness
"Viagra commercials appear to save lives" - tharkûn on US health care.
http://www.stardestroyer.net/Mike/RantMode/Blurbs.html
Tri-value logic (true, false, undefined/null) is used heavily in databases, and almost anywhere else which needs to store non-trivial data structures. But it's the undefined value which is the important part, as it allows empty references to somewhere else while being distinct from actual legal values.
In a sufficiently complex system, you generally end up getting something which looks fairly similar to the first list of states, but tailored for the problem domain. Lots of people have problems with boolean logic; tri-value/ternary boolean logic sends a frightening number of people running for the hills.
I don't know enough theory to decide on the usefulness of such a setup, but it is damn easy to make an ad-hoc duplicate with only mildly complicated business rules.
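To illustrate what that third value does (this is just textbook Kleene/SQL-NULL-style three-valued logic, nothing from Starglider's system):

    #include <stdio.h>

    typedef enum { TV_FALSE, TV_UNKNOWN, TV_TRUE } tv;  /* SQL-NULL style third value */

    /* Kleene strong AND: a definite FALSE dominates, otherwise UNKNOWN propagates. */
    static tv tv_and(tv a, tv b)
    {
        if (a == TV_FALSE || b == TV_FALSE) return TV_FALSE;
        if (a == TV_UNKNOWN || b == TV_UNKNOWN) return TV_UNKNOWN;
        return TV_TRUE;
    }

    int main(void)
    {
        printf("%d\n", tv_and(TV_TRUE, TV_UNKNOWN));   /* 1 = UNKNOWN: not decidable yet */
        printf("%d\n", tv_and(TV_FALSE, TV_UNKNOWN));  /* 0 = FALSE: one false suffices */
        return 0;
    }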
"Okay, I'll have the truth with a side order of clarity." ~ Dr. Daniel Jackson.
"Reality has a well-known liberal bias." ~ Stephen Colbert
"One Drive, One Partition, the One True Path" ~ ars technica forums - warrens - on hhd partitioning schemes.
"Reality has a well-known liberal bias." ~ Stephen Colbert
"One Drive, One Partition, the One True Path" ~ ars technica forums - warrens - on hhd partitioning schemes.
Of course, let me preface by saying: I understand very little of this. But the fact that you can't injectively map the actual truth values of propositions into the vector you've set up to describe propositions seems like a problem. Is there a way to adjust your second list of eight values so that the first list maps into the second list?
A Government founded upon justice, and recognizing the equal rights of all men; claiming higher authority for existence, or sanction for its laws, that nature, reason, and the regularly ascertained will of the people; steadily refusing to put its sword and purse in the service of any religious creed or family is a standing offense to most of the Governments of the world, and to some narrow and bigoted people among ourselves.
F. Douglass
- Terralthra
- Requiescat in Pace
- Posts: 4741
- Joined: 2007-10-05 09:55pm
- Location: San Francisco, California, United States
Darth Wong wrote: I wonder if there are any programmers around here who are serious enough about this type of work to respond. I know I'm certainly nowhere near your level of programming ability and so I wouldn't even attempt to comment on something like this, other than "looks neat".

I am a fairly competent programmer, and this is still way out of my areas of expertise. I'm not sure if it's that Starglider's focus areas are so distinct from mine, or just that he's so much better than me.
I'm not sure your vector[8] is really orthogonal, though. Setting IS_TRUE and IS_FALSE, for example, ought not be possible. IS_TRUE_VIA_INCONSISTENT and IS_FALSE_VIA_INCONSISTENT aren't mutually exclusive, though, so that I can understand.
If, as Xon mentioned, you used a Ternary system, you could set IS_TRUE and/or IS_FALSE to be undefined while the heuristics are going through the infeasible algorithms, but 8 trit words seems like it would be an even more substantial performance hit.
I guess what I'm really getting at is, in the kind of multi-dimensional system you're setting up, trying to boil it down to a vector of boolean flags seems bad.
- Ariphaos
- Jedi Council Member
- Posts: 1739
- Joined: 2005-10-21 02:48am
- Location: Twin Cities, MN, USA
- Contact:
Re: 8-valued (+raw binary octuple) logic : good idea?
I can only really comment on your implementation, since my logic isn't advanced enough to see any real flaws. I'm not sure why you're limited to eight values either, at least on a modern computer with 32 or even 64-bit wording, but anyway...

Starglider wrote: IS_TRUE
IS_TRUE_VIA_INCONSISTENT
IS_FALSE
IS_FALSE_VIA_INCONSISTENT
ASSUMED_TRUE
ASSUMED_TRUE_VIA_CONFLICTED
ASSUMED_FALSE
ASSUMED_FALSE_VIA_CONFLICTED
Here you have a situation where, if you find something is conflicted or inconsistent, you need to use an additional operation to assign the value.
If, instead, you used four bits to store the states
Bit 0: Boolean FALSE
Bit 1: Boolean TRUE
Bit 2: Assumed FALSE
Bit 3: Assumed TRUE
You can let INCONSISTENT = 0x3, CONFLICTED = 0xC, UNKNOWN=0, and still have bit density left for UNPROVABLE (0xF?).
You would then be able to construct standard boolean functions out of it, assuming you filtered out unprovables.
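Roughly like this, I mean (just a sketch of the suggestion; the boolean-view filtering at the end is one way it could be done):

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch of the suggested nibble encoding (not Starglider's actual scheme). */
    enum {
        NB_BOOL_FALSE    = 0x1,                                 /* bit 0 */
        NB_BOOL_TRUE     = 0x2,                                 /* bit 1 */
        NB_ASSUMED_FALSE = 0x4,                                 /* bit 2 */
        NB_ASSUMED_TRUE  = 0x8,                                 /* bit 3 */
        NB_UNKNOWN       = 0x0,
        NB_INCONSISTENT  = NB_BOOL_TRUE | NB_BOOL_FALSE,        /* 0x3 */
        NB_CONFLICTED    = NB_ASSUMED_TRUE | NB_ASSUMED_FALSE,  /* 0xC */
        NB_UNPROVABLE    = 0xF                                  /* spare dense value */
    };
    #define NB_BOOL_MASK (NB_BOOL_TRUE | NB_BOOL_FALSE)

    int main(void)
    {
        uint8_t v = NB_BOOL_TRUE | NB_ASSUMED_FALSE;  /* proven true, heuristically false */
        if (v == NB_UNPROVABLE || (v & NB_BOOL_MASK) == NB_INCONSISTENT)
            printf("filtered out\n");
        else if (v & NB_BOOL_MASK)
            printf("boolean view: %s\n", (v & NB_BOOL_TRUE) ? "true" : "false");
        else
            printf("no boolean information\n");
        return 0;
    }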
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Surlethe wrote: Is there a way to adjust your second list of eight values so that the first list maps into the second list?

The UNPROVABLE value is a bit of an anomaly. There's an injective mapping from the first seven canonical values onto the raw values (the vectors are 10000000, 00001000, 00000000, 01000100, 00100000, 00000010 and 00010001). In principle I could add a ninth bit to preserve 'unprovable' status, or I could do constant redundancy elimination to narrow the set of 'valid' raw values and assign 'unprovable' to the remainder (much like 'not a number' is assigned to all the otherwise-illegal IEEE 754 floating point bit patterns). But this would be a lot of hassle for no tangible gain, because the two encodings perform quite different functional roles. The raw encoding is essentially a convenience to make Boolean expression evaluation faster. As an input to a standard, non-reflective expression we want UNPROVABLE to act just like UNKNOWN, which it does under this scheme. Meanwhile the UNPROVABLE value is also a convenience; it has no role in direct inference, it just utilises a spare value (in one-byte independent or three-bit compact form) in the logic encoding to store a piece of information that would otherwise be accessible only through the reflection API. This is an efficiency win for simple reflective inference (i.e. some kinds of tableau reasoning).
You're right to point out that UNPROVABLE seems inconsistent/pointless/redundant without a definition of the additional (reflective) logic that actually uses it. When I finalise the formal spec for this I'll add a note to that effect.
Terralthra wrote: I'm not sure your vector[8] is really orthogonal, though. Setting IS_TRUE and IS_FALSE, for example, ought not be possible.

The key word there is 'ought'. If you enter '1+1=2' and '1+1=3' as axioms, a huge swath of mathematical statements now become simultaneously true and false (technically all of them if you're not being explicitly paraconsistent, but in practice directed inference does a reasonable job of being 'informally paraconsistent' - this sacrifices monotonicity but not as much as classic defeasible logic). Normally what happens is that a TRUE proposition gets updated by a FALSE value as a result of inference. The update operation is logically equivalent to converting the existing and updating logic values into raw values, combining them with a bitwise OR operation, then converting that back into a canonical value - thus the result is that it gets updated to INCONSISTENT. However for compound inference rules you can get IS_TRUE and IS_FALSE appearing within a single function (usually when it has been parallelised at the bit level - something this system tries to do wherever possible).
So anyway, having IS_TRUE and IS_FALSE set at once is legal; it maps to an INCONSISTENT output.
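Purely as an illustration of that update (the flag values here are arbitrary, not the real encoding):

    #include <stdint.h>
    #include <stdio.h>

    /* Arbitrary illustrative bits; only the true/false pair matters here. */
    enum { R_IS_TRUE = 1 << 0, R_IS_FALSE = 1 << 2 };

    int main(void)
    {
        uint8_t existing = R_IS_TRUE;            /* previously proven true */
        uint8_t incoming = R_IS_FALSE;           /* a later inference proves it false */
        uint8_t combined = existing | incoming;  /* the update is just a bitwise OR */

        /* both a true flag and a false flag are now set */
        if ((combined & R_IS_TRUE) && (combined & R_IS_FALSE))
            printf("collapses to INCONSISTENT\n");
        return 0;
    }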
Terralthra wrote: IS_TRUE_VIA_INCONSISTENT and IS_FALSE_VIA_INCONSISTENT aren't mutually exclusive, though, so that I can understand.

The key word there is 'via'. The VIA_INCONSISTENT flags track inconsistency in input when evaluating an inference rule. This is a deliberate choice on my part to be slightly more tolerant of inconsistency than pure Boolean logic dictates; if a statement has been proven true or false from a thought-to-be-consistent set of axioms, but has been proven true /and/ false from a known-to-be-inconsistent set, I ignore the inconsistent result and go with the consistent one (because IS_TRUE and IS_FALSE overrule their VIA_INCONSISTENT equivalents). However this only works within a single inference step, because the VIA_INCONSISTENT and VIA_CONFLICTED info (i.e. what the value would be if we ignored all currently known inconsistency) isn't (usually) stored in the main graph.
I admit that this makes the logic less pleasant from a theoretical point of view, because an important aspect of the behaviour (i.e. how fast and far INCONSISTENT and CONFLICTED propagate) depends on when you do raw/canonical conversions. If that was causing a problem, I'd store logic values as a 16-bit value that combines a raw value and a canonical value, with the latter always synchronized with the former after an inference step.
In practice however it works quite well (or rather, it seems to so far and I predict that it will continue to do so based on the huge amount of non-monotonicity the old defeasible reasoning generated and coped with), because the propagation behaviour of INCONSISTENT and CONFLICTED only impacts performance, not the actual conclusions. Essentially, as soon as an INCONSISTENT or CONFLICTED value appears, you know the whole thing is broken and you'll have to come and clean up before you can conclude anything. In practice, 'cleaning up' means invoking some combination of probabilistic and reflective reasoning - which are considerably (typically at least an order of magnitude) slower. The value of inconsistency tolerance, or rather the limited amount of inconsistency tolerance that I've implemented, is that you can generate more useful information with fast inference before having to do that, which raises performance. The intent here is to do what the old defeasible logic did (which wasn't suitable for deriving actual conclusions either) but more efficiently.
Terralthra wrote: If, as Xon mentioned, you used a Ternary system, you could set IS_TRUE and/or IS_FALSE to be undefined while the heuristics are going through the infeasible algorithms,

IS_TRUE being unset doesn't mean we know the proposition is false, it just means that we don't know that it's true. If you just used the set of values TRUE, FALSE and UNKNOWN (and halt-catch-fire if and when an INCONSISTENT result comes up), then you'd be using classic Ternary logic. In fact the old Boolean layer did this, because it was underspecified, which does need three values whenever you have a vector/matrix representation.
Terralthra wrote: but 8 trit words seems like it would be an even more substantial performance hit.

There isn't any value to making the raw flags tri-state, because their states are effectively 'true' and 'unknown', not 'true' and 'false'. The 'false' state is represented by their complementary flag being set. There are two major advantages of using two complementary sets of four binary flags instead of a single set of four tri-state flags. Firstly this is easier to visualise because it maps directly onto machine bytes, as opposed to the minor but irritating obfuscation involved in mapping trits onto pairs of bits. Secondly, we can encode inconsistent inputs and outputs directly and inconsistency propagates through an expression during evaluation automatically. You can visualise this as a pair of quad-state (t, f, unknown, inconsistent) flags if you like, one for Boolean, one for assumed (basically defeasible logic with only two priority levels), but that hides what's going on at the bit level. With a strict tri-state representation you'd have to put in explicit if checks all over the place to detect inconsistency, since there's no way to pass an inconsistent result through to further processing (like a NaN floating point value does). This is bad enough if you're just halting and catching fire at the first sign of inconsistency, which the old Boolean reasoner did, but it's much worse if you want to implement inconsistency tracking and tolerance.
Terralthra wrote: I guess what I'm really getting at is, in the kind of multi-dimensional system you're setting up, trying to boil it down to a vector of boolean flags seems bad.

I don't know where that intuition comes from. The vector representation simply parameterises the information content of the canonical logic states (with the exception of UNPROVABLE as noted above, which is an intrusion of first-level reflection into the basic reasoning layer). Parameterising complex state into a set of simple variables is a standard technique in programming and modelling in general.
- Terralthra
- Requiescat in Pace
- Posts: 4741
- Joined: 2007-10-05 09:55pm
- Location: San Francisco, California, United States
Ahhhh. I was coming at this from a different perspective. In the area I work with (computer modeling of phonological and linguistic analysis), boolean values most typically represent T/F values, presence or absence. Think Chomsky's distinctive features - a speech sound might be classified according to +/- high and +/- low, and it's possible to have -high -low, but +high +low would be impossible.
Having the bools be true/unknown instead changes a lot.
I'll reread a bit with that in mind, cogitate, and probably post again later this weekend.
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Terralthra wrote: Ahhhh. I was coming at this from a different perspective. In the area I work with (computer modeling of phonological and linguistic analysis), boolean values most typically represent T/F values, presence or absence.

Good point. I said 'vector of eight Boolean flags' when I should have said 'vector of eight binary flags' (when describing the raw representation).
If nothing else this will help me to avoid making such mistakes in the formal spec. document.
Terralthra wrote: Think Chomsky's distinctive features - a speech sound might be classified according to +/- high and +/- low, and it's possible to have -high -low, but +high +low would be impossible.

Which is fine in normal programming, because if an illegal value comes up something has gone horribly wrong and you want to halt as soon as possible. Letting errors propagate for a while, in a controlled way, is a technique of rather specialist utility. This is why the mere existence of the 'Not A Number' value in the IEEE 754 floating point standard causes so much controversy; it can cause programs to fail silently and in unexpected ways (usually due to unguarded divide-by-zeros), while only a few users really need it. Of course my code does use it, as an error value in probabilistic reasoning.
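The classic silent failure mode, sketched:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double zero = 0.0;
        double x = zero / zero;       /* unguarded divide-by-zero -> quiet NaN, no trap */
        double y = x * 100.0 + 5.0;   /* the NaN silently propagates through later maths */
        printf("%f (isnan: %d)\n", y, isnan(y) ? 1 : 0);
        return 0;
    }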
- Ariphaos
- Jedi Council Member
- Posts: 1739
- Joined: 2005-10-21 02:48am
- Location: Twin Cities, MN, USA
- Contact:
Terralthra wrote: Ahhhh. I was coming at this from a different perspective. In the area I work with (computer modeling of phonological and linguistic analysis), boolean values most typically represent T/F values, presence or absence. Think Chomsky's distinctive features - a speech sound might be classified according to +/- high and +/- low, and it's possible to have -high -low, but +high +low would be impossible. Having the bools be true/unknown instead changes a lot. :D

I've seen the general approach - having two-bit values representing an extension to boolean logic. Something can be true, false, neither, or both.
This just seems like an extension to that, having two separate ideas of truth or falsehood, thus my suggestion above.
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Re: 8-valued (+raw binary octuple) logic : good idea?
Xeriar wrote: I'm not sure why you're limited to eight values either, at least on a modern computer with 32 or even 64-bit wording,

I'm not explicitly limited to eight values, but there are efficiency advantages to keeping the number of values as small as possible;
1) Unpacked canonical logic values can be put in the smallest addressable unit of memory (a byte), which is good for densely encoded graphs (particularly fitting them into L2 and L3 cache).
2) Packed logic values take up only 3 bits. This is good for fitting function lookup tables into L1 cache (normally at eight values to the word).
3) At one byte per raw value, you can do up to 8 logic 'micro ops' in parallel with a single bitwise instruction (16 when I implement this with SSE2). This results in major speed-ups in common expressions. 8 bits is ideal for SSE2 because you can do pack/unpack/compare/collapse operations directly on logic values. Any field size other than 8, 16 or 32 would require padding to do this, which would waste parallelism, or forego these operations entirely and rely on shifts.
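A rough sketch of the SSE2 version of point 3 (the flag values and the plain OR are illustrative only; the real combining step does more than this):

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* 16 one-byte raw logic values combined with a single bitwise instruction. */
        __m128i existing = _mm_set1_epi8(0x01);        /* 16 propositions, 'IS_TRUE' bit */
        __m128i incoming = _mm_set1_epi8(0x04);        /* 16 new results, 'IS_FALSE' bit */
        __m128i combined = _mm_or_si128(existing, incoming);

        uint8_t out[16];
        _mm_storeu_si128((__m128i *)out, combined);
        printf("first lane: 0x%02x\n", out[0]);        /* 0x05: both flags now set */
        return 0;
    }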
Xeriar wrote: If, instead, you used four bits to store the states
Bit 0: Boolean FALSE
Bit 1: Boolean TRUE
Bit 2: Assumed FALSE
Bit 3: Assumed TRUE
You can let INCONSISTENT = 0x3, CONFLICTED = 0xC, UNKNOWN=0, and still have bit density left for UNPROVABLE (0xF?).

I could assign UNPROVABLE to raw value 0xFF in the current scheme, but that would require me to strip redundancies. In fact if I ever wanted to persistently store raw values without removing UNPROVABLE info, I would do just that. However that sacrifices the ability to feed raw values back into calculations (without doing another special case check), because UNPROVABLE now has the effective value of INCONSISTENT (because it sets TRUE and FALSE), not UNKNOWN. Currently there's no desire to do this because raw values are only used inside expression evaluation, where the UNKNOWN/UNPROVABLE distinction is irrelevant.
I could also drop the source inconsistency tracking from the raw representation to halve its field size. This would double the theoretical amount of per-instruction parallelism I could achieve. However packing at that density would effectively wipe out the gains of using SSE2, because nibbles aren't directly addressable. When I do convert this to use SSE2, it will achieve the same increase in parallelism as your proposal without sacrificing the inconsistency tolerance. Which is good, because that property is there for a reason; it significantly increases the work that this reasoning layer can do before resorting to probabilistic reasoning.
- Kuroneko
- Jedi Council Member
- Posts: 2469
- Joined: 2003-03-13 03:10am
- Location: Fréchet space
- Contact:
I'm sorry to say that one of the first things that came to my mind when reading your post was the Ineffable Truth Cube.
I don't mean that in any disparaging sense. Let me try to explain why. I'm not a programmer, so I didn't even bother thinking about efficiencies of particular implementations, and instead simply wondered about the structure of the logic and how (and if) the same sort of setup might be repeated for probabilistic logic. Perhaps instead of truth and falsity being formally independent, leading to, e.g., inconsistent as "T&F", with a continuum of values, we could have inconsistency as "1-[Pr(T)+Pr(F)]", with formal independence along each of those dimensions? In any case, I conceptually thought of a set of four basic values for the "pseudo-Boolean" part and a mirror of the same for the "defeasible" part.
Label the logical values in the following manner: U = unknown (indeterminate), O = inconsistent (overdeterminate), T = true, F = false, and define the operators, in cycle notation: - = (TF)(OU), i = (TOFU), j = (TO)(FU), k = (TF) [1], and let 1 be the identity. In other words, -1 negates everything, i is negation in the sense of Post, k is negation in the sense of Kleene (only the logical part), and j is both. Then i² = -1, j² = +1, k² = +1, kj = i, ki = j, ij = k, and so on. The set S = {±1,±i,±j,±k} forms an eight-element group under composition isomorphic to the dihedral group D_4, of the same order (symmetries of a square).
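(To check one of those composites by chasing elements through the cycles, reading kj as 'j then k': j sends T→O and k leaves O alone, so kj sends T→O; likewise O→T→F, F→U→U and U→F→T, giving kj = (TOFU) = i. Similarly i² sends T→O→F, O→F→U, F→U→T and U→T→O, i.e. i² = (TF)(OU) = -1.)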
The set S forms a basis for the coquaternion algebra [2], and performing a Cayley-Dickson construction over coquaternions with another root of unity, l² = 1, produces the split-octonion algebra, which is nominally what we want, since there are two "layers" ('pseudo-Boolean' and 'defeasible'), with l as an operator that goes between them. The basis, with negatives, forms a Moufang loop that has the structure of a 'twisted' prismatic dihedral group D_4h = D_4×C_2 (symmetries of the cube considered as a prism), now antiassociative, (xy)z = -x(yz).
---
[1] There are several alternatives for these identifications that preserve the multiplication table. In the dihedral group, i can be either the 90° or 270° rotation and j,k are suitable reflections. I chose these particular values because with this choice, k cleanly corresponds to Kleene negation (among other reasons of taste). Other identifications would be more natural for 2x2 matrix representations.
[2] The value (T²+F²)-(O²+U²) may be interpreted as a measure of our knowledge of the proposition, or perhaps (T²-F²)+(1/O²-U²) as suitability for inference, or some other variation with two addends and two subtrahends. This formally corresponds to the coquaternion modulus.
[Minor edit: replaced jk with kj above.]
"The fool saith in his heart that there is no empty set. But if that were so, then the set of all such sets would be empty, and hence it would be the empty set." -- Wesley Salmon
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Kuroneko wrote: Perhaps instead of truth and falsity being formally independent,

That's an interesting idea, it's certainly possible. I'm not sure how much use it would be in practice, I'll have to think about it. Probabilistic consistency checking primarily consists of finding sets of propositions that should be mutually inconsistent and checking that (a) their joint probabilities are zero and (b) their independent probabilities sum to one. The correct way to restore consistency when these constraints fail is to explicitly add them as propositions in the probability calculus and do some fairly complex solving to correct the priors, but there are nasty heuristics that usually work ok in practice, such as simply scaling mutually exclusive probabilities proportionally so that they sum to 1.0.
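That last heuristic really is as crude as it sounds - a sketch:

    #include <stdio.h>

    /* Crude consistency-restoring heuristic: rescale a supposedly mutually
       exclusive, exhaustive set of probabilities so they sum to 1.0 again. */
    static void renormalise(double *p, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++) sum += p[i];
        if (sum > 0.0)
            for (int i = 0; i < n; i++) p[i] /= sum;
    }

    int main(void)
    {
        double p[3] = { 0.5, 0.4, 0.3 };   /* drifted: sums to 1.2, should be 1.0 */
        renormalise(p, 3);
        printf("%.3f %.3f %.3f\n", p[0], p[1], p[2]);
        return 0;
    }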
Kuroneko wrote: leading to, e.g., inconsistent as "T&F", with a continuum of values, we could have inconsistency as "1-[Pr(T)+Pr(F)]", with formal independence along each of those dimensions?

That's the degenerate case of an inconsistent mutually exclusive set where there are only two possibilities (a statement and its complement). In Boolean logic we just know whether the set is broken or not. You're proposing that there's a value for further inference in knowing 'how broken' the set is. The obvious problem with quantifying that with a single dependent scalar is that we've actually got two constraints, T -> 'F and F -> 'T, which can break down independently. Thus there are four probabilities { p(TF), p(T'F), p('TF), p('T'F) } and three independent values are required to fully specify the nature of the inconsistency. I confess that it isn't immediately obvious to me how to translate the normal probabilistic inference operations to generate meaningful results in this representation when consistency fails. However I am intrigued at the notion that this could potentially be a fourth layer that kicks in where required as a relatively simple and powerful way to handle mutual exclusivity violations (we try to avoid general calculus solving if we possibly can), so I'll keep thinking about it.
Kuroneko wrote: Label the logical values in the following manner: U = unknown (indeterminate), O = inconsistent (overdeterminate), T = true, F = false, and define the operators, in cycle notation: - = (TF)(OU), i = (TOFU), j = (TO)(FU), k = (TF), and let 1 be the identity. In other words, -1 negates everything, i is negation in the sense of Post, k is negation in the sense of Kleene (only the logical part), and j is both. Then i² = -1, j² = +1, k² = +1, kj = i, ki = j, ij = k, and so on. The set S = {±1,±i,±j,±k} forms an eight-element group under composition isomorphic to the dihedral group D_4, of the same order (symmetries of a square).

I follow that part, but clearly you're going to have to constrain that structure to make it match the actual logic, since the monotonicity constraint (without considering the 'out-of-band' inconsistency resolution mechanisms) means that the only legal operators are chains in the state machine ( Ukn ) -> ( Upr ) -> ( AsT | AsF ) -> ( Con ) -> ( T | F ) -> ( Ics ). Strictly UNPROVABLE should be a terminal state, but that isn't enforced in the basic logic; that's for the reflective logic to concern itself with.
Kuroneko wrote: The set S forms a basis for the coquaternion algebra [2], and performing a Cayley-Dickson construction over coquaternions with another root of unity, l² = 1, produces the split-octonion algebra, which is nominally what we want, since there are two "layers" ('pseudo-Boolean' and 'defeasible'), with l as an operator that goes between them.

I can see how that defines two layers which mirrors the 'raw representation', however...
Kuroneko wrote: The basis, with negatives, forms a Moufang loop that has the structure of a 'twisted' prismatic dihedral group D_4h = D_4×C_2 (symmetries of the cube considered as a prism), now antiassociative, (xy)z = -x(yz).

...I don't understand what you're doing here. I assume you're specifying the precedence of the Boolean layer over the defeasible layer and presumably constraining the set of legal operators somehow to establish the monotonicity constraint. I confess I didn't even know you could have a non-symmetric dihedral group (nor had I previously encountered the concept of a Moufang loop, cool name though).
- Starglider
- Miles Dyson
- Posts: 8709
- Joined: 2007-04-05 09:44pm
- Location: Isle of Dogs
- Contact:
Starglider wrote: Probabilistic consistency checking primarily consists of finding sets of propositions that should be mutually inconsistent

Sorry, I meant 'mutually exclusive' of course. Similarly;
Starglider wrote: However packing at that density would effectively wipe out the gains of using SSE2, because nibbles aren't directly addressable.

should strictly be 'nybble' (in my defence, most programmers I know do this, along with contemptuously ignoring the notion of 'kibibytes'). Literally addressing nibbles requires a processor with PAE support (which actually stands for Pizza Addressing Extension).