Enformation Theory: “Physical Vitalism”

Thank you for this splendid opportunity. So, I will be talking to you about Enformation Theory, or “Physical Vitalism”. It’s not as bad as it sounds – or perhaps, if you have no philosophical background, you don’t know that you should be worried about a title like this, so perhaps I just skip that. So, this is me. After I left my office, I have been free to do research, and as you will see, it is not bounded by traditional practices, I would say. This book is the result of those studies. I also want to introduce Petri Lievonen, who is behind many things – for example, right now he is behind the camera. He is working at the Helsinki Institute for Information Technology, so he is much more accessible to you if you are interested in these issues. These are some simulations he has been carrying out on these algorithms.
So, let’s start with the very basics, or first principles. You know that in science today we are doing empirical science: we take data and then we try to model it. But the problem today is that we have so much data, and there are so many different kinds of models, that we are totally lost in that jungle. So we would like to have some firm background, to know some basic model structures that we should or could be using. The problem is that if you make assumptions about what good models of real life are, or about what the world is like, then that is philosophy, or metaphysics, and that is not very scientific to start with. But I would claim that this kind of thinking is fruitful anyway, because you can recognize that we have been thinking in a very analytical way, in the spirit of analytical philosophy, really – or in the spirit of early Wittgensteinian philosophy: everything that exists is a fact about reality, and everything can be expressed in formulas. But the later Wittgenstein observed that not everything was said in the Tractatus, and that the really interesting things are, as he said, flowing in the flow of life.
Now, we still have some background here if we try to approach things in a different way. If we start from the assumption that everything is motion, then we have process philosophy available. Unfortunately there are not very many tools available there, but then we can start applying our engineering tools, or engineering intuition, and simplify this issue of motion, or change, into its very basic constituents. In the beginning, all movement is only some kind of vibration; only when it gets coordinated do you have some flow – this process-philosophical flow. So, let’s start from vibration. Another assumption is that, because the dynamics are so much more important, all structures and mechanisms are forgotten to begin with: we assume there is no structure in the world to begin with. So, what can we still know about the world – what is still available to us – if we have no structure, only some vibration?
Now, I want to say a few words about this vitalism, because for those who know what vitalism is, it is really about trying to find forces that pull things towards some goal. But here we are not trying to find purpose, or finalistic explanations; we are just modeling vibration from the beginning, and studying what starts to come out of it. So, if we have no structure, then we know that the distributions are Gaussian – this is all we can know with the assumptions we made. The traditional starting point is to concentrate on the mean, but the problem is that then we end up with the very normal models of a static world, where you have just some kind of mean values, and the distribution – the variance, the variation around the mean – has been forgotten. Now it is assumed that this variance is not only measurement noise, but that it contains the essence of that motion, or vibration.
And all of this becomes more interesting, or more relevant, in higher dimensions, of course. So, let’s repeat the starting point. We assume that what we can normally observe, in the static view – let’s call it matter – is the result of coarsening, of forgetting about details. But now we define enformation as the second moment of the distribution: if the data is zero mean, that is the variance, and in high dimensions you have correlations, or covariations. And there is an intuitive motivation for this: if you assume that everything in the world is some kind of spring, and the state of a spring is the variable z, then you know that the energy of that spring is proportional to z squared.
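Written out, with E{·} for an ordinary average and k for the textbook spring stiffness (neither symbol is from the talk, they are only here to fix the idea):

\[
\text{enformation (scalar, zero mean):}\quad \mathrm{E}\{z^{2}\} = \operatorname{var}(z),
\qquad
\text{vector case:}\quad \mathrm{E}\{\mathbf{z}\,\mathbf{z}^{\mathsf{T}}\} = \text{covariance matrix},
\]
\[
\text{spring energy:}\quad W = \tfrac{1}{2}\,k\,z^{2} \;\propto\; z^{2},
\qquad\text{so the average energy} \;\propto\; \mathrm{E}\{z^{2}\}.
\]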
And if you have this kind of measure of energy in the world, then you have some kind of semantics, meaning that it is only this kind of energy that can make changes in the world; where there is much energy, there is much meaning in that portion of the world. For some other audience I would say that this can be intuitively identified with a kind of technical, or physical, vital force. Starting from this kind of starting point, you can see that you can already move from how questions to why questions, or start studying somewhat deeper questions. And this is the explanation for introducing a new term: instead of expectation values, we now employ an emergence operator, that big epsilon. The expectation is mathematically a very difficult concept, because you would need all history, all data, simultaneously, to determine the expectation value; for the emergence operator it is enough to know the history up to this point. And it turns out that if you have different kinds of frequency bands there, then it gives rise to different kinds of models, or a kind of fractality in the structures.
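The book defines the emergence operator properly; as a purely hypothetical illustration of the difference, here is a causal, exponentially weighted second-moment estimate in Python. It only ever uses the history up to the current sample, and the forgetting factor lam stands in for the choice of frequency band – different values give models on different time scales.

    import numpy as np

    def running_second_moment(samples, lam=0.99):
        """Causal, exponentially weighted estimate of the second moment E{u u^T}.

        Unlike a batch expectation, which needs all data at once, this update
        only uses the history up to the current sample. A smaller forgetting
        factor lam forgets faster, i.e. looks at a higher-frequency band.
        """
        u = np.atleast_1d(samples[0]).astype(float)
        M = np.outer(u, u)                                # start from the first sample
        for u in samples[1:]:
            u = np.atleast_1d(u).astype(float)
            M = lam * M + (1.0 - lam) * np.outer(u, u)    # update with the current sample only
        return M

    # Two estimates of the same noisy signal on different "frequency bands"
    rng = np.random.default_rng(0)
    data = rng.normal(size=(500, 2))
    slow = running_second_moment(data, lam=0.995)
    fast = running_second_moment(data, lam=0.9)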
Now I will, once more, explain what is going on here when we model enformation. Normally, when you do science, or when you do something with nature, you apply your interpretations from above: the semantics is given by your mind, from upside, and you have to do it by hand, for each item separately. But now we let the enformation come from below, and when it reaches the top level you can perhaps – if it is really relevant – also find connections to mind-level phenomena, or mind-level concepts. For some background, we can notice that in cybernetics, for example, Gregory Bateson defined the interesting thing as differences that make a difference; this is enformation, actually. And when we go further, we need some way to connect items – how things are connected to each other – and for that purpose we can start from Bishop Berkeley, saying that to exist is to be measured. This is what we apply here: if you want to have much enformation, if you want to make your mark in the world, then you need to receive much enformation from your environment, and making your mark is applying that enformation in your environment. So surviving and flourishing are the same thing as being in interconnection, communicating with your environment. What can best do this will remain the most visible in the world. So, technically speaking, we can follow the flow of enformation – the optimal flow – and assume that it is the winning strategy in the world.
Ok, then we start by modeling that interaction. We can take, for example, chemical reactions as our starting point, because there you also have this chaos around you, and the net result – the chemical reaction rate – is the result of collisions. You know that the net result is actually linear, so the activity x is a linear function of the variables in your environment. Actually, the derivation on the previous slide was only there to motivate this linear form, because you already know that if your enformation is defined as a quadratic function, then you need linear formulations to maximize it. So, this is just the motivation.
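As a reminder of why a quadratic criterion leads to a linear model – this is only the textbook variance-maximization pattern, written with an ordinary expectation and a unit-norm constraint, not the talk’s own formula with the coupling matrix Q:

\[
x = \mathbf{a}^{\mathsf{T}}\mathbf{u},
\qquad
\max_{\mathbf{a}} \;\mathrm{E}\{x^{2}\}
  = \mathbf{a}^{\mathsf{T}}\,\mathrm{E}\{\mathbf{u}\mathbf{u}^{\mathsf{T}}\}\,\mathbf{a}
\quad \text{subject to} \quad \mathbf{a}^{\mathsf{T}}\mathbf{a} = 1
\;\;\Rightarrow\;\;
\mathrm{E}\{\mathbf{u}\mathbf{u}^{\mathsf{T}}\}\,\mathbf{a} = \lambda\,\mathbf{a}.
\]

The Lagrange condition is an eigenvalue problem of the data covariance, so the maximizing readout is linear in u – the same pattern that shows up below when principal subspace analysis emerges.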
This can also be seen as a kind of generalized diffusion. So, now we define the enformation-theoretic system as a set of such x variables, or state variables, that see the same environment, the vector u. And now we need mathematics to follow this enformation flow, because using natural language you cannot handle dynamic things very well. What is most important is that your attitude to mathematics changes a bit. Typically mathematics is used free of semantics – you only apply syntactic structures and manipulations – but now we only follow that enformation, so the semantics is bound to the structures from the start. We are not studying all mathematically possible structures, only the relevant structures, hopefully. Now it is an easy task, when we start from that linear formulation, to find its maximum: you first form the expression for enformation, multiply by x on both sides, take the expectation, or the emergence operator, and it turns out that the Lagrange technique gives you a formula for those a’s – meaning that, finally, the interaction between the system and the environment is given by a formula where you first have Q, a diagonal coupling matrix whose elements tell how tightly the states are coupled, and then a kind of semantic filter as well. Let’s skip the details here.
It is a rather easy task, when you start studying this formula, to ask what is stationary – what happens if the x outside the operator and the x inside the operator are the same. You apply the emergence operator to the formula, and you will recognize something that resembles principal component analysis. And indeed, it does principal subspace analysis: it can be proven that it really concentrates on the most significant variance directions, or enformation directions, in the data. The funny thing is that, if you look at the formulas, the observable environment gets modified in a curious way: if the original eigenvalues of the environmental enformation are as shown on that envelope, then the coupled eigenvalues have a very different structure – the most important of the eigenvalues get modified. But I will not go into details here. Ok, this enformation thing is very much correlation-based. If you go through the Hebbian algorithms, you notice that they are also correlation-based, and there you always have the stability problem: the structures are not stationary or stable. Here we want to keep the structures linear, and we can keep things linear when we apply linear negative feedback to implement that stabilization.
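As one concrete member of this family – a standard Oja-type subspace rule, not the algorithm from the book – here is a sketch in Python. The Hebbian term alone would grow without bound; the linear feedback term is what stabilizes the weights, and the result converges towards the principal subspace of the data.

    import numpy as np

    def oja_subspace(U, n_components=2, eta=0.01, epochs=50, seed=0):
        """Oja's subspace rule: a Hebbian update stabilized by linear feedback.

        U: data matrix, one observation per row (assumed roughly zero mean).
        The Hebbian term y u^T alone would diverge; the feedback term
        -y y^T W keeps W bounded, and W converges towards an orthonormal
        basis of the principal subspace of the data.
        """
        rng = np.random.default_rng(seed)
        W = rng.normal(scale=0.1, size=(n_components, U.shape[1]))
        for _ in range(epochs):
            for u in U:
                y = W @ u                                        # latent activity
                W += eta * (np.outer(y, u) - np.outer(y, y) @ W) # Hebbian term minus feedback
        return W

    # Example on synthetic data with two dominant variance directions
    rng = np.random.default_rng(1)
    U = rng.normal(size=(1000, 5)) * np.array([3.0, 2.0, 0.3, 0.2, 0.1])
    W = oja_subspace(U, n_components=2)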
The problem here is of course that when you have multi-level dynamics, you will have simulation problems. In real life you do not have such problems, because everything is simultaneous: when you use something, it is implicit that you also exhaust it. But in simulations, using information technology, you need to take that exploitation into account separately, and then you have multi-level dynamics. Ok, this is an example of what can result. The red arrows here are the enformation flow. Plants grow differently in different years – for example, some plants like sun, some plants like moisture – which means that in different years there are particular covariance structures in the plant world, and it is that structure that gets modeled by these enformation-theoretic models. It can be claimed that ecological niches are based on those structures. It also needs to be recognized that if you have nonlinearities, then the basis functions typically get turned towards sparse components, as simulations show. The blue arrows show the normal, visible matter flow, meaning that the levels of resources are reflected in abundance and biomass on the higher level. But this is not the whole story; the higher level does not simply reflect the lower level.
Actually, the most important and interesting thing is the backwards arrow: what the coupling does to the environment. It turns out that this can be called regulation. When you write down the resulting mappings, you see the normal phi, which is the mapping from the observable environment in the stable state, ū, to the state x̄, and the other phi, varphi, which is the mapping from the original, undisturbed u to x̄. Then you notice a very strange dual symmetry, because these mappings simultaneously take the form of ridge regression. Looked at one way, when the data goes in one direction, principal subspace analysis is implemented; looked at the other way, the same mappings implement regression – optimal in the sense of enformation capture, or variation capture. So there is optimal modeling and optimal estimation, and when you apply that negative feedback, it is optimal control, really. But you can also see that this regression is not really optimal, only robust, because it is ridge regression. So not all enformation is exhausted in the environment.
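For readers who do not have the ridge regression formula in mind, this is the generic form being referred to (the notation is not the book’s, and how the regularization λ relates to the coupling is not spelled out here):

\[
\hat{\theta}_{\text{ridge}}
  = \bigl( U^{\mathsf{T}}U + \lambda I \bigr)^{-1} U^{\mathsf{T}} y .
\]

The λI term shrinks the low-variance directions, which is what makes the mapping robust rather than exactly optimal – and it is also why some of the variation in the environment is left uncaptured.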
And when you study the behaviour of the inherited enformation, you notice that there is some minimum level of coupling, of that q parameter, that you must have. If the value of q is lower than that, there can be no coupling, and in steady state your system dies; the interesting behaviour starts only above that level. You now have plenty of free parameters, but you can apply the same optimality principles to determine them as well – for example, the system size n has an optimal value based on the singular values of the data. And here is an example of what things look like. There are some handwritten digits on a 32×32 grid, and this is what happens. If you do not apply some nonlinearity, then the results are less specific – you get mixed features – but it converges to this kind of outlook. This is very much like what you have seen in different kinds of algorithms, so you can assume that this is perhaps something that the visual cortex also does, at the bottom level.
There are interesting connections to different kinds of machine learning algorithms. For example, starting with Hopfield nets, you again have some cost function, or energy function, there. If you take restricted Boltzmann machines, you can recognize that you again iterate the signals, because you have this loop there – and in this case it gets rid of the extra noise, which makes modeling easier. Then there are the Hebbian algorithms, and so on. But especially if we take multilayer perceptron networks, we can recognize a very interesting difference here. You remember that perceptron networks have the problem that they are physiologically not very plausible, because you have to backpropagate the error. Here, it is only errors that are manipulated all the time – these x̄ and ū values are feedback errors – so a very different kind of signal flow diagram comes out. You have difficult signal propagation, but very easy model adaptation, because it is directly the visible signals that are used for adaptation. And those whirls there – we will meet them again later. Ok, if you assume that there is some evolution – that systems want to get more enformation to the higher level without dissipation – then it turns out that if they can recognize their neighbours, if there is internal feedback, there is no need for feedback from outside, from the environment. And then it turns out that you can implement exact least-squares regression.
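In the same generic notation as the ridge formula above, “exact” least squares just means that the regularizing term is gone:

\[
\hat{\theta}_{\text{LS}} = \bigl( U^{\mathsf{T}}U \bigr)^{-1} U^{\mathsf{T}} y ,
\]

i.e. the λ → 0 limit of the ridge regression mapping.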
And for that kind of model you can easily find different interpretations, too. If you define this kind of energy function and differentiate with respect to x, you notice that the solution is the same as on the previous slide. We started by maximizing enformation on the low level, and it turns out that, seen from above, it minimizes enformation. This is interesting, because there are connections to different kinds of macro-level modeling techniques. This optimality criterion can also be written in different forms, and the features-based formulation is interesting: if you compare it to the standard, traditional maximum likelihood criterion for Gaussian data, you recognize that you usually have the inverse of the data covariance weighting the errors – now we have the data covariance directly.
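To make the contrast explicit – in generic notation, since the slide’s exact criterion is not reproduced in this transcript – the two weightings look like this:

\[
J_{\mathrm{ML}} \;\propto\; \mathbf{e}^{\mathsf{T}}\,\Sigma^{-1}\,\mathbf{e}
\qquad\text{versus}\qquad
J \;\propto\; \mathbf{a}^{\mathsf{T}}\,\Sigma\,\mathbf{a},
\qquad \Sigma = \mathrm{E}\{\mathbf{u}\mathbf{u}^{\mathsf{T}}\} .
\]

With the inverse covariance, deviations along the low-variance directions are punished most; with the covariance itself, the criterion rewards alignment with the high-variance directions.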
And this makes a big difference with respect to what kind of models you get, because it turns out that you concentrate on the freedoms all the time – you are not trying to match the rigid structure, you try to get away from the given structures all the time. This makes a big difference in the interpretations. I will skip the rest of those discussions, but I will just show another kind of open-ended challenge. If we start thinking about the fine details of the dynamics – of how the system converges – then, if we apply expectation maximization in the form of an ensemble Kalman filter, we can recognize that the best way to adapt those system states is this kind of dynamic formulation. Here is the dynamic formulation; and if you continue, assuming that the inputs to the system are other states from other systems, which are also dynamic and have the same dynamics, you can recognize that you have two states that feed each other, but with a minus sign – and the end result is an oscillator. This was a non-dissipative model, so the oscillation there does not vanish.
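In the scalar case this is easy to write out; q below is just a placeholder for the coupling strength, and this is only one minimal way to write “two states that feed each other, with a minus sign”:

\[
\dot{x} = q\,u, \qquad \dot{u} = -\,q\,x
\;\;\Rightarrow\;\;
\ddot{x} = -\,q^{2}\,x ,
\]

an undamped harmonic oscillator, so without dissipation the oscillation at angular frequency q never dies out. The remark that follows explains how the general case reduces to this scalar picture.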
Ok, in the book there was a mistake here, because this holds only for scalars; in the general case you need to take it one eigenvalue at a time. Another qualitative leap comes when you start studying the frequency domain – when you assume that all systems oscillate, you have some kind of vibration fields, and they interact. You get very different kinds of structures, but the nice point is that they are again linear, and you have strong tools for them. This possibility has not been studied very much yet. Ok, finally, there is one catch for you, because you are information theoreticians: what is the difference between enformation and information? We started from chaos, and these enformation-theoretic processes – I call them life processes – start separating that chaos into noise and model. When such a process has exhausted all the enformation, it ends up in dead model structures, where you have a high level of Shannon information; and on the other hand, you also have a high level of Shannon information in the remaining white noise.
So, related to the previous slide: the system is this kind of living creature that eats the chaos, where you have the possibility of a model, the possibility of enformation, and turns it into rigid structures and noise. It defines the boundary line between order and chaos, if order is those rigid structures and chaos is the chaos with no model yet. And it is no wonder that living systems, or complex systems, remain on that boundary line, because they are what defines the boundary line between order and chaos. So, this book, which is available now, is really about life in general. You remember those whirls in the structures – it is assumed that these kinds of control structures, which are enformation pumps, are the basic structures in living organisms, in universal living organisms. And starting from that control paradigm, you can also observe that when the environment gets controlled optimally, it goes towards minimum variation, or heat death – so, in the world of that model, the entropy grows maximally. So even as the system gets more and more complicated, the environment becomes better and better controlled, and its entropy level goes higher and higher. There is no longer any contradiction here, as it usually seems contradictory that natural systems become more and more complicated. So, thank you, this was all I wanted to show you. But I have plenty of other slides, if you are interested in some special issues.
