Thank you for this splendid opportunity. I will be talking to you about Enformation Theory, or “Physical Vitalism”. It’s not as bad as it sounds – perhaps. Or, if you have no philosophical background, then you don’t know that you should be worried about a title like this. So perhaps I will just skip that.

This is me. After I left my office, I have been free to do research, and, as you will see, that research is not bounded by traditional practices. This book is the result of those studies. I also want to introduce Petri Lievonen, who is behind many things – for example, right now he is behind the camera. He is working at the Helsinki Institute for Information Technology, so he is much more accessible to you if you are interested in these issues. These are some simulations he has been carrying out on these algorithms.

So, let’s start with the very basics – first principles. In science today we are doing empirical science: we take data and then we try to model it. But the problem today is that we have so much data, and so many different kinds of models, that we are totally lost – lost in that jungle. So we would like to have some firm background: some basic model structures that we should, or could, be using. The problem is that if you make assumptions about what good models for real life are, or about what the world is like, then that is philosophy, or metaphysics – and that is not very scientific, to start with. But I would claim that this kind of thinking is fruitful anyway, because you can recognize that we have been thinking in a very analytical way – in the spirit of analytical philosophy, really, or of early Wittgensteinian philosophy: all things that exist are facts about reality, and everything can be expressed in formulas. But the later Wittgenstein observed that not everything was said in the Tractatus. Really, the interesting things are flowing in the flow of life, as he said.

We still have some background available if we approach things in a different way. If we start from the assumption that everything is motion, then we have process philosophy available. Unfortunately, not very many tools are available there, but we can start applying our engineering tools, or engineering intuition, and simplify this issue of motion, or change, to its very basic constituents. In the beginning, all movement is only some kind of vibration; only when it gets coordinated do you have flow – this process-philosophical flow. So, let’s start from vibration. Another assumption is that, because the dynamics are so much more important, all structures and mechanisms are forgotten to begin with: we assume there is no structure in the world at the outset. So, what can we still know about the world if we have no structure, only some vibration?

Now, I want to say a few words about this vitalism. If somebody knows what vitalism is: it is really trying to find forces that pull things towards themselves. But here we are not trying to find purpose, or finalistic explanations. We are just modeling vibration from the beginning.
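
Since we are explicitly modeling vibration, it may help to see the next claim concretely. This is only an illustrative sketch of my own (all names and numbers are invented for the demo): a signal built by summing many independent, structureless micro-kicks comes out Gaussian-distributed, by the central limit theorem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Structureless "vibration": each sample is the sum of many independent
# micro-kicks with no built-in structure (here, uniform kicks).
n_samples, n_kicks = 20_000, 300
samples = rng.uniform(-1.0, 1.0, size=(n_samples, n_kicks)).sum(axis=1)

# The mean carries essentially nothing; the spread carries everything.
mean, std = samples.mean(), samples.std()

# Tail mass outside two standard deviations: for a Gaussian this is
# about 4.55 %, and the summed kicks land close to that value.
z = (samples - mean) / std
tail = float(np.mean(np.abs(z) > 2.0))
print(round(mean, 2), round(std, 1), round(tail, 3))
```

Whatever the micro-kick distribution, the sum tends to the same bell shape – which is why, with no structural assumptions, the Gaussian shape is all we can count on.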

And we study what happens when structure starts coming out. So: if we have no structure, then we know that the distributions are Gaussian. This is all we can know under the assumptions we made. The traditional starting point is to concentrate on the mean, but the problem is that we then end up with very “normal” models of the world – a static world where you have just some kind of mean values – while this distribution, this variation around the mean, has been forgotten.
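
To make the mean-versus-variance point tangible, here is a tiny sketch (my own toy construction, not from the talk): two “worlds” whose mean values are identical, so a static, mean-level model cannot distinguish them – while the second moment immediately reveals that one of them vibrates in a coordinated way.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# World A: two variables driven by a shared vibration (coordinated).
common = rng.normal(size=n)
world_a = np.stack([common + 0.1 * rng.normal(size=n),
                    common + 0.1 * rng.normal(size=n)], axis=1)

# World B: two independent vibrations (uncoordinated).
world_b = rng.normal(size=(n, 2))

# Mean-level (static) view: both worlds look like "about zero".
mean_a, mean_b = world_a.mean(axis=0), world_b.mean(axis=0)

# Second-moment view: the off-diagonal covariance exposes the coupling.
cov_a = np.cov(world_a.T)
cov_b = np.cov(world_b.T)
print(round(cov_a[0, 1], 2), round(cov_b[0, 1], 2))
```

The numbers are arbitrary; the point is only that identical means can hide entirely different second-moment structure.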

And now it is assumed that this variance is not only measurement noise, but contains the essence of that motion, or vibration. All of this becomes more interesting, or more relevant, in higher dimensions, of course.

So, let’s repeat the starting point. We assume that what we can normally observe, in the static view – let’s call it matter – is the result of coarsening, of forgetting about details. But now we define enformation as the second moment of the distribution. If you take it zero-mean, then you have the variance, and in high dimensions you have correlations, or covariations. And here is the intuitive motivation for this study: if you assume that everything in the world is some kind of spring, and the state of a spring is the variable z, then you know that the energy of that spring is proportional to z squared. If you have this kind of measure for energy in the world, then you have some kind of semantics, because it is only this kind of energy that can make changes in the world: where there is much energy, there is much meaning in that portion of the world. For some other audience I would say that this can be intuitively identified with a kind of technical, or physical, vital force. Starting from this kind of starting point, you can see that you can already skip from how-questions to why-questions – you can start studying deeper questions. And this is the explanation for introducing

a new term. Instead of expectation values, we now employ an emergence operator, the big epsilon. The expectation is mathematically a very difficult concept, because you would need all history – all data, simultaneously – to determine the expectation value. For the emergence operator it is enough to know the history just up to this point. And it turns out that if you apply it over different frequency bands, it gives rise to different kinds of models – to a kind of fractality of structures.

Now I will explain, once more, what is going on when we model enformation. Normally, when you do science, or when you do something with nature, you apply your interpretations from above: the semantics is given by your mind, from the top. You have to do it by hand, for all items separately. But now we let the enformation come from below. And when it reaches the top level – perhaps, if it is really relevant – you can also find connections to mind-level phenomena, or mind-level concepts.

If we want some background here, we can notice that in cybernetics, for example, Gregory Bateson defined the interesting thing as the differences that make a difference. That is enformation, actually. And when we go further, we need some notion of how things are connected to each other. For that purpose we can start from Bishop Berkeley, saying that to exist is to be measured. This is what we apply here. Because if you want to have much enformation – if you want to make your mark in the world – then you need to receive much enformation from your environment, and making your mark is applying that enformation in your environment. So surviving and flourishing are the same thing as being in interconnection, in communication, with your environment. What can best do this will remain the most visible in the world. Technically speaking, we can follow the flow of enformation – the optimal flow – and assume that it is the winning strategy in the world.

Ok, then we start modeling that interaction

first. We can take, for example, chemical reactions as our starting point, because there you also have chaos around you, and the net result – the chemical reaction rate – is the result of collisions. You know that the net result is actually linear: the activity x is a linear function of the variables in your environment. The derivation on the previous slide was only to motivate this linear form, because you already know that if your enformation is defined as a quadratic function, then you need linear formulations to maximize it. So this is just the motivation. It can also be seen as a kind of generalized diffusion.

Now we define an enformation-theoretic system as a set of such x variables, or state variables, that see the same environment, the vector u. And now we need mathematics to follow this enformation flow, because using natural language you cannot handle dynamic things very well. What is most important is that your attitude to mathematics changes a bit. Typically, mathematics is used free of semantics – you only apply syntactic structures and manipulations – but now we only follow that enformation, so the semantics is bound to the structures to start with. We are not studying all mathematically possible structures, only the relevant structures, hopefully.

Now it is an easy task, starting from that linear formulation, to find its maximum – or, first, the expression for enformation: you multiply by x on both sides and then take the expectation, or rather that emergence operator, and it turns out that the Lagrange technique gives you a formula for those coefficients a. Finally, the interaction between the system and the environment is given by this formula, where you first have Q, a diagonal coupling matrix whose elements tell how tightly coupled the states are, and then a kind of semantic filter. Let’s skip the details here.

It is a rather easy task, now, to study this formula in the stationary case – to ask what happens if the x outside the operator and the x inside the operator are the same.
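
Before looking at the answer, the self-consistency idea can be sketched numerically. What follows is my own toy version, not the talk’s actual formula: demand that the state direction w be the same inside and outside the averaging, w ∝ E{u uᵀ} w, and iterate. That iteration is ordinary power iteration, and its fixed point is the direction of maximum second moment – the top eigenvector of the data covariance.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy environment: 5-D data with one dominant variance direction.
true_dir = np.array([1.0, 2.0, 0.5, -1.0, 0.0])
true_dir /= np.linalg.norm(true_dir)
u = rng.normal(size=(20_000, 1)) * 3.0 * true_dir + 0.3 * rng.normal(size=(20_000, 5))

# Second-moment ("enformation") matrix of the environment.
C = u.T @ u / len(u)

# Self-consistency: w must equal (up to scale) the averaged image of itself.
w = rng.normal(size=5)
for _ in range(100):
    w = C @ w
    w /= np.linalg.norm(w)

alignment = float(abs(w @ true_dir))
print(round(alignment, 3))
```

At the fixed point, w spans the subspace where the quadratic enformation measure wᵀCw is maximal.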

So, you apply the emergence operator to this formula, and you will recognize something that reminds you of principal component analysis. And indeed, it does principal subspace analysis.
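
That claim can be checked against a classical reference point. The sketch below is not the talk’s own algorithm but Oja’s subspace rule – a Hebbian, correlation-driven update stabilized by a built-in negative feedback term – which is known to converge to the principal subspace of its input. The data and parameters are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(3)

# Environment: 6-D data whose variance lives mostly in a 2-D subspace.
m, n = 6, 2
basis = np.linalg.qr(rng.normal(size=(m, m)))[0][:, :n]   # true subspace
u_data = (rng.normal(size=(50_000, n)) * [2.0, 1.5]) @ basis.T \
         + 0.1 * rng.normal(size=(50_000, m))

# Oja's subspace rule: Hebbian growth y uᵀ, stabilized by the negative
# feedback term -y yᵀ W (no explicit renormalization needed).
W = 0.1 * rng.normal(size=(n, m))
eta = 0.01
for u in u_data:
    y = W @ u
    W += eta * (np.outer(y, u) - np.outer(y, y) @ W)

# The learned rows should span the dominant subspace: projecting the true
# basis onto them should leave almost no residual.
Q = np.linalg.qr(W.T)[0]
residual = float(np.linalg.norm(basis - Q @ (Q.T @ basis)))
print(round(residual, 2))
```

Note that the stabilizing term here plays the same role as the negative feedback discussed later in the talk: pure Hebbian (correlation) growth alone would diverge.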

It can be proven that it really concentrates on the most significant variance directions – the enformation directions – in the data. And the funny thing is that, if you look at the formulas, you see that the observable environment gets modified in a very particular way: if the original eigenvalues of the environmental enformation are given on that envelope, then the coupled eigenvalues have a very different structure – the most important of those eigenvalues get modified. But I will not go into details here.

Ok, that enformation thing is very much correlations-based. If you go through the Hebbian algorithms, you notice that they are also correlations-based, and there you always have the stability problem: the structures are not stationary, not stable. Here we want to keep the structures linear, and we can keep things linear by applying linear negative feedback to implement that stabilization. The problem is, of course, that when you have multi-level dynamics, you will have simulation problems. In real life you do not have such problems, because when you use something, it is implicit that you also exhaust it. But in simulations, applying information technology, you need to take that exploitation into account separately, and then you have multi-level dynamics.

Ok, here is an example of what can result. So, the

red arrows here are this enformation flow. Plants grow differently in different years: for example, some plants like sun, some plants like moisture. It means that in different years there are specific covariance structures in the plant world, and it is that structure that gets modeled by these enformation-theoretic models. It can be claimed that ecological niches are based on those structures. And it needs to be recognized that if you have nonlinearities, then the basis functions typically get turned towards sparse components, as simulations show. The blue arrows show the normal, visible matter flow – meaning that the levels of resources are reflected in abundance and biomass on the higher level. But that is not the whole story; the higher level does not just reflect the lower level. Actually, the most interesting thing is the backwards arrow – what the coupling does to the environment. And it turns out that it can be called regulation.

When you write down the resulting mappings, you see here the normal phi – the mapping from the observable environment in the stable state, ū, to the state x̄ – and the other phi, varphi, which is the mapping from the original, undisturbed u to x̄. Then you notice a very strange dual symmetry there, because these mappings actually take the form of ridge regression, simultaneously. Looked at one way – when the data goes in one direction – principal subspace analysis is implemented; looked at the other way, the same mappings implement regression, optimal in the sense of enformation capture, or variation capture. So there is optimal modeling and optimal estimation; and when you apply that negative feedback, it is optimal control, really. But you can see that this regression is not truly optimal, only robust, because it is ridge regression: not all enformation is exhausted from the environment.

When you study the behaviour of the inherited enformation, you notice that there is some level of coupling – some value of that q parameter – that you must have. If the value of q is lower, then there can be no coupling, and in steady state your system dies. The interesting behaviour starts only after that. You now have plenty of free parameters, but you can apply the same optimality principles to determine them. For example, the system size n has an optimal value, based on the singular values of the data.

And here is an example of what things look like.

There are some handwritten digits on a 32×32 grid, and this is what happens. If you don’t apply some nonlinearity, then the results are less specific – you have mixed features. So it converges to this kind of appearance. This is very much like what you have seen in different kinds of algorithms, so you can assume that this is something that, perhaps, the visual cortex also does at the bottom level.

There are interesting connections to different kinds of machine learning algorithms. Starting with Hopfield nets, you again have some cost function, or energy function. If you take restricted Boltzmann machines, you can recognize that you again iterate the signals, because you have this loop there – and in this case it implements a kind of “black noise”: it gets rid of the extra noise, which makes modeling easier. Then there are the Hebbian algorithms, and so on. But especially if we take multilayer perceptron networks, we can recognize a very interesting difference. Typically – you remember these perceptron networks – you have the problem that they are not very plausible physiologically, because you have to backpropagate the error. But here, it is only the error that is manipulated all the time: these x̄, ū values are errors, feedback errors. So a very different kind of signal flow diagram comes out: you have difficult signal propagation, but very easy model adaptation, because it is directly the visible signals that are used for adaptation. And these whirls there – we will meet them again later.

Ok, if you assume that there is some evolution – that systems want to get more enformation to the higher level without dissipation – then it turns out that if they can recognize their neighbours, if there is internal feedback, there is no need for feedback from outside, from the environment. And then it turns out that you can implement exact least-squares regression. For that kind of model you can easily find different interpretations, too: if you define this kind of energy function and differentiate with respect to x, you notice that the solution is the same as on the previous slide. We started by maximizing enformation on the low level, and it turns out that, looked at from above, it minimizes enformation. This is interesting, because there are connections to different kinds of macro-level modeling techniques.

This optimality criterion can be written in different forms, and the features-based formulation is interesting: if you compare it to the standard, traditional maximum likelihood criterion for Gaussian data, you usually have the inverse of the data covariance as the weighting – now we have the direct data covariance. And this makes a great difference with respect to what kinds of models you get. It turns out that you concentrate all the time on the freedoms: you are not trying to match a rigid structure, you try to get away from the given structures all the time. This makes a great difference in the interpretations.

So, I will skip the rest of those discussions, but I

will just show another kind of open-ended challenge here. If we start thinking about the fine details of the dynamics – of how the system converges – then we can recognize that if we apply expectation maximization in the form of an ensemble Kalman filter, the best way to adapt those system states is this kind of dynamic formulation. Here is the dynamic formulation – and if you continue, assuming that the inputs to the system are other states, from other systems, which are also dynamic with the same dynamics, you can recognize that you have two states that feed each other, but with a minus sign. You can recognize that the end result is an oscillator. And since this was a non-dissipative model, the oscillation is non-vanishing. Ok, in the book there was a mistake here: this holds only for scalars; in the general case you need to take one eigenvalue at a time.

Another qualitative leap comes when you start studying the frequency domain – when you assume that all systems oscillate, so you have vibration fields, and they interact. You get very different kinds of structures. The nice point is that they are, again, linear, and you have strong tools for them. But this possibility has not been studied very much.

Ok, finally, there is one catch for you,

because you are information theoreticians. So, what is the difference between enformation and information? We started from chaos, and we started applying these life processes – I call these enformation-theoretic processes life processes – and they start separating that chaos into noise and model. When such a process has exhausted all enformation, it ends up in dead model structures, where you have a high level of Shannon information; and on the other hand, you have a high level of Shannon information in the remaining white noise. So, related to the previous slide: if the system is this kind of living creature that eats the chaos – where you have the possibility of a model, the possibility of enformation – it changes that chaos into rigid structures and noise. And it defines the boundary line between order and chaos, if order is the rigid structures and chaos is the chaos with no model yet. It is no wonder that living systems, or complex systems, remain on that boundary line – they define the boundary line between order and chaos.

So, this book that is available now is really about life in general. You remember those whirls in the structures: it is assumed that this kind of control structure, an enformation pump, is the basic structure in living organisms – in universal living organisms. Starting from that control paradigm, you can also observe that when the environment gets controlled optimally, it goes towards maximum – or actually minimum – variation: heat death. So, in the world of that model, the entropy grows maximally. Even as the system gets more and more complicated, the environment becomes better and better controlled, and its entropy level goes higher and higher. So there is no contradiction anymore – though it usually seems contradictory that natural systems become more and more complicated.

So, thank you. This was all I wanted to show you. But I have plenty of other slides, if you are interested in some special issues.
