Brain Farming, Thoughts

Pyramidal Neurons & Quadruple Store

In a previous article reviewing the book “On Intelligence”, I mentioned something I found fascinating: the 3-dimensional encoding nature of fundamental senses, such as:

sound: amplitude, frequency and duration
color: luminosity, hue and saturation
pain: intensity, location and spread

I came across a Quora discussion on the definition of ontology, referencing a paper from Parsa Mirhaji that presents an event-driven model in a semantic context which could also be encoded in a triple store.

If you’ve already stumbled upon the Semantic Web and all its ontological weirdness, you’ve probably come across the concept of an RDF triplestore (or Turtle, for intimates), which lets you encode your data using a global schema made of 3 elements:
the subject, the predicate and the object.

This allows you to encode natural language propositions easily. For instance, in the sentence “Daddy has a red car”, Daddy is the subject, has is the predicate, and a red car is the object. As a general rule of thumb, everything coming after the predicate is considered the object. The subject and object can be contextually interchangeable, which allows deep linking. Also, multiple predicates can apply to the same subject, even if the object is the same.
“I have a bass guitar” and “I play a bass guitar” are two different propositions that differ only by their predicate, and it would be more natural (in natural language) to express the reference within a single sentence, such as “I play the bass guitar that I have” (although you’ll notice I inferred both refer to the same bass guitar, which is a bit of a shortcut).
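To make this concrete, here is a minimal Python sketch (plain tuples, not tied to any RDF library) of those two propositions living in a toy triple store:

# A toy triple store: each fact is a (subject, predicate, object) tuple.
triples = [
    ("Daddy", "has", "a red car"),
    ("I", "has", "a bass guitar"),
    ("I", "plays", "a bass guitar"),  # same subject and object, different predicate
]

# Every predicate linking "I" to "a bass guitar".
predicates = [p for (s, p, o) in triples if s == "I" and o == "a bass guitar"]
print(predicates)  # ['has', 'plays']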

If I wanted to convert my “simple” triplestore to a SQL table, I could say the complexity of this triplet relationship is
subject[0..n]->[0..n]predicate[0..n]->[0..n]object
Also, subject and object should be foreign keys pointing to the same table, as I and a bass guitar could both be subject or object depending on the context (see the sketch below).
If converted to a graph database, a predicate would therefore be a hyperedge, a subject an inbound node, and an object an outbound node. That leaves us with a hypergraph. This is why some, like Grakn, adopted this model. However, their solution is schemaful, so they enforce a structure a priori (which completely contrasts with my expectations regarding Serf, which requires a schemaless, prototype-based environment, so I’ll default to a property graph database; the guys are nice, though).
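As a minimal sketch of that SQL conversion (table and column names are my own, purely illustrative), a single nodes table holds anything that can be a subject or an object, and the triples table references it twice:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table for everything that can be a subject OR an object.
    CREATE TABLE nodes (
        id    INTEGER PRIMARY KEY,
        label TEXT UNIQUE
    );
    -- The triplet relationship: both ends are foreign keys to the same table.
    CREATE TABLE triples (
        subject_id INTEGER REFERENCES nodes(id),
        predicate  TEXT,
        object_id  INTEGER REFERENCES nodes(id)
    );
""")

# "I play the bass guitar that I have": same two nodes, two predicates.
conn.executemany("INSERT INTO nodes(label) VALUES (?)", [("I",), ("a bass guitar",)])
conn.executemany(
    "INSERT INTO triples VALUES ("
    "(SELECT id FROM nodes WHERE label='I'), ?, "
    "(SELECT id FROM nodes WHERE label='a bass guitar'))",
    [("has",), ("plays",)],
)
print(conn.execute("SELECT predicate FROM triples").fetchall())  # [('has',), ('plays',)]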

Getting back to the initial picture that triggered this post, I found it really interesting to extend, or alter, the traditional subject-predicate-object schema into:
subject – event – observation
It seems that, in the first case, we are working with a purely descriptive (although really flexible) schema, while this other one lets us work on an action/event basis. That is catchy, and I couldn’t resist applying my relativism (“Oh well, it’s context-based”) and considering that there should be a bunch of other triplet schemas to be used as fundamental ways of encoding.

Making the parallel with the sensing approach, such as pain and its location-intensity-spread triplet: could I also say that these are fundamentally just vectors that could be expressed by the peculiar structure of pyramidal neurons?
Those neurons are the largest of the different varieties in the brain, and their soma is shaped somewhat like a pyramid (hence their name), with basal dendrites (input wires) usually starting from 3 primary dendrites (up to 5 in rarer cases). It’s a bit as if 3 input trees met a single input wire (the apical dendrite) to produce a single output wire (the axon).

Schema of a pyramidal neuron, from NeupsyKey (The Cerebral Cortex)

Of course, this is just conjecture, but the triplet approach to any fundamental piece of information, based on perception or cognition, is really tempting and sexy, as it works so well.
However, while pyramidal neurons are everywhere in the brain, their location and what they’re connected to make a huge difference in their behavior.


And this is why I started to nurture, in a corner of my mind, the idea that we could add a contextual field to pick the right schema. It would act as a first field (probably a group in a graph representation, or just another foreign key in SQL storage) that selects what the last two fields are about.

In summary, a quadruple store would extend any triplet as:
Context – <SourceSubject> – <TypeOfRelation> – <TargetSubject>
Where the context resolves the types of the other fields. At least the idea has been written down, and it will keep evolving until satisfaction.
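As a minimal sketch of what such a quadruple store could look like in code (the contexts and field names are only illustrative assumptions), the first field picks the schema used to read the last two:

# Each context fixes how the last two fields of the quad are to be read.
SCHEMAS = {
    "description": ("predicate", "object"),    # classic subject-predicate-object
    "event":       ("event", "observation"),   # the subject-event-observation variant
    "pain":        ("intensity", "spread"),    # a sensing-style triplet
}

# A quad: (context, source_subject, type_of_relation, target_subject)
quads = [
    ("description", "Daddy", "has", "a red car"),
    ("event", "patient", "fell", "bruised knee"),
    ("pain", "left knee", "7/10", "localized"),
]

def describe(quad):
    context, source, relation, target = quad
    relation_name, target_name = SCHEMAS[context]
    return f"[{context}] {source}: {relation_name}={relation}, {target_name}={target}"

for q in quads:
    print(describe(q))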

Brain Farming, SERF, Thoughts

What’s up for 2019?

Tamagoso – an idea that might be refined later

As mentioned in the proposition article, it’d be nice to have it developed as a dependency of a complex agent. I previously presented the ideas of Botchi and the Serf layer, which have different but compatible purposes: one is to make an AGI-compliant (yeah, I know, the G is too ambitious) UI for an agent, the other is a way to process data mimicking what you’d expect from a Turing machine for an automaton. At some point I mixed those two ideas into Tamagoso: from “tamago”, meaning egg in Japanese (and an obvious reference to the 90’s game), and from “so”, meaning layer (though there’s a macron on the o and I’m not sure of the pronunciation).

The idea is then to use a UI based on the Tamagotchi design to monitor the agent’s health, as well as to instruct it from different media sources and to apply different sets of rules in various contexts. A growable, pluggable, user-friendly machine that could be developed up to the point of competing with others? Couldn’t we think of bot battles on mathematical proofs, given a mathematical reasoning corpus? Or in a Street Fighter way, given a game environment? The idea is still the same: evolving high-level agents for multitasking, but in a community-friendly way. The reason behind this is my belief that the world is held together by not one but multiple truths, each resonating more or less with the population. Behind the established facts, everyone has a cohesive approach to their interpretations, and no one can provide all the keys to interpret them. So the approach should be plural.

That’s why this project might be put alongside the platform proposition. I think we’re still lacking a lot of tools to get there, and a proper way to handle trained neural nets as standard modules is one of them.

Is AGI Architecture a thing?

I got to make sense of Solomonoff’s induction recently and, while I wasn’t in awe of the thing itself, I ended up with this wonderful interview of Marvin Minsky on the top-down approach to AGI in light of this induction principle.
And it still resonates in my mind, as I concluded a long time ago that we sit in a middle layer from which both top-down and bottom-up approaches are doomed to build further away from the essence of intelligence instead of dissecting it.

 

A top-down approach, if truly feasible, would be the world of architects specialized in AI. As my craziest dream is to become an AGI architect, developing AI architecture seems to be the right path.

So could we really boil an AGI down to a machine capable of finding the smallest Turing machine that fits a task? I’m unsure it’s enough, especially as we know perception is relative to evolution, and some human-based parametrization will be required for AGI. That probably describes the limit of what AI can do, though: this induction determines the length of the simplest way to get a task done. That’s not human clumsiness and risky shortcuts to get to a “good-enough” result.

So AGI architecture still needs better requirements, but AI architecture could emerge.
Also, I see transfer learning trending here and there. Maybe that’s a sign that an architecture of embedded knowledge is going to be the next step?

Platform for Embedded Spaces – First use case?

I would love to start with Tamagoso, but it’s so incredibly, freakishly difficult, and I’m still looking at the mathematical spaces and operators, as well as the concrete cases, we could get out of it. Still documenting, still thinking; I need a simpler case to start something that ambitious progressively. I’ve got a lean canvas and a quick look at the technologies, and that’s it; I’m lost and don’t even know where to start my UMLs.

So maybe a case where I can take more of a trial-and-error approach would be best. I was thinking about some really 101 agent that could have a nice purpose, and it seems we could do something of educational value:
A student might miss some parts of a course and, while this goes unnoticed as long as teachers are still discussing notions, it snowballs quickly once advanced tasks depend on those missed fundamentals. A better way to detect it would be to look for patterns in each student’s exercises by casting their mistakes, from different source materials, into different knowledge spaces.
A hierarchy and classification of those spaces can help create a reading canvas of the overall student performance. From a higher-generalization feature space, we can retrieve a general student pattern over multiple high-level skills, then drill down to more precise weaknesses. That way we can prevent the student from snowballing into failure later because of initially missed notions, instead of leaving it undetected until the problem surfaces and it’s too late.
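A toy sketch of the idea (skill names, loadings and the threshold are all made up for illustration): each mistake is cast into a small skill space, aggregated into a profile, and the weak skills are flagged:

import numpy as np

# Hypothetical knowledge space: every axis is a skill.
SKILLS = ["fractions", "proportionality", "algebraic notation"]

# One row per mistake, one column per skill: how strongly the mistake loads on that skill.
mistakes = np.array([
    [0.9, 0.1, 0.0],   # a mistake mostly about fractions
    [0.8, 0.3, 0.0],
    [0.1, 0.2, 0.7],
    [0.7, 0.1, 0.1],
])

# Aggregate into a per-skill weakness profile and flag skills above a threshold.
profile = mistakes.mean(axis=0)
weak = [skill for skill, score in zip(SKILLS, profile) if score > 0.5]
print(dict(zip(SKILLS, profile.round(2))), "-> weak skills:", weak)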

This could therefore be a pedagogic tool, but finding datasets and study cases won’t be easy.

Brain Farming, Thoughts

Proposition for a Platform for Embedded Spaces

This is the initial version of a white paper I’m working on, for developing a platform that distributes trained neural networks and grows a new use for them. The idea is that the right approach to building functional uses is to grow traditional symbolic logic on top of the spaces produced by deep connectionist logic. That way, we can express business cases from symbolic logic down to connectionist representations of their states and values.

 

White Paper – a Platform for Embedded Spaces

A proposition to build standards on top of deep neural nets and empower AI architects towards a new wave

An inquiry into the 3rd wave

From the business perspective, there is a first wave, made of expert systems, and a second wave, made of deep learning networks; the rest is considered technology too immature for the market, bubbling in the heads of AI scientists.

Expert systems were mostly built from huge amounts of complex, heavily documented code aimed at reproducing what humans have built from experience in a given topic.

Deep learning algorithms are black boxes that require heavy amounts of carefully prepared data. They made a huge leap by trading hand-written complexity for a connectionist approach.

The first type is costly to produce, requires complex expertise and long waterfall development, is not easy to rewrite, and doesn’t easily fit modern development practices: microservices, devops, lean,…

The second cannot be easily implemented by companies; the system is simple, but great results require complexity in feature optimization and training, which tends to turn deep neural networks into services provided by a few specialized companies.

And while the second wave is a nice improvement over the first one and helped get results in new AI topics, it doesn’t cover all the topics expert systems can treat, like interpretation tasks in natural language processing.

Getting the best of both waves, from the business perspective, would be to have an AI:

– That can abstract code complexity through a connectionist approach

– That keeps a transparent and modifiable architecture based on a symbolic approach

This is already a difficult task, as the connectionist approach is pretty opaque and hard to make sense of (black box), while we expect something readable, based on intelligible symbols (transparent), to be the access door to the implemented business logic.

From the technical perspective, we can draw the following lessons. From the first wave: we need loosely coupled architectures based on interchangeable modules and the ability to spread workload; but there is no magic solution to reduce business logic complexity when inputs and behaviors require a lot of nuances and tests.
From the second wave: the market of trained AI for amateurs and small companies is nil, and the generalization of those networks is poor. But a huge variety of network topologies exists, along with a lot of different frameworks to implement them, which makes this wave extremely prolific but also hard to encompass. Scientific publications keep booming in this modern gold rush. Yet we still lack an obvious ingredient: even if we solved the training problem (which is not a realistic statement), the absence of standards blocks any communication between machines without a custom semantic interpreter. We don’t know the training datasets, the results, the specificities of the trained network, and many other important pieces of information needed to weigh which trained net is best suited to solve a developer’s issue.

That looks like a dead end for seeing trained neural networks spread in the second wave.

Our proposition

Embedded spaces seem to be the key towards better communication between symbolic and connectionist logics.

They link together words, styles or items with vectors, like the Word2Vec or Style2Vec approaches, and those spaces stand as representations of their own (like a namespace). Moreover, their ability to encode and decode a symbolic value allows casting a representation into multiple spaces and finding a projection in between (like multiple implementations of Word2Vec), which lessens the need for semantics or ontologies.

Therefore, we would like to propose a platform, much like a package manager, that will allow data scientists and AI developers to broadcast freely, or eventually buy and sell, their trained spaces as common languages.

Those will run in containers, as we need a standard approach to encompass the plurality of deep neural nets, and their orchestration can be done on local servers with a modern solution like Kubernetes or OpenShift that will handle the workload and the microservices approach.

From all those spaces, which are actually trained neural nets well defined in a common register, we can establish relationships based on symbols; just like a word has different meanings in different contexts, those contexts being the multiple spaces that can apply consistently in response to an input.

Those spaces are loaded in memory as we need them, balanced by an orchestrator, and that’s how we will use them: as the background of a working memory for machines.
On those spaces, we will draw few but rich representations of data, to encapsulate all the details related to a given piece of information in its interpretation by the system. Then the working memory unloads its results to standard storage memories or user interfaces.

The point of this approach is, for an app’s data flow, to have a node where every known and relevant piece of information is available to make the best out of a new one (adjusting the interpretation or the knowledge base, for instance). To do so, trees, stored on a bus, are drawn on top of those spaces. They can represent the current world knowledge of an agent, the flow of a press article, the behavior of an individual on camera, etc. Scaling those trees down to patterns allows the system to pass compressed knowledge or, the other way around, to derive prediction trees from initial patterns and conditions.

Those spaces allow standardization but also nuance in interpretation, as we move from programming discrete enum values, for defining states, to programming points in continuous spaces. This embeds more information than an enum, but it also produces more possibilities.

Like the well-known Word2Vec result: King – Male + Female = Queen
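For reference, a minimal sketch of that analogy with gensim’s pretrained vectors (the model name is an assumption; any Word2Vec-style space exposing most_similar would illustrate the same arithmetic on points of a continuous space):

import gensim.downloader as api

# Assumption: the pretrained "word2vec-google-news-300" vectors are available
# through gensim's downloader; any Word2Vec-style space would do.
wv = api.load("word2vec-google-news-300")

# king - man + woman ≈ queen
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# expected: [('queen', ...)]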

Designing a language to use embedded spaces

Going further than those {+,-} operators on vectors, embedded spaces could be developed to support more subtle operators, like union, intersection or exclusion, or more complex ones like integrals and derivatives. Sets of points could also have different complexities, as their relationship with the symbolic values can be, or not be, transitive, reflexive, symmetric, generative, etc.
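As one hedged reading of what a set-like operator could mean here (my own assumption, not an established operator), a symbolic value can be read as a neighborhood of points rather than a single vector, so that “intersection” becomes the overlap of two nearest-neighbor sets:

import numpy as np

rng = np.random.default_rng(0)
# Toy embedding space: 8 items with 4-dimensional vectors, invented for illustration.
items = [f"item{i}" for i in range(8)]
vectors = {name: rng.normal(size=4) for name in items}

def neighbourhood(query_vec, k=3):
    """The k nearest items to a query vector: reading a symbol as a *set* of points."""
    dists = {name: np.linalg.norm(vec - query_vec) for name, vec in vectors.items()}
    return set(sorted(dists, key=dists.get)[:k])

a, b = vectors["item0"], vectors["item1"]
print(neighbourhood(a) & neighbourhood(b))   # a candidate "intersection" operator
print(neighbourhood(a) | neighbourhood(b))   # a candidate "union" operator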

In another approach, Style2Vec shows us we can embed different features from the same symbols. Instead of making a space that embeds shoes and dresses together, we can have a space that embeds dresses with the shoes that match them well.

This brings context nuances: should I group them by style or by function? Meaning I can use either a space where dresses and shoes stay apart, or a space where shoes and dresses get closer the better they match.
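A small sketch of that choice (the two spaces and their coordinates are entirely made up): the same pair of items queried against a function-oriented space and a style-oriented space gives two different answers.

import numpy as np

def similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Two hypothetical spaces over the same symbols (coordinates invented for illustration).
function_space = {                  # groups items by what they are: dresses far from shoes
    "red dress":   np.array([1.0, 0.1]),
    "black heels": np.array([0.1, 1.0]),
}
style_space = {                     # groups items by how well they match as an outfit
    "red dress":   np.array([0.8, 0.6]),
    "black heels": np.array([0.7, 0.7]),
}

for name, space in [("function", function_space), ("style", style_space)]:
    print(name, round(similarity(space["red dress"], space["black heels"]), 2))
# The pair looks unrelated in the function space and close in the style space.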

At this point, it’s interesting to consider the different use cases. The current consideration is of one agent that centralizes a lot of knowledge while running as threads of containers managed by Kubernetes. Its large but short-lived memory will allow it to grasp complex knowledge and simulate continuity through complex tasks, looking consistent to the user. On this continuity, it is expected to plug higher and higher logic schemes to grow a hierarchy of interpretations.

I’m not sure yet what the business cases could be, besides extending current system capabilities, but I’d like to explore it as a bot-assembling project: getting higher behaviors, like empathy, from casting sentiment analysis into an “emotion space”, and developing higher cognition into trees of interconnected spaces that match the multiple contexts of a press article. Most importantly, I’d like to define how it should interact with a user interface, a service layer, a database and a knowledge base.

But, in the end, this will require a common language for those modules to express what they can do, another language to interoperate them, and probably yet others for container management or BPM scheduling. I still need to grasp a lot of that topic and reduce its apparent complexity, so it’s an ongoing subject that will evolve through versioning.

The platform distributing those embedded spaces could be financed from a percentage on purchasable spaces. The Docker platform could host the containers for now, and extra information such as licenses and standard API descriptions, as well as the Docker container URL, should be provided through the platform register.
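As a sketch of what a register entry could contain (every field name and value here is hypothetical), each published space would at least declare its license, API description and container location:

# Hypothetical register entry for one published embedded space.
register_entry = {
    "name": "word-space-news-en",
    "version": "1.0.0",
    "license": "Apache-2.0",
    "price": 0.0,                                  # broadcast freely, or set a selling price
    "container_url": "docker.io/example/word-space-news-en:1.0.0",
    "api": {                                       # standard API description
        "encode": {"input": "token", "output": "vector[300]"},
        "decode": {"input": "vector[300]", "output": "token"},
    },
    "training_data": "news corpus, English, 2018 snapshot",
    "evaluation": {"analogy_accuracy": "unreported"},
}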

Brain Farming

LabAware Manifesto is In Progress

Hi guys!

Here’s the deal: the company I did my internship with pushed me to start earlier, and on an incredibly difficult project. I like it because it’s challenging and my ability to design large software will grow with it, but… goddamn, I cannot find time for anything anymore!

I did not forget my promise about the Tamagotchi thing (or Botchi). It’s supposed to be a module of a bigger project I already mentioned as “BrainFarm”, though that name is kinda commonly used over the web. After scratching my head on it, it became “LabAware” (which I find kinda cool and, most importantly, unique).

Besides, I’m progressing really, really slowly on it, but it’s a long-term project. And when people ask me about it, it’s hard to explain because it’s still a bit of a puzzle in my mind. Therefore, I’m going to settle on a proper manifesto that can be widely understood and, hopefully, be a motivator for some other computer and data scientists to join the approach or, at least, understand it.

I’m going to try to make it right, so it will take time. But, good news, the draft version is already underway, and I’ll share it here with many more details and external links than the final version will have. It will be open for comments before the draft evolves into its final form so, as usual, please give your opinion.

Brain Farming, Thoughts

AIpocalypse

This is ridiculous. The recent fight between Elon Musk and Mark Zuckerberg on AI is incredibly dumb!

Elon Musk and Mark Zuckerberg, July 26, 2017

Besides ending with a childish “You don’t know crap about AI, I do”, it’s just the AIpocalypse phenomenon reaching a larger audience. The billionaire Elon Musk isn’t the only one preaching caution about AI taking over the world, though he’s the only one calling for public regulation. From the famous futurologist and Google director of engineering Ray Kurzweil to the YouTuber Robert Miles, the weird idea that a classifier requiring hundreds of thousands of pictures to distinguish between cats and dogs could become sentient and elaborate deep, secret strategies threatening humankind is spreading.

I keep saying it: the current AI wave has mostly been pushed by marketing, though more accurate techniques ended up being a profitable consequence. We are far from reaching anything sentient-like; we haven’t even stated the correct question yet. Right now, machine learning is just statistical classification with a bit of logic, aimed at certain goals and fueled with limited resources. Those catastrophic scenarios belong to sci-fi and expectations. From where we stand right now, they are unrealistic; we don’t even have a full taste of the challenges yet.
Even investigating the sentient intelligence question from the neurological perspective isn’t feasible; we keep discovering new dynamics in the brain, such as the recent discovery of independent dendritic firing mechanisms.

Therefore, the hard-AI question still remains:

Can an individual grow on something other than a human body?

But here’s the most interesting part: it seems the AIpocalypse ideas could be based on some actual expectations from a larger audience. But what if we cannot provide this experience with current AI?

I have no more of a solution than anyone else on the matter of sentient AI, but I do think I can provide the tools to, at least, unravel the questions, if not solve some of them. Tools to create and develop something that can evolve and get feedback from and with user experience.


Guess what? This isn’t the first time we’ve developed simulated intelligence. I mean, since the homunculus of the alchemists, we’ve gotten way better at making imaginary friends.
In recent years, we even evolved from the Furby to the Nao in terms of domestic robotics. Hell yeah, we have other options!

In the purely digital realm, we had the Tamagotchi: those little eggs growing into a random creature you could play with after it woke you up over poop flooding. Similar were the Japanese Digimon, pocket individual monsters that you could connect to battle. I could talk about Pokémon, but those are virtual creatures in a virtual environment that, while supposedly pocket monsters, don’t fit here given the Game Boy’s size.

Anyway, to develop those sentient AIs, maybe we should start by mimicking sentience? People are more at ease with an animal-like creature because their expectations are lower; those tricks should be used to make a better AI interface.
Besides that, it would be really cool, while developing a sentient AI, to have live feedback from a user-friendly interface like a Tamagotchi. Or even, though a dream, to run online duels on datasets?

The Tamagotchi is a really primitive form of AI. By breaking it into simple elements and rebuilding it with machine learning objectives in mind, I think it could be a great asset for a trial-and-error approach to the sentient intelligence problem, as well as a great motivator to make one’s own virtual creature evolve. It’s even really well suited to reinforcement learning, as we can have a punish/reward system compatible with the Tamagotchi philosophy.
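As a rough sketch of that fit (the state variables, actions and weights are all invented for illustration), the punish/reward system maps directly onto a reinforcement learning reward function:

# Hypothetical creature state and a reward shaped by the Tamagotchi philosophy:
# feeding and playing are rewarded, neglect (hunger, poop flooding) is punished.
def reward(state, action):
    r = 0.0
    if action == "feed" and state["hunger"] > 0.5:
        r += 1.0                    # fed a hungry creature
    if action == "play":
        r += 0.5 * state["mood"]    # reward scaled by current mood
    if action == "clean":
        r += state["poop"]          # the more flooding, the more urgent the cleanup
    r -= 0.2 * state["hunger"]      # standing penalty for neglect
    return r

state = {"hunger": 0.8, "mood": 0.4, "poop": 0.3}
print({a: round(reward(state, a), 2) for a in ["feed", "play", "clean", "ignore"]})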

At least, this is my objective starting mid-September, hopefully with my MSc in Engineering in my pocket. Every bit of help is warmly welcomed; we cannot be too many to gamify AI development. The thoughts will come here, in the Brain Farming category, and the code will go on GitHub.
If you are interested in joining, keep in touch until I learn how to build an open source community.

Until then, I’ve got some brainstorming for you.
Let’s say a brain farm is the framework to develop those virtual creatures; but what would their name be? How would you picture one and its reactions? Would you experience it as a background task running live on your computer? What would make this interface user-friendly? Whatever you’ve got in mind, guys, please leave a comment 😉

Brain Farming

Brain Farming I: Why?

When I started digging into AI, I learned it was about classifying tree leaves or digits. And though recent AI projects gave awesome results, they require expertise and computing power common people don’t have. I was really all about the model, seeking the next step that could ignite a new era where those functions could be correlated and assembled into a form of being.

I then dug into automatons, neurology, psychology, philosophy, economy, biology,… damn, insects can be interesting!
I melted my brain thinking about everything as an intelligent system and ended up swallowing so many things that my thoughts were a mess. Those obnoxious thoughts were put away and I started this blog to sort everything out cleanly, with descriptions, models and experimentation, and to share documentation and personal insights.

I still couldn’t keep it up. Since I saw everything as an intelligent model, while lacking the knowledge to discriminate between those ideas, I had too much to write about without being able to focus on a model I could clearly see and thoroughly test, adjust and compare. This just let drafts accumulate without producing any new posts, and I don’t believe in the quality of the previous posts.

When I picked up this interest, it was for the fun idea of making small AIs, like a smart Tamagotchi or a talking virtual assistant, not adjusting weights to fit a heavily pre-treated labeled dataset. When I understood what it was all about, I wasn’t a “singularity believer”, and I’m still not, so I had no choice but to resign myself.

I started to dig in another direction, even qualifying my ideal job as “brain farmer” for the pun. I knew it was a bit desperate, as smarter people are paid to work on the issue, but I counted on a small providential insight or a successful model that could be “it”.
Though, as you know, this blog never got any success.

Going Back to the Fun

I had to put this hobby aside for a while. But recently it occurred to me that I could still manage to focus on this domain and get some actual results.

The most prevalent tragedy of my approach was that it was hardly defined, being congruent with anything and nothing at the same time. I thought more and more of neural nets as pieces of a puzzle to be plugged into a more general and central interface, AI or hard-coded, that could get the most out of those network combinations. But how to test and define all the combinations that could work or not, and the subsets of neural nets to be used?

This insight is what got me talking about “brain farming”. Surely, you don’t use the same network for shape recognition and for word processing; you probably use a CNN and an RNN. I then tried to find a generalized way to generate network topologies, but even adjusting the hidden layers is already quite hard. I still have some thoughts on the matter, such as systems with a dynamic internal state generating networks, but it still doesn’t seem to be the right way, given how hard it is to model.

Then it seemed more obvious: I had to take inspiration from computers. After all, those networks are just a bunch of memory, coding the internal function, and acquiring processing power from a central unit; the network is processed, not the neurons.

I already accepted the idea of specialization and standardization, as it is a requirement if we want to program those AIs, but it looks like it will appear in the way you combine those networks. Hence, growing specialized combination maps and processes for achieving a given goal based on a processed data flow: a motherboard designed to use the neural nets’ outputs, or other data flows, in the way most meaningful to the goal it is designed for.

The interest is abstraction. I could program something as weird as:

if (!vision.has{blue line} && !vision.has{cat face})
    state{joy}.increment();

Where vision and state are two functionalities of my motherboard, the first describing the parts relative to image and video processing, and the second an internal state functionality.
What’s between brackets {} is a module, like a neural network. It’s a black box: I don’t know how it does it, but it’s supposed to satisfy the “blue line” or “cat face” slots on my motherboard by reading my data input.
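In a more conventional syntax, here is a minimal Python sketch of the same motherboard idea (the module names and the trivial lambdas are placeholders standing in for trained black boxes):

# Each "slot" is a black-box module (typically a trained neural net) exposing a predicate.
class Slot:
    def __init__(self, predict):
        self.predict = predict           # any callable reading the raw data input

    def __call__(self, data):
        return self.predict(data)

# Hypothetical detectors standing in for trained networks.
vision = {
    "blue line": Slot(lambda frame: False),   # pretend nothing is detected
    "cat face":  Slot(lambda frame: False),
}
state = {"joy": 0}

frame = object()                              # whatever the data input is
if not vision["blue line"](frame) and not vision["cat face"](frame):
    state["joy"] += 1                         # same logic as the motherboard pseudo-code
print(state)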

That should be way more fun to program!

Operating on a New World of Data

Though the example is not meaningful, it could definitely empower us. As I’m working on an OpenCV application, I’m doing again what others have done before: learning heavy stuff to identify a simple line or a rectangle.
Why is there no plug-and-play module for that?

We are already in a world where data and data streams are easy to use, but the technology to get meaning out of them is out of reach. Despite some great frameworks and APIs, we’re still not in a “package manager” era of AI or, more generally, of data processing.

Therefore, starting from September, when my OpenCV project will be over, I’ll give this blog a focused goal: studying the problem of designing those motherboards, which I’d prefer to call “brains”, in order to become a brain farmer.

Those brains would be generalized in a sense, as every printed board is, while allowing another way to program and use data. They are not physical, which allows really complex machine-designed architectures, and our CMOS chips are whatever you want to plug and play. But, if it’s a machine learning function, it might already be fine-tuned, and such functions could exist for so many purposes that designing a package manager system is also an essential step to make those brains useful.

But everything else will come soon…

 

If this gave you any insight or you want to start a discussion on the matter, please share in the comments.
On my side, I think two months will feel really long before being able to start working on this idea so, before leaving WordPress for OpenCV, I’ll give you an extra.

Teaser

To be able to take neural networks from discrete time to continuous time, we’ll need a sampling frequency, limited by the computing capability, but its usefulness can vanish as we seek rarer treats. We can balance the computing power this way, but we won’t be able to fade it away just by varying the frequency. We need another variable for that, so let’s define its boundaries:

  • G (Gatherer): acts purely on a planned frequency, transmitting the signal independently of any trigger
  • H (Hunter): acts purely on a trigger; it reacts only if the signal is meaningful enough to be transmitted

Between those, there are several shades of behavior. However, we have to give these a computing reality: a G behavior is triggered at clock frequency, so its output can be null, but it gives a constant response, driven by the system’s internal clock. The H behavior is the opposite tendency: it’s triggered by the data state. To know the data state, either we rely on G units that switch on our H units, or we let both read the data at maximum computing frequency and differentiate the response state in a ternary way (null, 0, 1).
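A tiny sketch of those two boundary behaviors (the names, the trigger threshold and the ternary reading are my own assumptions):

# Two boundary behaviors for reading a data stream.
def gatherer(stream):
    """G: fires at every clock tick, even if the reading is null."""
    for sample in stream:
        yield sample                     # constant-rate output, possibly None

def hunter(stream, trigger=0.8):
    """H: fires only when the data state is meaningful enough to transmit."""
    for sample in stream:
        if sample is not None and sample > trigger:
            yield sample                 # event-driven output

data = [0.1, None, 0.3, 0.95, 0.2, 0.85]
print(list(gatherer(data)))   # every tick: [0.1, None, 0.3, 0.95, 0.2, 0.85]
print(list(hunter(data)))     # only the triggers: [0.95, 0.85]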