From Equations to Intelligence: World Models of Nature

Maksym Zubkov
1 day ago
2 min read

Mathematics is a universal language that lets us unravel the laws of the universe. One of the main challenges for modern mathematicians is to look within: to decode how the human mind works and what defines us as living beings.

One possible way to approach this mystery is to imitate the building blocks of our brain, the neurons that carry simple 0s and 1s. Since ChatGPT's recent boom, the giants of capitalism, driven by an unbounded desire for profit, have embedded AI in every part of our lives. How much confidence do we have in this omnipresent AI? Would you trust ChatGPT to fly the airplane that takes your family from Vancouver to San Francisco? In contrast to the widely used probabilistic methods, my research uses algebro-geometric methods to build a deeper understanding of how neural networks work.

We create a neural network by picking a specific architecture and assigning parameters to it. In some sense, you can imagine that you are creating a being. But, like any being, it needs a purpose and a place to reside. We call the space where the neural network lives the ambient space. By changing the parameters of the network, we let it travel through this space toward its goal. For example, if we want to teach a neural network to distinguish dogs from cats in images, we train it by feeding it many examples of what is a dog and what is a cat. We check the correctness of its output, tune the network to get more precise outcomes, and repeat the process.
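To make the feed, check, tune, repeat loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the "network" is a single logistic neuron, each image is reduced to two made-up numeric features, and the names `predict`, `train`, `accuracy`, and `EXAMPLES` are hypothetical. Real image classifiers are far larger, but the loop has the same shape.

```python
import math
import random

def predict(params, x):
    """Probability that example x is a dog (label 1) under a one-neuron model."""
    w1, w2, b = params
    z = w1 * x[0] + w2 * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, steps=2000, lr=1.0, seed=0):
    """Repeat: feed the examples, compare outputs with labels, nudge the parameters."""
    rng = random.Random(seed)
    params = [rng.uniform(-0.1, 0.1) for _ in range(3)]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, label in examples:
            error = predict(params, x) - label  # how far the output is from the label
            grad[0] += error * x[0]
            grad[1] += error * x[1]
            grad[2] += error
        # Tune: move each parameter a small step against the average gradient.
        params = [p - lr * g / len(examples) for p, g in zip(params, grad)]
    return params

def accuracy(params, examples):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum((predict(params, x) > 0.5) == (label == 1) for x, label in examples)
    return correct / len(examples)

# Hypothetical toy data: two invented features per image; 0 = cat, 1 = dog.
EXAMPLES = [((0.2, 0.1), 0), ((0.3, 0.3), 0), ((0.1, 0.4), 0),
            ((0.8, 0.9), 1), ((0.9, 0.7), 1), ((0.7, 0.8), 1)]

trained = train(EXAMPLES)
print("accuracy on the training examples:", accuracy(trained, EXAMPLES))
```

Each pass through the loop moves the three parameters a little, which is the "travel" through parameter space described above.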

The current AI boom has succeeded only because we throw massive resources at compute. But how is it that the brain of a six-year-old child, consuming only 20 watts, outperforms massive AI systems that consume the energy of entire cities at a task as simple as tying a shoelace in a bow?

Now imagine that, for a fixed architecture, we allow the network to take all possible paths. Together, these paths trace out a high-dimensional shape. My goal is to understand the geometry of this shape, because it determines the training process and the performance of the network.
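A toy case can make this shape tangible. The sketch below is my own illustration under strong simplifying assumptions, not the setting of the research itself: it takes a linear network with two inputs, one hidden neuron, and two outputs, so every choice of the four parameters `a, b, c, d` produces a 2x2 matrix. The ambient space is the 4-dimensional space of all 2x2 matrices, but the network can only reach matrices whose determinant is zero, which is a 3-dimensional surface inside that space. The function names are hypothetical.

```python
import random

def network_matrix(params):
    """End-to-end matrix of a toy linear network 2 -> 1 -> 2 (one hidden neuron)."""
    a, b, c, d = params  # (a, b): input-to-hidden weights; (c, d): hidden-to-output weights
    return [[c * a, c * b],
            [d * a, d * b]]

def determinant(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Sample many parameter settings and record how far the determinant gets from zero.
rng = random.Random(0)
samples = [[rng.uniform(-2, 2) for _ in range(4)] for _ in range(1000)]
max_abs_det = max(abs(determinant(network_matrix(p))) for p in samples)

print("largest |det| over all sampled networks:", max_abs_det)
print("det of the identity matrix:", determinant([[1, 0], [0, 1]]))
```

Up to floating-point rounding, every sampled network lands on the surface where the determinant vanishes, while the identity matrix (determinant 1) sits off that surface and can never be reached by this architecture, however the parameters are tuned.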

Can you grow a tree from a seed planted in a desert? Probably not. Of course, you could do it by supplying the seed with all the necessary conditions: shade, constant watering, the right soil and humidity. Yet as soon as you leave the desert, the tree will die. In other words, my research feels like searching in complete darkness for the best landscape in which to grow this invisible tree, with nothing to guide me but pure intuition.
