An eccentric dreamer in search of truth and happiness for all.

Author: Josephius

Reflections on Working at Huawei

Huawei has recently been in the news with the release of the Mate 60 Pro and its 7nm chip. The western news media seems surprised that this was possible, but my experience working at Huawei was that the people working there were exceptionally talented, technically savvy experts with a chip on their shoulder and the resources to make things happen.

My story with Huawei starts with a coincidence. Before I worked there, I briefly worked for a startup called Maluuba, which was bought by Microsoft in 2017. I worked there for four months in 2016, and on the day of my on-site interview with Maluuba, a group from Huawei was visiting the company. That was about the first time I heard the name. I didn’t think much of it at the time. Just another Chinese company with an interest in the AI tech that Maluuba was working on.

Fast-forward a year to 2017. I was again unemployed and looking for work. Around this time I posted a bunch on the Machine Learning Reddit about my projects, like the Music-RNN, as well as offering advice to other ML practitioners. At some point these posts attracted the attention of a recruiter at Huawei, who emailed me through LinkedIn and asked if I’d be interested in interviewing.

My first interview was with the head of the self-driving car team at the Markham, Ontario research campus. Despite having a cognitive science background in common, I flunked the interview when I failed to explain what the gates of an LSTM were. Back then I had a spotty understanding of those kinds of details, which I would make up for later.

I also asked the team leader, a former University of Toronto professor, why he was working at Huawei. He mentioned something about loyalty to his motherland. This would be one of my first indications that Huawei wasn’t just any old tech company.

Later I got invited to a second interview with a different team. The team leader in this case was much more interested in my experience operating GPUs to train models, as I had done at Maluuba. Surprisingly, there were no more tests or hoops to jump through: we had a cordial conversation and I was hired.

I was initially a research scientist on the NLP team of what was originally the Carrier Software team. I didn’t ask why a team that worked on AI stuff was named that, because at the time I was just really happy to have a job again. My first months at Huawei were on a contract with something called Quantum. Later, after proving myself, I was given a full-time permanent role.

Initially on the NLP team I did some cursory explorations, showing my boss things like how Char-RNN could be used in combination with FastText word vectors to train language models on Chinese novels like Romance of the Three Kingdoms, Dream of the Red Chamber, and Three Body Problem to generate text that resembled them. It was the equivalent of a machine learning parlor trick at the time, but it would foreshadow the later developments of Large Language Models.

Later we started working on something more serious. It was a Question Answering system that connected a Natural Language Understanding system to a Knowledge Graph. It ostensibly could answer questions like: “Does the iPhone 7 come in blue?” This project was probably the high point of my work at Huawei. It was right up my alley, having done similar things at Maluuba, and the people on my team were mostly capable PhDs who were easy to get along with.

As an aside, at one point I remember also being asked to listen in on a call between us and a team in Moscow that consisted of a professor and his grad student. They were competing with us to come up with an effective Natural Language Understanding system, and they made the mistake of relying on synthetic data to train their model. This resulted in a model that achieved 100% accuracy on their synthetic test data but then failed miserably against real-world data, which is something I had predicted might happen.

Anyways, we eventually put together the Question Answering system and sent it over to HQ in Shenzhen. After that I heard basically nothing about what they did, if anything, with it. An intern would later claim that my boss told her that they were using it, but I was not told this, and got no follow-up.

This brings me to the next odd thing about working at Huawei. As I learned at the orientation session when I transitioned to full-time permanent, there’s something roughly translated as “grayscale” in the operating practices of Huawei. In essence, you are only told what you need to know to do your work, and a lot of details are left ambiguous.

There’s also something called “horse-race culture,” which involves different teams within the company competing with one another to do the same thing. It always seemed inefficient to me, although I suppose that if you have the resources, it can make sense to use market-like forces to drive things.

Anyways, after a while, my boss, who was of a Human Computer Interaction (HCI) background, was able to secure funding to add an HCI team to the department, which also involved disbanding the NLP team and splitting people between the HCI team and the Computer Vision team that was the other team in the department originally. I ended up on the CV team.

The department, by the way, had been renamed the Big Data Analysis Lab for a while, and then eventually became a part of Noah’s Ark Lab — Canada.

So, my initial work on the CV team involved Video Description, which was a kind of hybrid of NLP and CV work. That project was eventually shelved, and I worked on an Audio Classifier until I had a falling out with my team leader that I won’t go into in much detail here. Suffice it to say, my old boss, who was now director of the department, protected me to an extent from the wrath of my team leader and switched me to working on the HCI team for a while. By then, though, I felt disillusioned with working at Huawei, and so in late 2019, I quietly asked for a buyout package and left, like many others who disliked the team leader and his style of leadership.

In any case, that probably isn’t too relevant to the news about Huawei. The news seems surprised that Huawei was able to get where it is. But I can offer an example of the mindset of people there. Once, when I was on lunch break, an older gentleman sat down across from me at the table and started talking to me about things. We got on the subject of HiSilicon and the chips. He told me that the first generation of chips was, to put it succinctly, crap. So was the second generation, and the third. But each generation they got slightly better, and they kept at it until the latest generation was in state-of-the-art phones.

Working at Huawei in general requires a certain mindset. There’s controversy with this company, and even though they pay exceptionally well, you also have to be willing to look the other way about the whole situation, to be willing to work at a place with a mixed reputation. Surprisingly perhaps, most of the people working there took pride in it. They either saw themselves as fighting a good fight for an underdog against something like the American imperialist complex, or they were exceedingly grateful to be able to do such cool work on such cool things. I was the latter. It was one of my few chances to do cool things with AI, and I took it.

The other thing is that Chinese nationals are very proud of Huawei. When I mentioned working at Huawei to westerners, I was almost apologetic. When I mentioned working at Huawei to Chinese nationals, they were usually very impressed. To them, Huawei is a champion of industry that shows that China can compete on the world stage. They generally don’t believe that a lot of the more controversial concerns, like the Uyghur situation, are even happening, or at least that they’ve been exaggerated by western propaganda.

Now I’ve hinted at some strange things with Huawei. I’ll admit that there were a few incidents that circumstantially made me wonder if there were connections between Huawei and the Chinese government or military. Probably the westerners in the audience are rolling their eyes at my naivety, that of course Huawei is an arm of the People’s Republic, and that I shouldn’t have worked at a company that apparently hacked and stole their way to success. But the reality is that my entire time at the company, I never saw anything that suggested backdoors or other obvious smoking guns. A lowly research scientist wouldn’t have been given a chance to find out about such things even if they were true.

I do know that at one point my boss asked how feasible a project to use NLP to automatically censor questionable mentions of Taiwan in social media would be, ostensibly to replace the crude keyword filters then in use with something able to tell the difference between an innocuous mention and a more questionable argument. I was immediately opposed to the ethics of the idea, and he dropped it right away.

I also know that some people on the HCI team were working on a project where they had diagrams of the silhouettes of a fighter jet pasted on the wall. I got the impression at the time they were working on gesture recognition controls for aircraft, but I’m actually not sure what they were doing.

Other than that, my time at Huawei seemed like that of a fairly normal tech company, one that was on the leading edge of a number of technologies and made up of quite capable and talented researchers.

So, when I hear about Huawei in western news, I tend to be jarred by the adversarial tone. The people working at Huawei are not mysterious villains. They are normal people trying to make a living. They have families and stories and make compromises with reality to hold a decent job. The geopolitics of Huawei tend to ignore all that though.

In the end, I don’t regret working there. It is highly unlikely anything I worked on was used for evil (or good, for that matter). Most of my projects were exploratory and probably didn’t lead to notable products anyway. But I had a chance to do very cool research work, and so I look back on that time fondly still, albeit tinged with uncertainty about whether, as a loyal Canadian citizen, I should have been there at all given the geopolitics.

Ultimately, the grand games of world history are likely to be beyond the wits of the average worker. I can only know that I had no other job offers on the table when I took the Huawei one, and it seems like it was the high point of my career so far. Nevertheless, I have mixed feelings, and I guess that can’t be helped.

Welcome To The World

Welcome to the world little one.
Welcome to a universe of dreams.
Your life is just beginning.
And your future is the stars.

Hello, how are you today?
Are you happy?
Can you hear me?
What are you dreaming about?

You are the culmination of many things.
Of the wishes of ancestors who toiled in the past.
Of the love between two silly cats.
And of mysterious fates that made you unique.

Your name is a famous world leader from history.
The wise sage who led a bygone empire.
A philosopher king if there ever was one.
Someone we hope you’ll aspire to.

The world today is not kind.
But I’ll do my best to protect you from the darkness.
So that your light will awaken the stars.
And you can be all that you can.

Welcome to the world little one.
The world is dreams.
Let your stay be brightness to all.
And may you feel the love that I do.

On The Morality Of Work

If you accept the idea that there is no ethical consumption or production under capitalism, a serious question arises: Should you work?

What does it mean to work? Generally, the average person is a wage earner. They sell their labour to an employer in order to afford food to survive. To work thus means to engage with the system, to be a part of society and contribute something that someone somewhere wants done in exchange for the means of survival.

Implicit in this is the reality that there is a fundamental, basic cost to living. Someone, somewhere, is farming the food that you eat, and in a very roundabout way, you are, by participating in the economy, returning the favour. This is ignoring the whole issue of capitalism’s merits. At the end of the day, the economy is a system that feeds and clothes and provides shelter, however imperfectly and unfairly. Even if it is not necessarily the most just and perfect system, it nevertheless provides most people with the amenities that allow a good life.

Thus, in an abstract sense, work is fair. It is fair that the time spent by people to provide food and clothing and shelter is paid back by your spending your time to earn a living, regardless of whatever form that takes. On a basic level, it’s at least minimally fair that you exchange your time and energy for other people’s time and energy. Capitalism may not be fair, but the basic idea of social production is right.

So, if you are able to, please work. Work because in an ideal society, work is your contribution to some common good. It is you adding to the overall utility by doing something that seems needed by someone enough that they’ll pay you for it. Even if in practice, the reality of the system is less than ideal, the fact is that on a basic level, work needs to be done by someone somewhere for people to live.

While you work, try to do so as morally as possible, by choosing insofar as it is possible the professions that are productive and useful to society, and making decisions that reflect your values rather than that of the bottom line. If you must participate in capitalism to survive, then at least try to be humane about it.

In Defence of Defiance Against The World’s Ills

“If you want to be perfect, go, sell your possessions and give to the poor, and you will have treasure in heaven. Then come, follow me.” – Jesus

In 1972, the famous Utilitarian moral philosopher Peter Singer published an essay titled: “Famine, Affluence, and Morality” that argued that we have a moral duty to help those in poverty far across the world. In doing so, he echoed a sentiment that Jesus shared almost two millennia prior, yet which most people who call themselves Christians today seem relatively unconcerned with.

From a deeply moral perspective, we live in a world that is fundamentally flawed and unjust. The painful truth is that the vast majority of humans on this Earth live according to a kind of survivorship bias, where the systems and beliefs that perpetuate are not those that are right, but those that enable people to survive long enough to procreate and instill a next generation in which things continue to exist.

For most people, life is hard enough that questioning whether the way things are is right is something of a privilege that they cannot afford. For others, this questioning requires a kind of soul searching that they shy away from because it would make them uncomfortable to even consider. It’s natural to imagine yourself the hero in your own story. To question this assumption is not easy.

But the reality is that almost all of us are in some sense complicit in the most senseless of crimes against humanity. When we participate in an economy to ensure we have food to eat, we are tacitly choosing to give permission to a system of relations that is fundamentally indifferent to the suffering of many. We compete with fellow human beings for jobs and benefit from their misery when we take one of only a limited number of spots in the workforce. We choose to allow those with disproportionate power to decide who gets to live a happier life. And those in power act to further increase their share of power, because to do anything else would lead to being outcompeted and their organization rendered extinct by the perverse incentives that dominate the system.

Given all this, what can one even begin to do about it? Most of us are not born into a position where we have the power to change the world. Our options are limited. To be moral, we would need to defy the very nature of existence. What can we do? If we sell everything we have and give to the poor, that still won’t change the nature of the world, even if it’s the most we could conceivably do.

What does it mean to defy destiny? What does it look like to try to achieve something that seems impossible?

What exists in opposition to this evil? What is good? What is right? What does it look like to live a pure and just life in a world filled with indifference and malice? What does it mean to take responsibility for one’s actions and the consequences of those actions?

Ultimately, it is not in our power to single-handedly change the world, but there are steps we can take to give voice to our values, to live according to what we believe to be right. This means making small choices about how we behave towards others. It means showing kindness and consideration in a world that demands cutthroat competition. It means taking actions that bring light into the world.

Even if we, by ourselves, cannot bring revolution, we can at least act according to the ideals we espouse. This can be as small as donating a modest amount to a charity in a far-off land that corrects a small amount of injustice by giving the poorest among us a bednet that protects them from malaria. If approximately $5500 worth of such things can save a life, and minimum wage can earn you $32,000 a year, then if you modestly donate 10% of that to this charity, you can save about one life every two years. If you work for 40 years, you can save about 23 lives this way. Those lives matter. They will be etched into eternity, like all lives worth living.
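As a back-of-the-envelope check, the arithmetic works out. Here is a minimal sketch, taking the $5500-per-life and $32,000-income figures above as givens:

```python
cost_per_life = 5500    # approximate cost of bednets etc. to save one life
annual_income = 32000   # approximate minimum-wage annual earnings
donation = 0.10 * annual_income               # $3,200 donated per year

years_per_life = cost_per_life / donation     # about 1.7, i.e. roughly one life every two years
career_lives = 40 * donation / cost_per_life  # about 23 lives over a 40-year working life
```

The exact figures shift with whatever cost-per-life estimate you trust, but the order of magnitude is the point: an ordinary income, modestly tithed, saves a double-digit number of lives.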

Admittedly, to do this requires participating in the system. You could also choose not to participate. But to do so would abandon your responsibilities for the sake of a kind of moral purity. In the end, you can do more good by living an ethical life, leading by example, and showing that there are ways of living where you strive to move beyond selfish competition and seek to cooperate and build up the world.

This is the path of true defiance. It does not surrender one’s life to the evils of egoism, or abandon the world to the lost. Instead it seeks to build something better through decisions that go against the grain, with the understanding that we are all living a mutual co-existence, and that our choices and decisions reflect who we are, our character as people.

We do not have to be perfect. It is enough to be good.

Practical Utilitarianism Cares About Relationships

Anyone reading my writings probably knows that I subscribe roughly to the moral theory of Utilitarianism. To me, we should be trying to maximize the happiness of everyone. Every sentient being should be considered important enough to be weighed in our moral calculus of right and wrong. In theory, this should mean we should place equal weight on every human being on this Earth. In practice however, there are considerations that need to be taken into account that complicate the picture.

Effective Altruism would argue that time and distance don’t matter, that you should help those who you can most effectively assist given limited resources. This usually leads to the recommendation of donating to charities in Africa for bednet or medication delivery as this is considered the most effective use of a given dollar of value. There is definitely merit to the argument that a dollar can go further in poverty-stricken Africa than elsewhere. However, I don’t think that’s the only consideration here.

Time and distance do matter to the extent that we as human beings have limited knowledge of things far away from us in time and space. With respect to donations to a distant country in dire need, there are reasonable uncertainties about the effectiveness of these donations, as many of the arguments in favour of them depend heavily on our trust of the analysis done by the charities working far away, that we cannot confirm or prove directly.

This uncertainty should function as a kind of discount rate on the value of the help we can give. A more nuanced and measured analysis thus suggests that we should donate some of our resources to those distant charities, but that we should also devote some of our resources to those closer to home whom we can directly see and assist and know that we are able to help. Our friends and family, with whom we have relationships that allow us to know their needs and wants, and what will best help them, are obvious candidates for this kind of help.

Similarly, concern for those in the distant future, while worthwhile to an extent, should not completely absolve us of our responsibilities to those near to us in time, whom we are much more certain we can directly help and affect in meaningful ways. The further away a possible being is in time, the more uncertain is their existence, after all.

This also means that we ourselves should value our own happiness and, being the best positioned to know how we ourselves can be happy, should take responsibility for our own happiness.

Thus, in practice, Utilitarianism, carefully considered, does not eliminate our social responsibilities to those around us, but rather reinforces these ties, as being important to understanding how best to make those around us happy.

Equal concern does not mean, in practice, equal duty. It means instead that we should expand our circle of concern to the entire universe, and that there is a balance of considerations that create responsibilities for us, magnified by our practical ability to know and help.

Those distant from us are still important. We should do what we reasonably can to help them. But those close to us put us in a position where we are uniquely responsible for what we know to be true.

In the end, it’s ultimately up to you to decide what matters to you, but may I suggest that you be open to helping both those close and far from you, whose needs you are aware of to varying degrees, and who deserve to be happy just like you.

A Heuristic For Future Prediction

In my experience, the most reliable predictive heuristic that you can use in daily life is something called Regression Towards The Mean. Basically, given that most relevant life events are the result of a mixture of skill and luck, there is a tendency for very positive events to be followed by more negative events, and for very negative events to be followed by more positive events. This is a statistical tendency that occurs over many events, so not every good event will be immediately followed by a bad one, but over time the trend tends towards a consistent average level rather than things being all good or all bad.

Another way to word this is to say that we should expect the average rather than the best or worst case scenarios to occur most of the time. To hope for the best or fear the worst are both, in this sense, unrealistic. The silver lining in here is that while our brightest hopes may well be dashed, our worst fears are also unlikely to come to pass. When things seem great, chances are things aren’t going to continue to be exceptional forever, but at the same time, when things seem particularly down, you can expect things to get better.

This heuristic tends to work in a lot of places, ranging from overperforming athletes suffering a sophomore jinx, to underachievers having a Cinderella story. In practice, these events simply reflect Regression Towards The Mean.
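The sophomore jinx is easy to simulate. Below is a minimal sketch under assumed numbers (a fixed skill of zero, luck drawn from a standard normal, and an arbitrary cutoff of 1.5 for an "exceptional" result): conditioning on a standout first performance, the follow-up performance falls back toward true skill on average.

```python
import random

random.seed(0)

SKILL = 0.0  # a hypothetical performer's true, stable ability

def outcome():
    # observed performance = stable skill + transient luck
    return SKILL + random.gauss(0, 1.0)

# each trial is a (first season, second season) pair of performances
trials = [(outcome(), outcome()) for _ in range(100_000)]

# condition on an exceptionally good first season...
standouts = [(a, b) for a, b in trials if a > 1.5]
avg_first = sum(a for a, _ in standouts) / len(standouts)
avg_second = sum(b for _, b in standouts) / len(standouts)

# ...and the second season regresses back toward SKILL:
# avg_first sits well above the cutoff, avg_second lands near zero
```

Nothing about the second season is actually worse; the exceptional first season simply included a large helping of luck that, on average, does not repeat.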

Over much longer periods of time, this oscillation tends to curve gradually upward. This is a result of Survivorship Bias. Things that don’t improve tend to stop existing after a while, so the only things that perpetuate in the universe tend to be things that make progress and improve in quality over time. The stock market is a crude example of this. The daily fluctuations tend to regress towards the mean, but the overall long term trend is one of gradual but inevitable growth.

Thus, even with Regression Towards The Mean, there is a bias towards progress that, in the long run, entails optimism about the future. We are a part of life, and life grows ever forward. Sentient beings seek happiness, avoid suffering, and act in ways that work to create a world state that fulfills our desires. Granted, there is much that is outside of our control, but the fact that there are things we can influence means that we can gradually, eventually, move towards the state of reality that we want to exist.

Even if by default we feel negative experiences more strongly than positive ones, our ability to take action allows us to change the ratio of positive to negative in favour of the positive. So the long term trend is towards good, even if the balance of things tends in the short run towards the average.

These dynamics mean that while the details may be unknowable, we can roughly predict the valence of the future, and as a heuristic, expecting things to be closer to average, with a slight bias towards better in the long run, tends to be a reliable prediction for most phenomena.

The Darkness And The Light

Sometimes you’re not feeling well. Sometimes the world seems dark. The way the world is seems wrong somehow. This is normal. It is a fundamental flaw in the universe that it is impossible to always be satisfied with the reality we live in. It comes from the reality of multiple subjects experiencing a shared reality.

If you were truly alone in the universe, it could be catered to your every whim. But as soon as there are two, it immediately becomes possible for goals and desires to misalign. This is a structural problem. If you don’t want to be alone, you must accept that other beings have values that can potentially differ from yours, and that they can act in ways contrary to your expectations.

The solution is, put simply, to find the common thread that allows us to cooperate rather than compete. The alternative is to end the existence of all other beings in the multiverse, which is not realistic nor moral. All of the world’s most pressing conflicts are a result of misalignment between subjects who experience reality from different angles of perception.

But the interesting thing is that there are Schelling points, focal points where divergent people can converge to find common ground and at least partially align in values and interests. Of historical interest, the idea of God is one such point. Regardless of the actual existence of God, the fact of the matter is that the perspective of an all-knowing, all-benevolent, impartial observer is something that multiple religions and philosophies have converged on, allowing a sort of cooperation in the form of some agreement over the Will of God and the common ideas that emerge from considering it.

Another similar Schelling point is the Tit-For-Tat strategy for the Iterated Prisoner’s Dilemma game in Game Theory. The strategy is one of opening with cooperate, then mirroring others and cooperating when cooperated with, and defecting in retaliation for defection, while offering immediate and complete forgiveness for future cooperation. Surprisingly, this extremely simple strategy wins tournaments and has echoes in various religions and philosophies as well. Morality is superrational.
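For concreteness, here is a minimal sketch of Tit-For-Tat in an iterated Prisoner's Dilemma. The 5/3/1/0 payoffs are the conventional textbook values, and the function names are illustrative helpers, not from any particular library:

```python
# PAYOFF[(my_move, their_move)] = (my_points, their_points)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # open with cooperation, then mirror the opponent's last move:
    # retaliate against defection, forgive renewed cooperation
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Against itself, Tit-For-Tat settles into steady mutual cooperation;
# against an unconditional defector, it loses only the opening round
# before retaliating for the rest of the match.
```

The design is striking for how little it needs: no memory beyond the last move, no modeling of the opponent, just reciprocity.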

Note however that this strategy depends heavily on repeated interactions between players. If one player is in such a dominant position as to be able to kill the other player by defecting, the strategy is less effective. In practice, Tit-For-Tat works best against close to equally powerful individuals, or when those individuals are part of groups that can retaliate even if the individual dies.

In situations of relative darkness, when people or groups are alone and vulnerable to predators killing in secret, the cooperative strategies are weaker than the more competitive strategies. In situations of relative light, when people are strong enough to survive a first strike, or there are others able to see such first strikes and retaliate accordingly, the cooperative strategies win out.

Thus, early history, with its isolated pockets of humanity facing survival or annihilation on a regular basis, was a period of darkness. As the population grows and becomes more interconnected, the world increasingly transitions into a period of light. The future, with the stars and space where everything is visible to everyone, is dominated by the light.

In the long run, cooperative societies will defeat competitive ones. In the grand scheme of things, Alliances beat Empires. However, in order for this stable equilibrium to be reached, certain inevitable but not immediately apparent conditions must first be met. The reason why the world is so messed up, why it seems like competition beats cooperation right now, is that the critical mass required for there to be light has not yet been reached.

We are in the growing pains between stages of history. Darkness was dominant for so long that it continues to echo into our present. The Light is nascent. It is beginning to reshape the world. But it is still in the process of emerging from the shadows of the past. In the long run, though, the Light will rise and usher in the next age of life.

Perplexity

It is the nature of reality that things are complicated. People are complicated. The things we assume to be true, may or may not be, and an honest person recognizes that the doubts are real. The uncertainty of truth means that no matter how strongly we strive for it, we can very much be wrong about many things. In fact, given that most matters have many possibilities, the base likelihood of getting things right is about 1/N, where N is the number of possibilities that the matter can have. As possibilities increase, our likelihood of being correct diminishes.

Thus, humility as a default position is wise. We are, on average, less than 50% likely to have accurate beliefs about the world. Most of the things we believe at any given time are probably wrong, or at least, not the exact truth. In that sense, Socrates was right.

That being said, it remains important to take reasonable actions given our rational beliefs. It is only by exploring reality and testing our beliefs that we can become more accurate and exceed the base probabilities. This process is difficult and fraught with peril. Our general tendency is to seek to reinforce our biases, rather than to seek truths that challenge them. If we seek to understand, we must be willing to let go of our biases and face difficult realities.

The world is complex. Most people are struggling just to survive. They don’t have the luxury to ask questions about right and wrong. To ask them to see the error of their ways is often tantamount to asking them to starve. The problem is not people themselves, but the system that was formed by history. The system is not a conscious being. It is merely a set of artifices that people built in their desperation to survive in a world largely indifferent to their suffering and happiness. This structure now stands and allows most people to survive, and sometimes to thrive, but it is optimized for basic survival rather than fairness.

A fair world is desirable, but ultimately one that is extraordinarily difficult to create. It’s a mistake to think that people were disingenuous when they tried, in the past, to create a better world for all. It seems they tried and failed, not for lack of intention, but because the challenge is far greater than imagined. Society is a complex thing. People’s motivations are varied and innumerable. Humans make mistakes with the best of intentions.

To move forward requires taking a step in the right direction. But how do we know what direction to take? It is at best an educated guess with our best intuitions and thoughts. But the truth is we can never be certain that what we do is best. The universe is like an imperfect information game. The unknowns prevent us from making the right move all the time in retrospect. We can only choose what seems like the best action at a given moment.

This uncertainty limits the power of all agents in the universe who lack the clarity of omniscience. It is thus an error to assign God-like powers to an AGI, for instance. But more importantly, it means that we should be cautious of our own confidence. What we know is very little. Anyone who says otherwise should be treated with suspicion.

The True Nature of Reality

It’s something we tend to grow up assuming is real. This reality, this universe that we see and hear around us, is always with us, ever present. But sometimes there are doubts.

There’s a thing in philosophy called the Simulation Argument. It posits that, given that our descendants will likely develop the technology to simulate reality someday, the odds are quite high that our apparent world is one of these simulations, rather than the original world. It’s a probabilistic argument, based on estimated odds of there being many such simulations.

A long time ago, I had an interesting experience. Back then, as a Christian, I wrestled with my faith and was at times mad at God for the apparent evil in this world. At one point, in a moment of anger, I took a pocket knife and made a gash in a world map on the wall of my bedroom. I then went on a camping trip, and overheard in the news that Russia had invaded Georgia. Upon returning, I found that the gash went straight through the border between Russia and Georgia. I’d made that gash exactly six days before the invasion.

Then there’s the memory I have of a “glitch in the Matrix”, so to speak. Many years ago, I was in a bad place mentally and emotionally, and I tried to open a second-floor window to get out of the house, which probably would have ended badly. It didn’t, because of a momentary change: the window, which had a crank to open it, suddenly became a solid frame with no crank or any way to open it. It happened for a split second. Just long enough for me to panic and throw my body against the frame, making such a racket as to attract the attention of someone who could stop me and calm me down.

I still remember this incident. At the time I thought it was some intervention by God or time travellers/aliens/simulators or some other benevolent higher power. Obviously I have nothing except my memory of this. There’s no real reason for you to believe my testimony. But it’s one reason among many why I believe the world is not as it seems.

Consider for a moment the case of the total solar eclipse. It’s a convenient thing to have occur, because it allowed Einstein’s General Theory of Relativity to be confirmed in 1919, when Arthur Eddington’s expedition measured the gravitational deflection of starlight passing near the sun, an effect only observable during a total eclipse. But total solar eclipses don’t have to be. They only happen because the sun is roughly 400 times the diameter of the moon and also roughly 400 times as far from the Earth. That is exactly the right ratio of size and distance for the moon to just cover the solar disk. Furthermore, due to gradual changes in the moon’s orbit, this coincidence only persists for a cosmologically short time frame of a few hundred million years, one that happens to coincide with the development of human civilization.

Note that this coincidence is immune to the Anthropic Principle because it is not essential to human existence. It is merely a useful coincidence.
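For the curious, the near-coincidence of these ratios is easy to check with standard rounded astronomical values (the specific figures below are common approximations, not taken from this post):

```python
# Checking the ~400:400 eclipse coincidence with rounded mean values.
# These constants are standard approximations (assumed, not from the post).
SUN_DIAMETER_KM = 1_391_400
MOON_DIAMETER_KM = 3_474.8
SUN_DISTANCE_KM = 149_600_000  # mean Earth-Sun distance
MOON_DISTANCE_KM = 384_400     # mean Earth-Moon distance

size_ratio = SUN_DIAMETER_KM / MOON_DIAMETER_KM      # ~400
distance_ratio = SUN_DISTANCE_KM / MOON_DISTANCE_KM  # ~389

print(f"size ratio: {size_ratio:.0f}, distance ratio: {distance_ratio:.0f}")
```

Because the moon’s orbit is elliptical, the apparent sizes drift around these means, which is why some eclipses are annular rather than total.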

Another fun coincidence is the naming of the arctic and antarctic. The arctic is named after the bear constellations Ursa Major and Minor, which appear in the northern sky. “Antarctic” literally means “opposite of the arctic”. Coincidentally, polar bears can be found in the arctic, but no species of bear is found in the antarctic.

There are probably many more interesting coincidences like this, little Easter eggs that have been left for us to notice.

The true nature of our reality is probably something beyond our comprehension. There are hints at it however, that make me wonder about the implications. So, I advise you to keep an open mind about the possible.

Energy Efficiency Trends in Computation and Long-Term Implications

Note: The following is a blog post I wrote as part of a paid written work trial with Epoch. For probably obvious reasons, I didn’t end up getting the job, but they said it was okay to publish this.

Historically, one of the major reasons machine learning was able to take off in the past decade was the use of Graphics Processing Units (GPUs) to dramatically accelerate training and inference.  In particular, Nvidia GPUs have been at the forefront of this trend, as most deep learning libraries such as TensorFlow and PyTorch initially relied quite heavily on implementations that made use of the CUDA framework.  The CUDA ecosystem remains strong, such that Nvidia commands an 80% market share of data center GPUs according to a report by Omdia (https://omdia.tech.informa.com/pr/2021-aug/nvidia-maintains-dominant-position-in-2020-market-for-ai-processors-for-cloud-and-data-center).

Given the importance of hardware acceleration in the timely training and inference of machine learning models, it might naively seem useful to look at the raw computing power of these devices in terms of FLOPS.  However, due to the massively parallel nature of modern deep learning algorithms, it is relatively trivial to scale up model processing by simply adding additional devices, taking advantage of both data and model parallelism.  Thus, raw computing power isn’t really the proper limit to consider.

What’s more appropriate is to instead look at the energy efficiency of these devices in terms of performance per watt.  In the long run, energy constraints have the potential to be a bottleneck, as power generation requires substantial capital investment.  Notably, data centers currently account for about 2% of U.S. electricity use (https://www.energy.gov/eere/buildings/data-centers-and-servers).

For the purposes of simplifying data collection and as a nod to the dominance of Nvidia, let’s look at the energy efficiency trends in Nvidia Tesla GPUs over the past decade.  Tesla GPUs are chosen because Nvidia has a policy of not selling their other consumer grade GPUs for data center use.

The data for the following was collected from Wikipedia’s page on Nvidia GPUs (https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units), which summarizes information that is publicly available from Nvidia’s product datasheets on their website.  A floating point precision of 32-bits (single precision) is used for determining which FLOPS figures to use.

A more thorough analysis would probably also look at Google TPUs and AMD’s lineup of GPUs, as well as Nvidia’s consumer-grade GPUs.  The analysis provided here can be seen more as a snapshot of the typical GPU most commonly used in today’s data centers.

Figure 1:  The performance per watt of Nvidia Tesla GPUs from 2011 to 2022, in GigaFLOPS per Watt.

Notably, the trend is positive.  While the wattages of individual cards have increased slightly over time, performance has increased faster.  Interestingly, the efficiency of these cards exceeds that of the most energy-efficient supercomputers on the Green500 list for the same year (https://www.top500.org/lists/green500/).

An important consideration in all this is that energy efficiency is believed to have a hard physical limit, known as the Landauer Limit (https://en.wikipedia.org/wiki/Landauer%27s_principle), which derives from the relationship between entropy and information processing.  Although efforts have been made to develop reversible computation that could, in theory, get around this limit, it is not clear that such technology will ever be practical, as all proposed forms seem to trade off the energy savings against substantial costs in space and time complexity (https://arxiv.org/abs/1708.08480).

Space complexity requires additional memory storage, and time complexity requires additional operations to perform the same effective calculation.  Both translate into energy costs in practice, whether it be the matter required to store the additional data or the opportunity cost of wasted operations.

More generally, it can be argued that useful information processing is efficient because it compresses information, extracting signal from noise and filtering away irrelevant data.  Neural networks, for instance, rely on neural units that take in many inputs and generate a single output value that is propagated forward.  This efficient aggregation of information is what makes neural networks powerful.  Reversible computation in some sense reverses this efficiency, making its practicality questionable.

Thus, it is perhaps useful to know how close we are to approaching the Landauer Limit with our existing technology, and when to expect to reach it.  The Landauer Limit works out to 87 TeraFLOPS per watt, assuming 32-bit floating point precision at room temperature.
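As a sanity check, the per-bit energy floor behind the Landauer Limit can be computed directly from physical constants.  Note that converting this into a FLOPS-per-watt figure like the one above additionally requires an assumption about how many bit erasures a 32-bit floating point operation entails; the sketch below stops at the per-bit quantity:

```python
import math

# Landauer's principle: erasing one bit dissipates at least k*T*ln(2) joules.
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)
T_ROOM = 300.0              # approximate room temperature, K

e_per_bit = K_BOLTZMANN * T_ROOM * math.log(2)  # ~2.87e-21 J per bit erasure
erasures_per_joule = 1.0 / e_per_bit            # ~3.5e20 bit erasures per joule

print(f"{e_per_bit:.3e} J per bit erasure")
```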

Previous research to that end has proposed Koomey’s Law (https://en.wikipedia.org/wiki/Koomey%27s_law), which began as an expected doubling of energy efficiency every 1.57 years, but has since been revised down to once every 2.6 years.  Figure 1 suggests that for Nvidia Tesla GPUs, it’s even slower.

Another interesting reason why energy efficiency may be relevant has to do with the real world benchmark of the human brain, which is believed to have evolved with energy efficiency as a critical constraint.  Although the human brain is obviously not designed for general computation, we are able to roughly estimate the number of computations that the brain performs, and its related energy efficiency.  Although the error bars on this calculation are significant, the human brain is estimated to perform at about 1 PetaFLOPS while using only 20 watts (https://www.openphilanthropy.org/research/new-report-on-how-much-computational-power-it-takes-to-match-the-human-brain/).  This works out to approximately 50 TeraFLOPS per watt.  This makes the human brain less powerful strictly speaking than our most powerful supercomputers, but more energy efficient than them by a significant margin.
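The brain’s efficiency figure quoted above follows directly from those two estimates (a rough calculation with large error bars, as noted):

```python
# Rough brain energy-efficiency estimate from the figures cited above.
BRAIN_FLOPS = 1e15   # ~1 PetaFLOPS (Open Philanthropy estimate)
BRAIN_WATTS = 20.0   # approximate brain power draw

brain_flops_per_watt = BRAIN_FLOPS / BRAIN_WATTS  # 5e13 = 50 TeraFLOPS/W
print(f"{brain_flops_per_watt / 1e12:.0f} TeraFLOPS per watt")
```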

Note that this is actually within an order of magnitude of the Landauer Limit.  Note also that the human brain is roughly two and a half orders of magnitude more efficient than the most efficient Nvidia Tesla GPUs as of 2022.

On a grander scope, the question of energy efficiency is also relevant to the question of the ideal long-term future.  There is a scenario in Utilitarian moral philosophy known as the Utilitronium Shockwave, in which the universe is hypothetically converted into the most computationally dense possible matter, on which happiness emulations are run to theoretically maximize happiness.  This scenario is occasionally conjured up as a challenge to Utilitarian moral philosophy, but it would look very different if the most computationally efficient form of matter already existed in the form of the human brain.  In that case, the ideal future would correspond to an extraordinarily vast number of humans living excellent lives.  Thus, if the human brain is in effect at the Landauer Limit in terms of energy efficiency, and the Landauer Limit holds against efforts toward reversible computing, we can argue in favour of this desirable, human-filled future.

In reality, due to entropy, it is energy that ultimately constrains the number of sentient entities that can populate the universe, rather than space, which is much more vast and largely empty.  So, energy efficiency would logically be much more critical than density of matter.

This also has implications for population ethics.  Assuming that entropy cannot be reversed, and the cost of living and existing requires converting some amount of usable energy into entropy, then there is a hard limit on the number of human beings that can be born into the universe.  Thus, more people born at this particular moment in time implies an equivalent reduction of possible people in the future.  This creates a tradeoff.  People born in the present have potentially vast value in terms of influencing the future, but they will likely live worse lives than those who are born into that probably better future.

Interesting philosophical implications aside, the shrinking gap between GPU efficiency and the human brain sets a potential timeline.  Once this efficiency gap is bridged, computers will be as energy efficient as human brains, and it should then be possible to emulate a human mind on hardware, essentially yielding a synthetic human as economical as a biological one.  This is comparable to the Ems that the economist Robin Hanson describes in his book, The Age of Em.  The possibility of duplicating copies of human minds comes with its own economic and social considerations.

So, how far away is this point?  Given the trend observed with GPU efficiency growth, it looks like a doubling occurs about every three years.  At that rate, an order of magnitude improvement takes roughly ten years, and two and a half orders of magnitude takes roughly twenty-five years.  As mentioned, two and a half orders of magnitude is the current distance between existing GPUs and the human brain.  Thus, we can roughly anticipate bridging that gap sometime around 2050, and reaching the Landauer Limit not long thereafter.
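The extrapolation arithmetic can be sketched as follows, assuming the three-year doubling time and a 2022 starting point:

```python
import math

# Years to close a 2.5 order-of-magnitude efficiency gap,
# given a doubling in efficiency every ~3 years.
DOUBLING_TIME_YEARS = 3.0
GAP_ORDERS_OF_MAGNITUDE = 2.5
START_YEAR = 2022

years_per_order = DOUBLING_TIME_YEARS * math.log2(10)       # ~10 years
years_to_close = GAP_ORDERS_OF_MAGNITUDE * years_per_order  # ~25 years

print(f"gap closes around {START_YEAR + years_to_close:.0f}")
```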

Most AI safety timelines are much sooner than this, however, so it is likely that we will have to deal with aligning AGI before either the potential boost that could come from synthetic human minds or the potential barrier of the Landauer Limit slowing down AI capabilities development.

In terms of future research, a logical next step would be to look at how quickly the overall power consumption of data centers is increasing, and at the current growth rate of electricity production, to see to what extent they are sustainable and whether improvements in energy efficiency will be outpaced by demand.  If so, that could slow the pace of machine learning research that relies on very large models trained with massive amounts of compute.  This is in addition to other potential limits, such as the rate of data generation for large language models, which at this point depend on massive datasets drawn from essentially the entire Internet.

Modern computation is not free.  It requires available energy to be expended and converted to entropy.  Barring radical new innovations like practical reversible computers, this has the potential to be a long-term limiting factor in the advancement of machine learning technologies that rely heavily on parallel processing accelerators like GPUs.

