[Translation] The Bitter Lesson

Ya-Liang Allen Chang
Dec 18, 2019 · 17 min read


A colleague recently shared an article that was hotly discussed on the ML subreddit a while back. It is about the development of artificial intelligence and is close to my own views (though the author puts it far more clearly), so I am attempting a translation here, with my own annotations mixed in. Please point out any mistakes.

Original source: http://incompleteideas.net/IncIdeas/BitterLesson.html

Original author: Rich Sutton (DeepMind)

The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore’s law, or rather its generalization of continued exponentially falling cost per unit of computation. Most AI research has been conducted as if the computation available to the agent were constant (in which case leveraging human knowledge would be one of the only ways to improve performance) but, over a slightly longer time than a typical research project, massively more computation inevitably becomes available. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation. These two need not run counter to each other, but in practice they tend to. Time spent on one is time not spent on the other. There are psychological commitments to investment in one approach or the other. And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. There were many examples of AI researchers’ belated learning of this bitter lesson, and it is instructive to review some of the most prominent.

The biggest lesson from 70 years of AI research: general methods that leverage computation are ultimately the most effective way to solve problems (translator's note: compared with specialized methods built from human knowledge), and by a large margin.

This can be attributed to Moore's law, or rather to the continued exponential fall in the cost per unit of computation. Most AI research has been conducted as if the available computation were fixed (in which case leveraging human knowledge to design the model is one of the only ways to improve results), but over a period only slightly longer than a typical research project, vastly more computation inevitably becomes available. To make a difference in the short term, researchers tend to design methods that exploit human knowledge of the domain; in the long run, however, the only thing that matters is leveraging computation. The two directions need not conflict, but in practice they tend to: time spent on one is time not spent on the other, and each demands its own psychological investment. Moreover, methods built on human knowledge tend to become more complicated, which makes them less suited to taking advantage of general methods that leverage computation.
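(Translator's note: to make "exponentially falling cost per unit of computation" concrete, here is a back-of-the-envelope sketch in Python. The two-year doubling period is my own illustrative assumption, not a figure from the article.)

```python
# Illustrative arithmetic only: assume the cost of computation halves every two years
# (the doubling period is this sketch's assumption, not a number from the article).

def compute_per_dollar(years, doubling_period_years=2.0):
    """Factor by which computation per dollar grows over `years`."""
    return 2 ** (years / doubling_period_years)

for years in (3, 10, 20):
    print(f"after {years:2d} years: {compute_per_dollar(years):7.1f}x computation per dollar")
```

Even within a single three-year project the same budget buys almost three times as much computation, and over twenty years roughly a thousand times as much; that is the arithmetic behind betting on computation rather than on hand-built knowledge.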

Many AI researchers learned this bitter lesson only belatedly; let us review some of the most prominent examples.

In computer chess, the methods that defeated the world champion, Kasparov, in 1997, were based on massive, deep search. At the time, this was looked upon with dismay by the majority of computer-chess researchers who had pursued methods that leveraged human understanding of the special structure of chess. When a simpler, search-based approach with special hardware and software proved vastly more effective, these human-knowledge-based chess researchers were not good losers. They said that "brute force" search may have won this time, but it was not a general strategy, and anyway it was not how people played chess. These researchers wanted methods based on human input to win and were disappointed when they did not.

In computer chess, the method that defeated world champion Kasparov in 1997 was based on massive, deep search.

At the time this result was dismaying, because most mainstream researchers had designed their algorithms around human understanding of the special structure of chess. When a simpler, search-based approach, combined with special hardware and software, proved vastly more effective, these computer-chess researchers could not quite accept it. They said that "brute force" search may have won this time, but it was not a general strategy, and it was not how people play chess anyway. These researchers had wanted a method based on human knowledge to win, and the one that finally did left them disappointed.
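(Translator's note: the article never shows what a game-tree search actually looks like, so here is a minimal sketch I wrote in Python: plain depth-first minimax on tic-tac-toe rather than chess, so that it stays self-contained. Real programs such as Deep Blue add alpha-beta pruning, handcrafted evaluation functions, and special hardware, but the core idea of exhaustively searching the tree is the same.)

```python
# Minimal "brute force" game-tree search: exhaustive minimax on tic-tac-toe.
WIN_LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'X' or 'O' if a line is completed, else None. Board is a 9-char string."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, move) from `player`'s point of view: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return (1 if w == player else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None                        # board full with no winner: a draw
    opponent = "O" if player == "X" else "X"
    best_score, best_move = -2, None
    for m in moves:
        child = board[:m] + player + board[m + 1:]
        score = -minimax(child, opponent)[0]  # zero-sum: the opponent's gain is our loss
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move

if __name__ == "__main__":
    # Searches several hundred thousand positions (a few seconds); prints (0, 0),
    # i.e. perfect play by both sides is a draw.
    print(minimax(" " * 9, "X"))
```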

A similar pattern of research progress was seen in computer Go, only delayed by a further 20 years. Enormous initial efforts went into avoiding search by taking advantage of human knowledge, or of the special features of the game, but all those efforts proved irrelevant, or worse, once search was applied effectively at scale. Also important was the use of learning by self play to learn a value function (as it was in many other games and even in chess, although learning did not play a big role in the 1997 program that first beat a world champion). Learning by self play, and learning in general, is like search in that it enables massive computation to be brought to bear. Search and learning are the two most important classes of techniques for utilizing massive amounts of computation in AI research. In computer Go, as in computer chess, researchers’ initial effort was directed towards utilizing human understanding (so that less search was needed) and only much later was much greater success had by embracing search and learning.

A similar pattern played out in computer Go, only delayed by a further 20 years.

Enormous initial efforts went into avoiding search by exploiting human knowledge or the special features of the game, but once search could be applied effectively at scale, all of that effort proved irrelevant, or worse. Just as important as search was learning a value function through self-play (this was also used in many other games, and even in chess, though learning did not play a big role in the 1997 program that first beat a world champion). Learning by self-play, and machine learning in general, is like search in that it can bring massive amounts of computation to bear. Search and learning are the two most important classes of techniques for exploiting massive computation in AI research. In computer Go, as in computer chess, researchers' initial methods were built on human understanding (so that less search was needed), and only much later did embracing search and learning bring far greater success.
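(Translator's note: as an illustration of "learning a value function by self-play", here is a toy sketch: tabular learning on a simple subtraction game, chosen because it fits in a few lines. AlphaGo-style systems replace the table with a deep network and combine it with tree search, but the "improve by playing against yourself" loop is the analogous idea.)

```python
# Learning an action-value table by self-play on a toy subtraction game:
# 21 stones, each turn remove 1-3, whoever takes the last stone wins.
import random

N_STONES, ACTIONS = 21, (1, 2, 3)
Q = {(s, a): 0.0 for s in range(1, N_STONES + 1) for a in ACTIONS if a <= s}
ALPHA, EPSILON = 0.1, 0.1

def choose(state, greedy=False):
    """Epsilon-greedy move selection over the legal actions in `state`."""
    legal = [a for a in ACTIONS if a <= state]
    if not greedy and random.random() < EPSILON:
        return random.choice(legal)
    return max(legal, key=lambda a: Q[(state, a)])

for _ in range(50_000):
    state, prev = N_STONES, None               # prev = the other player's last (state, action)
    while True:
        action = choose(state)
        nxt = state - action
        if nxt == 0:                           # took the last stone: this player wins (+1)
            Q[(state, action)] += ALPHA * (1.0 - Q[(state, action)])
            if prev:                           # ...so the other player's last move loses (-1)
                Q[prev] += ALPHA * (-1.0 - Q[prev])
            break
        # Otherwise the opponent moves from `nxt`; our value is minus their best value.
        best_next = max(Q[(nxt, a)] for a in ACTIONS if a <= nxt)
        Q[(state, action)] += ALPHA * (-best_next - Q[(state, action)])
        state, prev = nxt, (state, action)

# The learned policy should usually take 1 from 21 stones, leaving a multiple of 4.
print(choose(21, greedy=True))
```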

In speech recognition, there was an early competition, sponsored by DARPA, in the 1970s. Entrants included a host of special methods that took advantage of human knowledge---knowledge of words, of phonemes, of the human vocal tract, etc. On the other side were newer methods that were more statistical in nature and did much more computation, based on hidden Markov models (HMMs). Again, the statistical methods won out over the human-knowledge-based methods. This led to a major change in all of natural language processing, gradually over decades, where statistics and computation came to dominate the field. The recent rise of deep learning in speech recognition is the most recent step in this consistent direction. Deep learning methods rely even less on human knowledge, and use even more computation, together with learning on huge training sets, to produce dramatically better speech recognition systems. As in the games, researchers always tried to make systems that worked the way the researchers thought their own minds worked---they tried to put that knowledge in their systems---but it proved ultimately counterproductive, and a colossal waste of researcher’s time, when, through Moore’s law, massive computation became available and a means was found to put it to good use.

In speech recognition, DARPA sponsored a competition in the 1970s. Entrants included many specialized methods that took advantage of human knowledge: knowledge of words, phonemes, the human vocal tract, and so on. On the other side were newer, more statistical methods that did far more computation, based on hidden Markov models (HMMs). Once again, the statistical methods won out over the human-knowledge-based ones.

This led to a major change across all of natural language processing: over the following decades, statistics and computation gradually came to dominate the field. The recent rise of deep learning in speech recognition is the latest step in this consistent direction. Deep learning methods rely even less on human knowledge and use even more computation, together with learning on huge training sets, to produce dramatically better speech recognition systems. As in the games, researchers kept trying to make systems work the way they believed their own minds worked, trying to build that knowledge into their systems; but once Moore's law delivered massive computation and ways were found to put it to good use, those knowledge-based systems proved counterproductive and a colossal waste of researchers' time.
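(Translator's note: a minimal sketch of the forward algorithm for a hidden Markov model, the dynamic-programming core of those statistical speech systems. The states, symbols, and probabilities below are invented purely for illustration.)

```python
# Forward algorithm on a toy two-state HMM
# (hidden states: "vowel" vs. "consonant"; all probabilities invented for illustration).
import numpy as np

pi = np.array([0.6, 0.4])          # initial state distribution
A = np.array([[0.7, 0.3],          # A[i, j] = P(next state j | current state i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],          # B[i, k] = P(observed symbol k | state i)
              [0.2, 0.8]])

def forward(obs, pi, A, B):
    """Probability of an observed symbol sequence under the HMM (dynamic programming)."""
    alpha = pi * B[:, obs[0]]           # alpha[i] = P(first symbol, first state = i)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # sum over previous states, then account for the emission
    return alpha.sum()

print(f"P(sequence) = {forward([0, 1, 0], pi, A, B):.4f}")
```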

In computer vision, there has been a similar pattern. Early methods conceived of vision as searching for edges, or generalized cylinders, or in terms of SIFT features. But today all this is discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariances, and perform much better.

In computer vision there is a similar story. Early methods searched for edges, generalized cylinders, or SIFT features, but all of these have since been discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariances, and perform much better.
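(Translator's note: the basic operation those networks stack is the convolution. Here is a direct NumPy sketch; real frameworks implement the same idea far more efficiently, and, like most of them, this actually computes a cross-correlation.)

```python
# A direct 2D "convolution" (strictly, cross-correlation, as in most deep-learning
# libraries): the primitive that modern vision networks stack in place of
# hand-designed edge detectors or SIFT pipelines.
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with a small kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy image whose right half is bright, filtered with a vertical-edge kernel:
image = np.array([[0, 0, 0, 9, 9, 9]] * 6, dtype=float)
edge_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)
print(conv2d(image, edge_kernel))   # strong responses along the vertical edge, zeros elsewhere
```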

This is a big lesson. As a field, we still have not thoroughly learned it, as we are continuing to make the same kind of mistakes. To see this, and to effectively resist it, we have to understand the appeal of these mistakes. We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.

We must take these stories as a warning. That our field keeps making the same kind of mistakes shows we have not yet thoroughly learned the lesson. To see these mistakes clearly, and to avoid repeating them, we have to understand their appeal. We must absorb this bitter lesson: in the long run, building in how we think we think does not work. The bitter lesson rests on these historical observations: 1) AI researchers have often tried to build their own knowledge into their agents; 2) this always helps in the short term and is personally satisfying to the researcher; but 3) in the long run it plateaus and even inhibits further progress; and 4) breakthrough progress eventually arrives instead through the opposing approach of scaling up search and learning. The eventual success is tinged with bitterness and hard to swallow, because it is not built on the favored, human-centric approach.

One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.

The first thing to learn from the bitter lesson is the great power of general-purpose methods: methods that keep scaling as the available computation grows, even when that computation becomes very great. The two methods that appear to scale arbitrarily in this way are search and learning.

The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done.

The second thing to learn from the bitter lesson is that the actual contents of minds are more complex than we can imagine. We should stop looking for simple ways to think about those contents, such as simple ways to think about space, objects, multiple agents, or symmetries; all of these belong to the arbitrary, intrinsically complex outside world. They should not be built in, because their complexity is endless. Instead, we should build in only the meta-methods that can find and capture this arbitrary complexity. What matters about these methods is that they can find good approximations, but the search for those approximations should be carried out by our methods, not by us. We want AI agents that can discover the way we can, not agents that merely contain what we have already discovered; building our discoveries in only makes it harder to see how the discovering process itself can be done.

Reflections

Translation is not easy: besides understanding the original well, you have to restate it in clear Chinese. Translating word by word often produces awkward, hard-to-follow sentences. Google Translate has gotten much better than before; it seems to translate at the sentence level rather than mapping word for word, which reads far more smoothly, though there are still plenty of errors and redundancies.

Building Strong Artificial Intelligence has been my dream since childhood; I have always been fascinated by minds and computers. But early in that exploration I realized how far we still are from it. The recent boom in AI has only used statistics, search, and learning to make computers approach human-level behavior on various tasks (object recognition, face recognition, speech recognition, question answering, autonomous driving, and so on); nothing yet deserves to be called thought or a mind. Still, I believe that as computation grows and machine learning methods improve, computers will gradually approach human performance on task after task. One day this collection of narrow AIs may become an agent that passes the Turing test in every respect, or become an AI that can genuinely think, or we may discover that these two are really the same thing: after all, logical reasoning in the prefrontal cortex, like vision, is just the result of some neurons firing.
