Some people think that, with the advent of generative AI, the era of human-generated knowledge is over or nearly over. But I see no reason to believe that. Imagine that we could gather all the knowledge humanity had 150 years ago (circa 1875) and train the best generative AI model on it. Would it be able to invent all that we’ve invented since then?

I find that extremely unlikely.

In 1875, the scientific world already had an impressive intellectual foundation. Maxwell’s equations had just unified electricity and magnetism. Thermodynamics was well established. Darwin’s theory of evolution had transformed biology. The periodic table had been proposed. Steam power ran the industrial economy. Telegraph networks spanned continents. Railroads connected nations. From the perspective of that era, it was not unreasonable to believe that most of the fundamental principles of nature were already known.

Yet much of the modern world depends on conceptual breakthroughs that were not even implicit in the knowledge of the time.

Take relativity. In 1875, physicists believed space and time were absolute and that light propagated through a hypothetical “luminiferous ether.” Nothing in the textbooks of the period suggested that time itself could slow down or that space could bend. Einstein’s special relativity (1905) and general relativity (1915) required abandoning deeply entrenched assumptions. It is therefore difficult to see how a system trained only on 19th-century physics—whose sources all assumed an ether and absolute time—would have inferred curved spacetime.

Quantum mechanics is an even clearer example. Around 1875, atoms were still debated and largely treated as theoretical conveniences. Electrons had not yet been discovered. Spectral lines were mysterious empirical facts. The idea that energy is quantized, that particles behave like waves, or that measurement plays a fundamental role in physical systems would have sounded bizarre. Yet the quantum framework developed between 1900 and the 1930s became the basis for nearly all modern electronics.

Consider what followed from those developments: the transistor (1947), integrated circuits, semiconductor lasers, and ultimately the entire digital computing industry. A model trained only on pre-1900 knowledge would not have encountered the solid-state physics required to even frame the problem.

Biology provides similar examples. In 1875, no one knew that DNA carried hereditary information. Mendel’s work on inheritance had been published but was largely ignored and not rediscovered until 1900. The structure of DNA was determined only in 1953. From that discovery grew molecular biology, recombinant DNA technology, modern biotechnology, and genome sequencing. These developments depended not only on prior knowledge but on new experimental tools and new conceptual frameworks.

Medical science tells the same story. Germ theory was only beginning to gain acceptance in the 1870s, and much of what physicians believed to be established medical knowledge at the time was later shown to be mistaken or even dangerous. Many doctors still attributed disease to “miasmas,” or bad air, rather than microorganisms, which led to public health efforts that focused on odors rather than sanitation. Surgical practice was only beginning to adopt antiseptic techniques; until Joseph Lister’s methods slowly gained acceptance, many physicians resisted sterilization, and postoperative infections were often treated as unavoidable.

Treatments that were considered standard care could also be harmful. Mercury compounds were widely prescribed for conditions such as syphilis despite their severe toxicity. Bloodletting—based on centuries-old theories about balancing bodily humors—had only recently begun to decline after contributing to countless deaths. Even infant care reflected misconceptions: physicians frequently recommended early artificial feeding with contaminated milk, which contributed to high infant mortality before the development of pasteurization and modern pediatrics. In other words, the medical literature of the era contained not only missing knowledge but entrenched frameworks that actively pointed researchers in the wrong direction.

Even technologies that might appear incremental depended on discoveries that would have been hard to anticipate. Aviation required advances in aerodynamics and control systems that the Wright brothers developed through systematic experimentation in the early 1900s. Nuclear energy required the discovery of radioactivity (1896), the nucleus (1911), and nuclear fission (1938). Spaceflight depended on rocket science, materials science, guidance systems, and decades of engineering iteration.

None of this means a hypothetical 1875 AI would have been useless. It might have helped organize knowledge, summarize scientific debates, or suggest plausible combinations of existing ideas. It might have accelerated incremental improvements—better steam engines, improved metallurgy, or more efficient telegraph networks. But there is a difference between recombining known ideas and discovering phenomena that no one yet understands.

That thought experiment suggests a useful lesson for today. Modern generative AI systems are extraordinarily capable at synthesizing and recombining existing knowledge. They can summarize literature, generate code, assist with writing, and help researchers explore large spaces of possibilities. In many fields, they may substantially accelerate the pace of incremental progress.

But the frontier of knowledge is not just a larger version of the past. As it did 150 years ago, it contains phenomena we have not yet observed, conceptual frameworks we have not yet invented, and experimental results that may contradict our current models. Advancing that frontier still requires curiosity, experimentation, and the willingness to question prevailing assumptions. We should not be so arrogant as to think that we have already discovered all the key phenomena there are to find. Generative AI is therefore best understood as a complement to human knowledge generation, not a replacement for it.
