Since last fall, when the groundbreaking artificial intelligence called ChatGPT was introduced, a question has been taking shape in the back of my mind. ChatGPT's feats are impressive. It can pass high school Advanced Placement tests. It outperforms most humans on the bar exam attorneys take to practice law. It writes essays, software, resumes, emails, and songs. What if, I wondered, there were something that understood all the research and analysis and insight ever published on wooden planes? What would I ask it? So I created a chatbot powered by the ChatGPT AI and trained it on research published in The Chronicle, the U.S.'s leading journal on antique tools.
Screenshots of some of the questions I posed to the chatbot can be found at the end of this newsletter.
ChatGPT uses what's called generative AI. That means it creates new and original text based on examples of what it has previously been shown. The AI was given billions of pages of data scraped from the internet and then taught to recognize patterns of words that are likely to appear together. As the AI repeatedly trained on that data, it learned to recognize increasingly complex patterns, picking up countless fragments of information along the way. It is very, very good at generating human-like responses based on that pattern recognition.
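For the technically curious, the core idea can be sketched in a few lines. What follows is a toy illustration only; real systems use neural networks trained on billions of pages, not simple word counts, but the flavor of "words that are likely to appear together" is the same.

```python
from collections import Counter, defaultdict
import random

# Toy next-word prediction: count which word tends to follow which
# in a tiny corpus, then generate new text from those counts.
corpus = "the plane cuts the wood the plane shapes the wood".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

word = "the"
output = [word]
for _ in range(6):
    nexts = follows[word]
    # Pick the next word in proportion to how often it followed this one.
    word = random.choices(list(nexts), weights=list(nexts.values()))[0]
    output.append(word)

print(" ".join(output))  # e.g. "the plane shapes the wood the plane"
```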
The AI that powers ChatGPT is available to software developers as a stand-alone tool, which means I was able to write a simple but powerful chatbot with my own training data. Think of it like a car: The body of the car and the steering wheel are the chatbot program I wrote. The engine is the ChatGPT artificial intelligence. The fuel is a series of 12 stories about wooden planes that ran in The Chronicle between 2017 and 2021.¹
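For readers who want to peek under the hood, the "body and steering wheel" amount to something like the sketch below. This is not my actual program, just one plausible way to wire the same pieces together; it assumes the openai Python library as it existed in 2023, and the file name, prompt wording, and sample question are placeholders.

```python
import openai

openai.api_key = "sk-..."  # your OpenAI API key

# The fuel: locally stored article text (placeholder file name).
with open("chronicle_articles.txt", encoding="utf-8") as f:
    articles = f.read()

def ask(question: str) -> str:
    """Send a question to the ChatGPT engine along with the articles."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            # The articles ride along as context with every question.
            {"role": "system",
             "content": "Answer using only the articles below.\n\n" + articles},
            {"role": "user", "content": question},
        ],
        temperature=0,  # prefer the most likely answer over a creative one
    )
    return response["choices"][0]["message"]["content"]

print(ask("Were spring lines used on bench planes?"))
```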
Why so few articles? While ChatGPT normally has terabytes of data to work with, the amount of data you can feed your own chatbot is limited to the equivalent of about 25,000 words. That is nowhere near my grand vision. But the stories are a representative cross-section of the type of research that underpins our collective knowledge of wooden planes, which makes them ideal training data.
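The limit comes from the model's context window, which is measured in tokens rather than words; the articles, the question, and the answer must all fit inside one fixed budget. OpenAI's tiktoken library will tell you how close a corpus comes to the ceiling; the file name below is again a placeholder.

```python
import tiktoken

# Count the tokens the model will actually see for a given corpus.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

with open("chronicle_articles.txt", encoding="utf-8") as f:
    text = f.read()

tokens = len(enc.encode(text))
words = len(text.split())
print(f"{words:,} words -> {tokens:,} tokens")
# English prose runs roughly 1.3 tokens per word, so ~25,000 words
# already consumes on the order of 33,000 tokens of context.
```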
The chatbot predictably excelled at writing high-level summaries: the relationship between cabinetmakers and planemakers, the factors planemakers had to consider when making planes, chamfers on 18th century planes, whether spring lines were used on bench planes, and what layout lines are. There were articles about each of those topics (most of The Chronicle stories were about 18th century planemaking) in the training data. The chatbot was also good at "feelings" questions: what makes people passionate, or what is interesting, or what is exciting about researching planes. The answers were good, sometimes outstanding, but like most ChatGPT summaries they had a generic, formulaic quality, as if a bored college freshman had written them.
What really impressed me was when I asked it how to tell the difference between planes made by Samuel Dean and Abishai Woodward. Both planemakers were the subjects of separate, detail-rich articles that topped 3,000 and 1,600 words respectively. The chatbot was able to navigate an intricate tapestry of dates, locations, and names and conclude that the simplest way was to look at their maker's marks. This was approaching what I had initially hoped for: not quite my dream of the ultimate research genie, but still an AI that could find answers within complex data. My excitement was short-lived.
For every interesting answer, there was another that was wrong. Sometimes it was a tiny detail. In this Timothy Spencer summary, the second plane was not marked PS; that mark was found on the iron. (The rest of the answer is essentially a direct copy of the original text.) The chatbot wasn't able to keep the various Samuel Deans in the training data separate. Some of its answers weren't wrong so much as overly specific due to the limits of the training data, like its answer about events that happened in Wrentham, Massachusetts.
AI researchers call these plausible-sounding but incorrect or nonsensical answers "hallucinations." Again and again, the chatbot authoritatively gave wrong answers. The original story on planemaker Samuel Doggett lists many dates in his life, but the chatbot couldn't keep track. Asking it about a Sleeper-style plane, instead of a Sleeper-style wedge (which is what John Sleeper was known for), resulted in gibberish. An explanation of how to make an 18th century plane turned into word salad.
There were several people who might be called the most famous 18th century planemaker, but the obscure Oliver Spicer was definitely not one of them. These answers were not caused by the small amount of training data: none of the stories mention fame or who invented the plane.
ChatGPT will generate answers with no basis in fact, but which still fit what it was asked to produce. "AI algorithms can be trained to generate false information that is difficult to distinguish from real facts," the chatbot wrote when I asked if it would ever give misleading information. But it doesn't even need bad training data. There is no "Abraham Hyatt" in the Chronicle stories, yet it still wrote a plausible biography. The sources are real; the citations are fake.
The ChatGPT AI is not sentient. It doesn't "know" whether its output is factually right or wrong. I asked it to write a biography of a planemaker, and so it used the patterns it found for a planemaker's life in its training data to do just that. As Anna Wiener wrote recently, "What's happening is data exchange between user and bot — but it is also a mutual manipulation, a flywheel, an ouroboros."
I worked for many years as a journalist, writing and editing what we could call containers: articles, stories, columns, reviews. These are distinctly human ways of packaging information. The last news organization I worked at was an app called Circa. Instead of writing containers, we split breaking news into what we called "atoms" — individual facts, events, quotes, ideas — and used these atoms to present the news in an entirely new way. It was groundbreaking (Circa was Apple and Google's news app of the year), but unfortunately way ahead of its time.
Generative AI has made atoms relevant again, this time on a massive scale. Enormous amounts of online information are being atomized. Containers are becoming irrelevant. The Early American Industries Association will never stop publishing The Chronicle. But the EAIA, Mid-West Tool Collectors Association, Tools and Trades History Society, and similar organizations that publish tool research are faced with a fundamental question: Is the container or the atoms in those containers more important? Will these organizations leave their vast institutional knowledge locked in PDF files that only their members will ever see? Or do they want to use new tools — ones that respect the monetary and institutional value of their research — to find new audiences?
Generative AI is only going to become more powerful. Its main strength isn't writing songs or cover letters, it's finding connections in huge data sets. But that's only possible because of the work done by us: the people who researched these stories, who combed through the source material, who wrote in their own unique, creative voice. AI makes high-probability choices when it looks for patterns. But as the writer William Deresiewicz put it, creative people do the opposite: "That is what originality is, by definition: a low-probability choice, a choice that has never been made." High and low, AI and researcher. This is our cyborg — truly half human, half machine — future.
— Abraham
The chatbot is obviously not available to the public, but not just because it's unreliable. OpenAI, which developed ChatGPT, and other companies like Meta, Google, and Microsoft are treating the original content they scrape for their AIs as if it were magically free, ignoring the human work that went into creating it. When it comes to a chatbot like mine, the organizations that hold the rights to this material must have a say in whether they want their work to be free to the public. By limiting the chatbot to the OpenAI API and locally stored files, I ensured the materials were kept private.²
Q and A with the Wooden Planes Chatbot
Can artificial intelligence add to our knowledge of planemaking and planemakers?
Could AI be used to create misleading facts about planemakers?
Could artificial intelligence harm wooden plane research?
Describe the chamfers on 18th century planes
Explain how to make an 18th century-style plane
Explain what layout lines are and how planemakers used them
How did planemaking evolve as the United States grew?
How do I tell the difference between a plane made by Samuel Dean and Abishai Woodward?
List five events that happened in Wrentham, Massachusetts
Were spring lines used on bench planes?
What do you think of 18th century planes and planemaking?
What is the best way to describe planemaking?
What is the best way to describe wooden planes to a non-technical audience?
What is the most exciting part of researching wooden planes?
What is the most interesting part of researching wooden planes?
What is the wrong way to think about wooden planes?
What makes a Sleeper-style plane distinctive?
What was the relationship between cabinetmakers and planemakers?
What were the most important dates in Samuel Doggett's life?
Who invented the wooden plane?
Who was the most famous 18th century planemaker?
Why are antique wooden planes so expensive?
Why do people feel passionate about wooden planes?
Why is collecting wooden planes important?
Why was Wrentham, Massachusetts, one of the major centers of colonial planemaking?
Write a biography of a planemaker named Abraham Hyatt
Write a biography of Timothy Spencer
Write a comprehensive analysis of the main trends in 18th century planemaking
Write a list of the most important factors that 18th century planemakers had to consider
1. I removed tables, photo captions, bibliographical notes, and anything else that wasn't part of the written portion of the articles.
2. From OpenAI's documentation: "OpenAI does not use data submitted to and generated by our API to train OpenAI models or improve OpenAI's service offering."
Great post, Abraham. A while back I attempted to use GPT to generate tips and tricks for woodworking ... it did not go well.
It seems LLMs really start to crack when describing the physical world without sufficient training data, at which point the hallucinations become humorous.
[Screenshot: GPT-generated soda-can "tips and tricks"]
https://christopherschwarz.substack.com/p/tricks-of-the-trade-a-confessional/comment/12589204?r=1yyk8u