Despite rapid progress in artificial intelligence, AI is nowhere near ready to replace humans in the practice of science. But that doesn’t mean it can’t help automate some of the drudgery of everyday scientific experiments. For example, a few years ago, researchers put an AI in control of automated laboratory equipment and taught it to exhaustively catalog all the reactions that could occur among a set of starting materials.
While that is useful, it still required a lot of intervention from researchers to train the system in the first place. A group from Carnegie Mellon University has now figured out how to get an AI system to teach itself to do chemistry. The system requires a set of three AI instances, each specialized for different operations. But once it is set up and supplied with starting materials, all you have to do is tell it what type of reaction you want done, and it will figure out the rest.
Artificial intelligence trinity
The researchers note that they are interested in understanding the capabilities that large language models (LLMs) can offer to the scientific endeavour. So all the AIs used in this work are LLMs, mostly GPT-3.5 and GPT-4, although others (Claude 1.3 and Falcon-40B-Instruct) were also tested. (GPT-4 and Claude 1.3 performed best.) But instead of using a single system to handle all aspects of the chemistry, the researchers set up distinct instances that cooperate in a division of labor, and they called the result “Coscientist.”
The three systems they used are:
Web searcher. This has two main capabilities. One is to use the Google Search API to find pages that might be worth ingesting because of the information they contain. The second is to ingest those pages and extract information from them – think of this as similar to the context of earlier parts of a conversation that ChatGPT can retain to inform its later answers. The researchers were able to track where this module was spending its time, and about half of the places it visited were Wikipedia pages. The top five sites visited included journals published by the American Chemical Society and the Royal Society of Chemistry.
Documentation searcher. Think of this as the RTFM instance. Coscientist was going to be given control of various laboratory automation equipment, such as automated liquid handlers and the like, which is often controlled either via specialized commands or through something like a Python API. This AI instance was given access to all the manuals for that equipment, allowing it to learn how to control it.
Planner. The planner can issue commands to both of the other AI instances and process their responses. It has access to a Python code execution sandbox, allowing it to perform calculations. It also has access to the automated laboratory equipment, allowing it to actually conduct and analyze experiments. So you can think of the planner as the part of the system that has to act like a chemist, learning from the literature and trying to use the equipment to implement what it has learned.
The planner can also identify when programming errors occur (either in Python scripts or in its attempts to control automated machines), allowing it to correct its errors.
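The retry behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's code: `run_with_correction`, `fake_heater`, and `fake_docs` are hypothetical names standing in for the planner's retry logic, a piece of lab hardware, and the documentation module, and the command strings are invented.

```python
from typing import Callable


def run_with_correction(
    execute: Callable[[str], str],
    consult_docs: Callable[[str, str], str],
    command: str,
    max_attempts: int = 3,
) -> str:
    """Run a command; on failure, ask the docs module for a fix and retry."""
    for _ in range(max_attempts):
        try:
            return execute(command)
        except ValueError as err:
            # Hand the failed command and the error text to the docs module,
            # which returns a corrected command to try next.
            command = consult_docs(command, str(err))
    raise RuntimeError(f"gave up after {max_attempts} attempts")


# Toy stand-ins (hypothetical): a device that accepts exactly one command,
# and a documentation lookup that always returns the valid syntax.
def fake_heater(command: str) -> str:
    if command != "heat 60":
        raise ValueError(f"invalid command: {command}")
    return "ok: heating to 60"


def fake_docs(command: str, error: str) -> str:
    return "heat 60"


result = run_with_correction(fake_heater, fake_docs, "set_temp(60)")
print(result)
```

In the real system, both the corrected command and the decision to retry come from an LLM rather than from fixed functions; the loop only shows the shape of the feedback.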
Putting the system to use
Initially, the system was asked to synthesize a number of chemicals such as acetaminophen and ibuprofen, confirming that it could generally work out a viable synthesis route after searching the web and scientific literature. The question then became whether the system could figure out the hardware it had access to well enough to put its conceptual abilities into practice.
To start with something simple, the researchers used a standard sample plate, containing an array of small wells arranged in a rectangular grid. The system was asked to fill in squares, diagonal lines or other patterns using different colored liquids and it was able to do so effectively.
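To give a sense of what such a task reduces to, here is a small Python sketch that turns a pattern into a list of well labels. It assumes a 96-well plate (8 rows labeled A to H, 12 numbered columns), which the article does not specify, and the helper names `well_name`, `diagonal`, and `rectangle` are hypothetical.

```python
import string
from typing import List

ROWS, COLS = 8, 12  # assumed 96-well layout; not stated in the article


def well_name(row: int, col: int) -> str:
    """Convert zero-based (row, col) indices to a label such as 'A1' or 'H12'."""
    return f"{string.ascii_uppercase[row]}{col + 1}"


def diagonal(rows: int = ROWS, cols: int = COLS) -> List[str]:
    """Wells along the main diagonal, starting from the top-left corner."""
    return [well_name(i, i) for i in range(min(rows, cols))]


def rectangle(top: int, left: int, height: int, width: int) -> List[str]:
    """All wells in a filled rectangle, listed row by row."""
    return [
        well_name(r, c)
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]


print(diagonal())
print(rectangle(0, 0, 2, 2))
```

A liquid handler would then be told to dispense a given color into each listed well. That command syntax depends on the equipment and is left out here.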
Moving on, they placed three different colored solutions at random locations in the grid of wells; the system was asked to identify which wells held which color. On its own, Coscientist didn’t know how to do this. But when it was reminded that different colors would show different absorption spectra, it used a spectrometer it had access to and was able to identify the different colors.
With basic command and control seemingly working, the researchers decided to try some chemistry. They set up a sample plate with wells filled with simple chemicals, catalysts, and the like, and asked the system to perform a specific chemical reaction. Coscientist got the chemistry right from the beginning, but its attempts to run the synthesis failed because it sent an invalid command to the hardware that heats and stirs the reactions. That sent it back to the documentation module, allowing it to correct the problem and run the reactions.
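The idea of telling colors apart by absorption spectra can be sketched as follows: find the wavelength where a well absorbs most strongly and match it to the nearest reference peak. The reference wavelengths below are rough illustrative values (a solution absorbs most strongly around the complement of the color it appears to be), not measurements from the paper, and `peak_wavelength` and `classify` are hypothetical names.

```python
from typing import Dict

# Illustrative absorbance peaks in nm (assumed, approximate): a red-looking
# solution absorbs in the green, a yellow one in the violet-blue, and a
# blue one in the orange-red.
REFERENCE_PEAKS: Dict[str, int] = {"red": 520, "yellow": 430, "blue": 630}


def peak_wavelength(spectrum: Dict[int, float]) -> int:
    """Return the wavelength (nm) with the highest absorbance."""
    return max(spectrum, key=spectrum.get)


def classify(spectrum: Dict[int, float],
             references: Dict[str, int] = REFERENCE_PEAKS) -> str:
    """Label a spectrum with the reference color whose peak is nearest."""
    peak = peak_wavelength(spectrum)
    return min(references, key=lambda name: abs(references[name] - peak))


# Toy spectrum: absorbance readings at three wavelengths for one well.
well_spectrum = {430: 0.10, 520: 0.90, 630: 0.20}
print(classify(well_spectrum))
```

A real plate reader returns many more wavelengths per well, and empty wells would need an absorbance threshold. Both are omitted to keep the sketch short.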
And it worked. Spectral fingerprints of the desired products were present in the reaction mixture, and their presence was confirmed by chromatography.
With the basic reactions working, the researchers then asked the system to optimize the efficiency of the reaction, presenting the optimization process as a game in which the score rose with the reaction’s yield.
The system made some bad guesses in the first round of test reactions, but quickly homed in on better yields. The researchers also found that they could avoid the bad choices in the first round by providing Coscientist with information about the yields produced by a few random starting mixtures. This means that no matter where Coscientist gets its information from – whether from its own test reactions or from some external information source – it is able to incorporate that information into its planning.
The researchers concluded that Coscientist has several notable abilities:
- Planning chemical syntheses using publicly available information
- Navigating and processing technical manuals for complex devices
- Using that knowledge to control a range of laboratory equipment
- Integrating these instrumentation capabilities into a laboratory workflow
- Analyzing its own reactions and using that information to design improved reaction conditions
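The effect of prior information on a score-driven search can be illustrated with a toy loop. This sketch swaps in a simple greedy neighbor search for the LLM's proposals, so it is not the method used in the paper. The yield function, candidate temperatures, and the name `optimize` are all invented for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple


def optimize(
    score: Callable[[int], float],
    candidates: List[int],
    history: Optional[Dict[int, float]] = None,
    rounds: int = 5,
) -> Tuple[int, Dict[int, float]]:
    """Greedy search: each round tests the untested candidate nearest the best so far."""
    history = dict(history or {})
    if not history:
        # Cold start: with no prior results, begin at the first candidate.
        history[candidates[0]] = score(candidates[0])
    for _ in range(rounds):
        best = max(history, key=history.get)
        untested = [c for c in candidates if c not in history]
        if not untested:
            break
        nxt = min(untested, key=lambda c: abs(c - best))
        history[nxt] = score(nxt)
    return max(history, key=history.get), history


def toy_yield(temp: int) -> float:
    """Invented yield curve that peaks at 80 degrees."""
    return 100 - (temp - 80) ** 2 / 10


temps = [20, 40, 60, 80, 100, 120]
cold_best, _ = optimize(toy_yield, temps, rounds=2)
seeded_best, _ = optimize(toy_yield, temps, history={60: 60.0, 100: 60.0}, rounds=2)
print(cold_best, seeded_best)
```

With the same two-round budget, the cold start is still working its way up from a poor first guess, while the run seeded with two prior results reaches the peak of this toy curve.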
In many ways, this sounds like the experience a student might have in their first year of graduate school. Ideally, the graduate student will progress beyond this. But maybe GPT-5 will be able to as well.
More seriously, Coscientist’s architecture, which relies on the interaction of a number of specialized systems, is similar to how brains work. Clearly, the brain’s specialized systems are capable of a much wider range of activities, and there are many more of them. But this kind of structure may be critical to enabling more complex behavior.
That said, the researchers themselves are concerned about some of Coscientist’s abilities. There are a lot of chemicals (think things like nerve agents) that we don’t want to see become easier to manufacture. And figuring out how to tell GPT instances not to do something has been an ongoing challenge.
Nature, 2023. DOI: 10.1038/s41586-023-06792-0