A collective of startups known as the Endowment for Climate Intelligence (ECI) has unveiled ClimateGPT, the first-ever open-source ensemble of AI models dedicated to addressing the impacts of climate change.
Large language models (LLMs) such as ChatGPT and Google’s Bard have faced concerns over the accuracy of the information they provide, as they are trained on data scraped from the internet that is not rigorously fact-checked or referenced.
In contrast, ECI claims that ClimateGPT aims to create “trust and transparency” by drawing on scientific data using a robust model that authenticates, secures, and governs the information provided. The model is integrated with Hedera, a public blockchain, which ensures the “highest standards of data integrity,” said the ECI.
Toward open-source
ECI was formed by three startups looking to make reliable climate science information quickly and freely available to researchers, policymakers, and business leaders.
Jonathan Dotan, founder of EQTY Lab, and a founding member of ECI, believes that ClimateGPT is part of a broader industry shift toward open-source AI.
“We saw the release of countless AI models in 2023, but many still exist in walled gardens, which limits their potential for positive change,” he told TNW. “This year I believe we will see demand for AI that is more open, secure, and accountable.”
The data
Rather than one chatbot, ClimateGPT is an ensemble of three task-specific LLMs that vary in the number of parameters they contain, which also affects accuracy. For example, the 7B model is far less accurate than the 70B version. ‘B’ stands for ‘billion’ and indicates the number of parameters, the internal values a model learns during training. All three models are fine-tuned versions of Meta’s Llama-2 large language model.
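To make the size labels concrete, here is a minimal Python sketch that converts a label such as "7B" into a parameter count and estimates how much memory the weights alone would take up. The 2-bytes-per-parameter figure (16-bit precision) is our assumption for illustration, not something ECI has stated, and the helper names are hypothetical.

```python
# Rough illustration of what "7B" and "70B" mean as parameter counts.
# Assumption (not from ECI): each parameter is stored as a 16-bit float (2 bytes).
BYTES_PER_PARAM = 2


def parse_param_count(label: str) -> int:
    """Convert a size label such as '7B' into an integer parameter count."""
    if not label.upper().endswith("B"):
        raise ValueError(f"expected a label ending in 'B', got {label!r}")
    return int(float(label[:-1]) * 1_000_000_000)


def weight_memory_gb(label: str, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return parse_param_count(label) * bytes_per_param / 1e9


for size in ("7B", "70B"):
    count = parse_param_count(size)
    print(f"{size}: {count:,} parameters, ~{weight_memory_gb(size):.0f} GB of weights")
```

Under that assumption, the 70B model has ten times as many parameters as the 7B model and needs roughly ten times the memory, which helps explain why the smaller version is easier to run even though it is less accurate.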
ClimateGPT was trained on data from the Dutch startup Erasmus.AI’s planetary scale corpora. The Erasmus corpus is drawn from over 10 billion web pages and millions of open-access academic articles from across the natural, social, and economic sciences. ClimateGPT’s skills were also honed on data from the Club of Rome’s Earth4All climate science initiative and the UN Sustainable Development Goals (SDGs).
In addition, Switzerland-based startup EQTY Lab led the integration of a new advanced cryptographic framework that authenticates, secures, and governs the model. Meanwhile, AppTek, a startup specialising in machine learning, fine-tuned ClimateGPT through a series of tests and benchmarks.
Countering misinformation
The chatbot synthesises interdisciplinary research and breaks silos to form a holistic understanding of the impacts of climate change.
Thanks to software from AppTek, ClimateGPT is available in 20 languages. Its creators also say that it was trained and is hosted entirely on renewable energy.
What’s more, ClimateGPT has passed a number of tests and controls ensuring that it does not spread climate misinformation.
A quick test
The first version of ClimateGPT, which is completely open-source, has been released on the Hugging Face community platform. You can also request access to it on the ECI website.
In a quick test, we asked the chatbot to answer a basic question related to a recent article we wrote on the problems of bioenergy. Its answer was detailed, thorough, and hit on many of the points addressed in our article. Importantly, it was backed up by academic sources. We also liked that the LLM gave specific examples to go along with its explanations, something tools like ChatGPT or Google Bard rarely do.
Take a look for yourself:
Here’s how those answers compare to the same prompt in ChatGPT:
According to ECI, ClimateGPT has been shown to be 10 times more accurate at completing climate-specific tasks when compared to Meta’s base Llama-2 model.
On Hugging Face, users can now download the model and its research paper, and use a new AI lineage explorer to get visibility into the ClimateGPT training lifecycle. This is part of ECI’s mission to make the development of AI as transparent and accessible as possible.
“The need for a new generation of responsible and sustainable AI strategies to address global challenges has never been greater,” said Ariana Fowler, head of research at EQTY Lab.
As ClimateGPT scales up and gets fine-tuned, it has the potential to help researchers, policymakers, and business leaders make informed decisions at a time when climate action is needed more than ever before.
Disclaimer (10:30AM CET, January 25, 2024): The comparison above was made using the full 70B version of ClimateGPT. In a previously published version of this article we used only the 7B version, which led to a less-than-satisfactory response to our prompt.