Meta's new Llama 4 AI models aren't open source -- despite what Mark Zuckerberg says
Plus: 20 years of Git; and a new job board for open source job-seekers
In issue #12 of Forkable, we look at the launch of Meta’s new “open” AI model, which has stirred controversy in the AI and open source communities.
Elsewhere, open source version control system Git turns 20; the CNCF launches a new open source job board; and AI crawlers are messing with Wikimedia’s bandwidth.
Oh, and I’ve started interviews for my upcoming COSS startup profiles, so I hope to begin publishing in the coming weeks.
Have a good weekend when it comes folks, and as usual feel free to reach out to me with any questions, tips, or suggestions: forkable[at]pm.me.
Paul
Open issue
‘Openness is binary’
Meta launched the latest versions of its Llama-branded AI models last week, and the company was immediately accused of submitting a specially customized model to AI benchmarking site LM Arena — a submission that may have concealed inherent weaknesses compared to the publicly available model (an accusation Meta denies).
However, the main issue for those in the open source community continues to be Meta playing fast and loose not with benchmarks, but with terminology.
Previously, Meta has unambiguously referred to its Llama brand of AI models as “open source.” As such, the Open Source Initiative (OSI), steward of the so-called “open source definition” and “open source AI definition,” has repeatedly called Meta out on its claims.
Why? Well, because language matters.
Llama 4 still restricts access to source code and training data. And digging into the Llama 4 license reveals other notable restrictions that would fail even the most liberal interpretation of “open source,” including geographical blocks. As per the Llama 4 acceptable use policy, the rights granted by the Llama 4 Community License Agreement do not apply:
“…if you are an individual domiciled in, or a company with a principal place of business in, the European Union.”
With Llama 4, however, Meta is a little more careful with the language in its official announcement. While it does say things like “in keeping with our commitment to open source,” it only refers to the actual Llama 4 models themselves as “open-weight.” This is accurate: the company does make the models’ parameters — the values that determine how a model processes input data and makes predictions — publicly accessible, allowing researchers to run and fine-tune them.
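For illustration, here’s a minimal sketch of what “open-weight” means in practice, assuming the Hugging Face transformers library and an illustrative model ID (gated access and acceptance of Meta’s license may still apply): you can download and run the parameters, but the training data and training code stay out of reach.

```python
# Minimal sketch: "open weight" means the parameters are downloadable and
# usable, but the training data and training pipeline are not included.
# The model ID below is illustrative and may require license acceptance.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed/illustrative ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# You can run inference or fine-tune these weights, but you cannot inspect
# the data or code that produced them.
inputs = tokenizer("Is Llama 4 open source?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```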
But in an Instagram post announcing the new models, Mark Zuckerberg continues to call them open source. “Today, we are dropping the first open source Llama 4 models, and we have two more on the way,” Zuckerberg said.
So, why is Meta continuing to muddy the waters around the open source AI definition, using “open weight” and “open source” as though the terms are interchangeable?
On the one hand, Meta likely knows that its Llama models can’t really be called open source, which it tacitly acknowledges by calling Llama 4 “open weight” in its official announcement. But at the same time, it wants all the kudos that comes with open source — everyone loves open source, right?
However, there are likely more specific reasons for this “open washing,” including an effort to curry favour with regulators. The EU AI Act, for example, has special exemptions for open source AI models. Although Meta hasn’t yet made Llama available to EU entities, it will want to in the future — so this could be more of a preemptive move to align itself with the open source exemptions contained within the Act.
With Llama 4 out now, the OSI took to LinkedIn to make its feelings known: “Llama 4 is still not #opensource and Europeans are excluded. Stop calling it Open Source AI.”
In a follow-up post, the OSI’s head of community, Nick Vidal, took a thinly veiled jab at Meta too, pushing back against the notion that “openness is a spectrum,” or that there is some sort of “sliding scale” to open source AI.
“The concept of freedom lies at the heart of Open Source — and freedom, by its very nature, is binary. You either have it, or you don’t,” Vidal notes. “Software that carries restrictions on usage, modification, redistribution, or access to its building blocks (like data or model weights) isn’t partially open. It’s not Open Source at all. Those conditions — ethical restrictions, commercial limitations, time-delayed releases — are forms of restriction, not openness.”
Read more: There are no ‘Degrees of Open’: why Openness is binary
The rundown
20 years of Git
Linus Torvalds is best known as the creator of the Linux kernel, but the Finnish software engineer also created Git, the open source distributed version control system (VCS).
Torvalds had initially adopted the proprietary BitKeeper VCS for Linux kernel development, only for BitKeeper’s developer to revoke the kernel developers’ license in response to one of them reverse-engineering BitKeeper. And so Torvalds was forced to develop Git as a replacement.
The first version of Git arrived on April 7, 2005, and the rest, as they say, is history — millions of developers today use Git to track and manage codebase changes on the likes of GitHub, GitLab, and Bitbucket.
Twenty years on from that momentous day, the folks at GitHub sat down with Torvalds to reflect on the early days of Git and its evolution through the years. There are lots of great anecdotes and memories in the interview, including this little nugget, which highlights just how much of an impact Git has had.
“My oldest daughter went off to college, and two months later, she sends this text to me and says that I’m more well-known at the computer science lab for Git than for Linux because they actually use Git for everything there,” Torvalds said. “And I was like, Git was never a big thing for me. Git was an ‘I need to get this done to do the [Linux] kernel.’”
An open source job board for open source jobs
The Cloud Native Computing Foundation (CNCF) quietly launched a new open source job board, GitJobs, at KubeCon in London last week. It’s still in beta, so the listings are fairly sparse, but the site serves job opportunities that can be filtered by language, technology, and foundation.
AI crawlers hit Wikimedia’s bandwidth
As I wrote a few weeks back, AI crawlers are wrecking the open internet and open source infrastructure, as data-hungry AI firms hoover up content to train their models.
Now, Wikipedia’s umbrella organization, the Wikimedia Foundation, says that bandwidth consumption for multimedia downloads from Wikimedia Commons has surged by 50% over the past year — with scrapers again to blame.
More specifically, Wikimedia says that almost two-thirds of the most expensive, resource-intensive traffic was from bots.
“Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs,” Wikimedia said.
Patch notes
Rackspace, which co-created the open source cloud platform OpenStack along with NASA more than a decade ago, announced Rackspace OpenStack Flex, pitched as a license-free, private cloud alternative to the hyperscalers and proprietary cloud platforms.
Warehouse equipment manufacturer Noblelift announced an open source forklift project, with the Alpha-50v1.0 lithium forklift touted for release in July. There’s scope for a FORKED-lift pun in here somewhere…