Nanochat by Andrej Karpathy - "The best ChatGPT that $100 can buy."

The stock market dips 2% with every tweet certain politicians or CEOs post - the AI/tech space has something very similar.

Every time Andrej Karpathy makes an announcement, it's the hottest news of the day and - according to some - will change the whole field completely. What was it this time? A repository called "nanochat".

On Andrej Karpathy - the "inventor of vibe coding"

2015: Stanford PhD at the intersection of NLP and computer vision, advised by Fei-Fei Li (that name alone could fill a whole post - so many influential people in the AI space worked with her at Stanford)

2015-2017: Founding member of OpenAI (back before it became the company that published ChatGPT), when it still did research and took the "Open" part a bit more literally.

2017-2024: Followed Musk to Tesla, where he served as Director of AI, then briefly returned to OpenAI during the ChatGPT boom in 2023.

2024: Left the big companies and focused on building his own education-focused company. He has also posted extremely valuable resources on course sites and his YouTube channel for years.

2025: Besides his obvious contributions to the field of AI, he also coined the term "vibe coding" this year. This is important because it shows how much influence he has on AI "pop culture" and hype as well.

Sources: Wikipedia & being online in the year of 2025

Nanochat Announcement and its Aspects

Andrej wrote this announcement on X/Twitter: https://x.com/karpathy/status/1977755427569111362 on October 13, 2025. It's been only a little more than 24 hours and I've already seen so many hot takes on this... sigh, here we go:

1) What nanochat actually is (in one sentence)

nanochat = a tiny, full-stack “ChatGPT-like” teaching repo that trains, evaluates, and serves a small chat model end-to-end on a single 8×H100 node (tokenizer → pretrain → SFT → eval → inference → simple WebUI) in 8,000 lines of code.

Having all this in one repository - vetted by an expert - is quite rare and can be a great learning resource.
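The stages in that one-sentence summary can be sketched as a simple ordered list. This is an illustrative outline of the pipeline's shape, not the repo's actual module layout or script names:

```python
# Illustrative sketch of the nanochat full-stack pipeline.
# Stage names and descriptions are paraphrased from the announcement;
# they are NOT the repo's actual file or command names.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    description: str

PIPELINE = [
    Stage("tokenizer", "train a tokenizer on the pretraining corpus"),
    Stage("pretrain", "train the base model on raw text"),
    Stage("sft", "supervised fine-tuning on chat-style conversations"),
    Stage("eval", "benchmark the resulting model"),
    Stage("inference", "run the model efficiently for chat"),
    Stage("webui", "a minimal chat front end over the inference code"),
]

for i, stage in enumerate(PIPELINE, 1):
    print(f"{i}. {stage.name}: {stage.description}")
```

The point is not the code itself but the scope: in nanochat, every one of these stages lives in the same ~8,000-line repository.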

2) Why the “$100 ChatGPT” tagline confused everyone

The tagline of the repository is “The best ChatGPT that $100 can buy.” To some, this seems to mean "ChatGPT will soon be replaced by a way cheaper product." This interpretation misses a lot of context, though:

The tweet itself clarifies the trade-offs: with ~$100 (~4 hours on 8×H100) you get a “little ChatGPT clone you can kind of talk to” - not a GPT-5 class system. He also shows sample chats and notes that bigger budgets improve depth. The GitHub discussion also positions the $100 “speedrun” as more of an educational target to make the run accessible to many people.

TL;DR: The tagline is optimized for attention, but the content is for learners. If you read “$100” and expected a production assistant, you will be disappointed.

3) Educational material - not a product

Roughly eighty percent of the online AI space seems to want a product - they want to buy (access to) an LLM that they can customize however they want and have it do their bidding. They are users of these AI/LLM products.

The other 20% are developers, students, and researchers looking to understand how LLMs work. This is the intersection of computer science theory and the product side in industry.

This repository is for the second group - not the first.

4) Product reality check

Performance: With ~$100 you’ll get something entertaining but limited. Tuning and further training (more money, much more money) will improve it and is highly encouraged, but it will never* reach frontier model level.

Costs & infra: The $100 refers only to training costs; to deploy it as a product, you would still need to pay for inference infrastructure and have significant expertise to keep the pipeline running.
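A quick back-of-envelope check of where that price tag comes from, assuming a ~$3/GPU-hour on-demand H100 rate (the rate is my assumption; actual cloud pricing varies):

```python
# Back-of-envelope training cost for the "$100 speedrun":
# ~4 hours on an 8xH100 node, per the announcement.
gpus = 8
hours = 4
usd_per_gpu_hour = 3.0  # assumed on-demand H100 rate; varies by provider

training_cost = gpus * hours * usd_per_gpu_hour
print(f"Estimated training cost: ${training_cost:.0f}")  # -> $96, i.e. "about $100"
```

Note that this is a one-time training cost; serving the model to users is a separate, ongoing expense.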

5) Context: “vibe coding,” hype, and the 2025 small-model mood

  • Vibe coding: Karpathy popularized the term "vibe coding" but has been critical of the practice from the start. He has clarified (in the Twitter replies and on Hacker News) that most of this project is hand-crafted and optimized for reading and learning. He strongly encourages using AI responsibly for coding rather than blindly copying code.
  • Small vs. big models: Yes, there’s a broader 2025 shift toward specialized/smaller models for efficiency and control - but no, the industry hasn’t abandoned large models. There's just not that much new to discuss about the big models right now, so you see more discussions about smaller projects online.

My takeaways (for readers who asked “does this matter?”)

  • If you’re learning: This is gold. You can see the full pipeline in one place and tinker end-to-end - not many repos or courses offer this. I also recommend the book Build a Large Language Model (From Scratch) as a starter, with its repo and YouTube series. Before you build a full-stack application like nanochat, make sure you understand the core GPT-style model.
  • If you need to ship/buy products: Treat nanochat as reference code, not a production shortcut. For “my data, my chatbot,” your path is still open-weights + RAG + eval harness over trying to brute-force a tiny model into acting like GPT-5.
  • If you follow hype: The $100 line is a hook; the actual repository is a course capstone. Read the README before retweeting the hot takes.
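As a toy illustration of the "open-weights + RAG" path from the takeaways above, here is a minimal retrieve-then-prompt sketch. The documents, the crude bag-of-words scorer, and all names are hypothetical stand-ins; a real system would use embeddings, a vector store, and an actual open-weights model behind the prompt:

```python
# Toy RAG sketch: retrieve the most relevant document, then build a prompt
# for a chat model. Everything here is illustrative, not a real stack.
from collections import Counter
import math

DOCS = [
    "nanochat trains a small chat model end-to-end on one 8xH100 node.",
    "RAG retrieves relevant documents and adds them to the model prompt.",
    "An eval harness scores model outputs against a fixed benchmark.",
]

def score(query: str, doc: str) -> float:
    """Crude word-overlap score (a real system would use embeddings)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values()) / math.sqrt(len(doc.split()) or 1)

def retrieve(query: str, k: int = 1) -> list[str]:
    return sorted(DOCS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("what does an eval harness do?"))
```

The prompt would then be sent to an open-weights chat model, and the eval harness mentioned in the takeaway is what tells you whether the whole loop actually answers your questions well.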