Friends,
In September I wrote about signing my 6th grader and myself up for courses on mathacademy.com. It’s been a month and we’re addicted and competing with each other to level up in our respective leagues by gaining XP. A unit of XP approximates “one minute of focused effort by a serious but imperfect student”.
[I’ve turned a bunch of readers onto this site just as I was turned on to it by another reader and now I got peeps texting me questions about it or telling me about their kids’ progress. You love to see it. Random side note: I have a good friend who just moved from my neighborhood to Austin because he’s deep in the education/AI intersection and the weird city is the scenius for education experimentation. I mentioned it to him and let’s just say he knew all about it from different angles. What he told me only got me even more stoked about what mathacademy is doing fwiw.]
In that post, I pasted links to 30 articles by the site’s chief quant, Justin Skycak, that I planned to read after already doing a fair bit of reading on the blog. I’ve plowed thru the 30 articles and then some, which is still just a fraction of what’s on there.
I’m personally highly interested in the entire topic of using AI to develop talent and learn at rates that were previously unthinkable. I have a large unfinished document with years of insights that I’ve pulled together from various sources that probably won’t see the light of day. For education I’m a big fan of writers like Scott H. Young, Cedric Chin, Matt Bateman, and Freddie deBoer. You can search the substack for all the times I’ve referenced their work and I have plenty more in backlog. I’ve also harped on the degree to which SIG’s education was extremely well-mapped out from a pedagogical point of view. It wasn’t until I heard Todd Simkin explain the educational influences that informed how they taught that I appreciated the extent to which education theory underpinned their methods.
See:
🔗Educational Ideas Inspired By Seymour Papert’s Constructionism (Moontower)
🔗Notes From Todd Simkin On The Knowledge Project (Moontower)
🔗General & Childhood Education Articles (Moontower)
I’m adding Justin to my list of must-reads. After spending most of Sunday with the blog, I’ve synthesized a much more condensed version of Principles of Learning except it’s fully based on Justin’s insights.
I reached out to him when I first discovered the site and made my interest in what he’s doing as plain as possible. I told him:
I think being born on 3rd is to get exposure to someone when you are young who shows just how self imposed our speed limits are.
He hadn’t heard it put that way before.
The wealth you give a youth is self-efficacy. A chance to match their abilities to the needs of communities they find themselves in as they get older. Autonomy and confidence through competence.
When I say “speed limit” I’m not referring to speed only, or even necessarily. It’s more about limits in general. In athletics, you can’t be LeBron no matter what you do. But whatever your limit is, it’s further than you think. It goes without saying that finding your limit requires brutal effort and commitment…but however far that gets you, personalized instruction will get you even further.
If a great teacher/mentor/coach will get you further than the frontier that caps out at a given level of effort, then that role has insane leverage. The very act of pushing through a previously-conceived frontier will increase your motivation and effort as you see what’s possible.
There was a Washington Post article several years ago referring to “America’s most advanced math program” in Pasadena. The kids were crushing the AP Calc BC exam in 8th grade.
Who were the teachers?
The founders of mathacademy.com
The Math Academy began as a tutoring program run by husband-and-wife duo Jason and Sandy Roberts before being formally adopted into the PUSD curriculum in 2017.
Seen narrowly, mathacademy is an AI program that helps you learn math faster.
I think this is to miss what’s coming.
The instruction portion of the personalized coach is being automated.
I’m fairly convinced that we aren’t too far from knowledge not just being democratized (I mean Wikipedia already exists) but structured for delivery on incredibly effective, personalized rails.
Before someone’s reactance reflex gets all buzzy, I don’t mean “education” will be solved by a robot. Instruction is simply one component of education. Motivation, support, guidance, as well as the type of story-telling and conversation that relates classroom learning to the world and others is as human-based an activity as a warm hug. But if the price for personalized instruction craters, the secondary effects are going to be large and visible.
At scale, we are going to find out just how many kids are capable of finishing Calc BC by grade 8 or publishing a novel in middle school. We hear those stories now and we dismiss them as “genius” or “privileged”.
But what if a low price for personalized instruction tells us we’re wrong about this? There will always be examples of genius or privilege. But if stories of insane achievement start multiplying amongst broken-English immigrants or other groups who are not advantaged in any way EXCEPT in motivation then you’ll know that the things Justin is writing about turned out to be true.
The price of personalized instruction falling is not a panacea. The cost is only a bottleneck after basic needs like stability and safety are met. But the cost is an active bottleneck for all but the rich once those needs are met. Even expensive schools are only incrementally better on truly personalized instruction (their primary advantage might be the compression of the classroom range to a higher functioning average but that’s not the same as personalized instruction so much as a release from tolerating a small number of disproportionally disruptive students).
I’m fascinated by mathacademy because of what it telegraphs — a future of cheap personalized instruction. I’m not picturing slicker edtech apps here. This is a glimpse of something different.
Libraries were free. The internet is free, convenient, and wider reaching. Sal Khan is a prophet who built on its rails. Well, the tracks are being upgraded.
The trains are going to go faster.
The full document can be found as a moontower guide:
It’s a living document that I’ll add to over time.
This condensed version hits most of the highlights based on what I’ve read so far. I pontificate like a blowhard at the end a bit more.
Maximizing the Learning Rate: A Neuroscience-Informed Approach to Education
The objective function of the educational strategy outlined below is to maximize the learning rate: helping students acquire and retain knowledge more effectively. There are certainly great programs for independent learning out there, but the objective in this discussion is to leverage technology and cog sci to progress through levels of mastery faster.
What Neuroscience Has Taught Us About the Brain
These are some of the most durable findings in cognitive science.
Neuroplasticity: The brain’s ability to rewire itself through new experiences is one of the most significant findings in neuroscience. Neuroplasticity means that the brain continually adjusts its neural connections in response to new learning. This allows learners to develop new skills and adapt to challenges. Methods like deliberate practice, particularly its "effortful repetition" and "successive refinement" aspects, repeatedly strengthen neural pathways until tasks become second nature (Talent Development vs Traditional Schooling).
Dopamine and Motivation: Neuroscience has shown that dopamine, a neurotransmitter, plays a critical role in motivation and reward-based learning. When learners experience success, dopamine is released, reinforcing the behavior and encouraging continued effort. This makes motivation a crucial component of the learning process, as it directly influences how willing a learner is to persevere through challenges.
Working Memory and Its Limitations: The brain’s working memory, or the ability to hold and manipulate information temporarily, is limited. Overloading this system can impede learning, as the brain can only focus on a few pieces of information at once. Techniques like chunking—breaking down complex tasks into smaller, more manageable units—can help mitigate this overload (When Should You Do Math in Your Head vs Writing It Out on Paper?).
The Science of Forgetting: One of the most critical insights from cognitive psychology is the concept of forgetting curves. The theory, which dates back to Hermann Ebbinghaus’s pioneering research, shows that learners forget newly acquired information rapidly unless there is some form of reinforcement. The brain’s natural tendency to forget is often visualized in a forgetting curve, which steeply declines in the hours or days after learning.
Forgetting Curves and Memory Decay: Ebbinghaus’s forgetting curve demonstrates that without review or rehearsal, retention of new knowledge drops quickly over time. However, the rate of forgetting slows down when learners engage in retrieval practice and spaced repetition, both of which can flatten the curve, leading to more durable retention.
Spaced Repetition Leads to Automaticity: Over time, repeated retrieval practice pushes learners toward automaticity, the ability to recall information effortlessly. Once information is retrieved enough times across spaced intervals, it becomes deeply embedded in long-term memory. Efficiency is achieved through repeated activation and myelination – a process where neural pathways are coated with a substance called myelin, increasing the speed and efficiency of signal transmission.
This is the key loop:
Retrieval practice > Automaticity > Reduced demand on working memory >
The learner frees up cognitive resources for more complex tasks, facilitating better problem-solving and higher-order thinking.
“Automaticity frees up cognitive resources that would otherwise be consumed by basic recall tasks, allowing for higher-order cognitive tasks to take place in the working memory.”
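To make the forgetting-curve idea concrete, here is a minimal sketch in Python. It assumes the common textbook simplification that recall decays exponentially with time, and that each successful retrieval multiplies a "stability" parameter by a fixed factor. The function names, the starting stability of one day, and the doubling factor are all made up for illustration; none of it comes from Justin's posts or from Math Academy's actual model.

```python
import math

def retention(days_since_review: float, stability_days: float) -> float:
    """Estimated chance of recall under a simple exponential forgetting curve."""
    return math.exp(-days_since_review / stability_days)

def retention_after_reviews(
    num_reviews: int,
    delay_days: float = 7.0,
    initial_stability_days: float = 1.0,  # assumed starting stability
    growth_per_review: float = 2.0,       # assumed: each retrieval doubles stability
) -> float:
    """Recall estimate `delay_days` after the last of `num_reviews` successful retrievals."""
    stability = initial_stability_days * growth_per_review ** num_reviews
    return retention(delay_days, stability)

if __name__ == "__main__":
    # More retrievals -> larger stability -> flatter curve -> more retained a week later.
    for reviews in range(5):
        print(reviews, round(retention_after_reviews(reviews), 3))
```

The exact numbers don't matter. The shape does: with no retrieval practice almost nothing survives the week, and each additional retrieval leaves noticeably more standing.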
Implications for Implementation
The implications of neuroscience and research on forgetting curves for learning are vast. Here’s how these insights translate into effective learning strategies:
Retrieval Practice and Minimizing Forgetting: The act of retrieving information from memory, rather than passively reviewing material, significantly boosts retention. Each successful retrieval attempt strengthens neural pathways and makes the knowledge more durable. As learners engage in retrieval, they disrupt the forgetting curve and prolong the retention of knowledge (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).
Spaced Repetition for Long-Term Retention: By structuring review sessions at increasingly spaced intervals, learners allow time for memory consolidation. This reduces the steep decline of the forgetting curve, especially in the early stages of learning. Over time, the intervals between repetitions can be extended without significant loss in retention, enabling efficient long-term learning. The use of spaced repetition systems (SRS) has demonstrated significant improvements in student performance (Optimized, Individualized Spaced Repetition in Hierarchical Knowledge Structures).
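As a rough illustration of "increasingly spaced intervals", the sketch below builds a review calendar where each gap is a fixed multiple of the previous one. The function name, the one-day first gap, and the 2x multiplier are assumptions for the example. Real spaced repetition systems, including the individualized one described in the linked post, adjust the gaps based on how the learner actually performs.

```python
from datetime import date, timedelta

def review_schedule(
    start: date,
    first_interval_days: int = 1,  # assumed first gap
    multiplier: float = 2.0,       # assumed growth of each gap
    reviews: int = 5,
) -> list[date]:
    """Return review dates whose gaps grow by `multiplier` after every review."""
    schedule = []
    interval = float(first_interval_days)
    current = start
    for _ in range(reviews):
        current = current + timedelta(days=round(interval))
        schedule.append(current)
        interval *= multiplier
    return schedule

if __name__ == "__main__":
    for day in review_schedule(date(2024, 1, 1)):
        print(day.isoformat())
```

With the defaults, the gaps run 1, 2, 4, 8, and 16 days, so five reviews cover a full month of retention.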
Common Misconceptions in Learning
It’s easy to fall into widely accepted beliefs about how people learn, but research has debunked many of these ideas. Here are a few myths that might surprise you:
Learning Styles: Contrary to popular belief, the idea that individuals have specific "learning styles" (e.g., visual, auditory, kinesthetic) and that teaching should be tailored to these styles is unsupported by research. While students may have preferences, these preferences do not significantly improve learning outcomes. Instead, using varied teaching methods that engage multiple senses enhances learning for all students (Why is the EdTech Industry So Damn Soft?). Veritasium has also called this the “biggest myth in education”.
The Myth of Productive Struggle: While allowing learners to struggle through difficult problems might seem beneficial, research has shown that this is often counterproductive, particularly for novices. Without proper guidance, prolonged struggle leads to frustration and disengagement. Scaffolding and explicit instruction provide the necessary support to avoid cognitive overload and enable meaningful progress (What’s the Best Way to Teach Math: Explicit Instruction or Less Guided Learning?).
Discovery Learning vs. Direct Instruction: The idea that students should learn concepts through self-discovery has been largely debunked, especially for beginners. Direct instruction, which provides clear guidance and support, has proven far more effective in most learning scenarios. Discovery learning works well for experts but can leave novices overwhelmed and unproductive, a paradoxical finding known as the “expertise reversal effect” (The Pedagogically Optimal Way to Learn Math).
The Illusion of Comprehension: Learners often mistake familiarity with material for true understanding—a phenomenon known as the illusion of comprehension. Just because something feels familiar doesn't mean the learner can apply it effectively. Combatting this requires practices like retrieval practice and interleaving, which force deeper engagement with the material (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).
What Pedagogy Research Has Taught Us
Pedagogy research provides practical strategies that align with neuroscience insights, helping us understand how to optimize learning environments:
Deliberate Practice: One of the most well-established findings in educational research is the importance of deliberate practice. Unlike passive or rote learning, deliberate practice focuses on honing specific skills through effortful repetition and immediate feedback. This approach helps students achieve automaticity, where foundational skills become second nature and free up cognitive resources for more complex problem-solving. This is why “deliberate practice” is regarded as the most effective training technique across talent domains (The Pedagogically Optimal Way to Learn Math).
Worked Examples to Reduce Cognitive Load: Especially in subjects like mathematics, worked examples are invaluable for novice learners. By showing step-by-step problem-solving processes, worked examples reduce cognitive load, allowing learners to focus on understanding the process rather than inventing solutions. This strategy is effective in reducing overwhelm, a key barrier to learning (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).
Active Learning for Deeper Understanding: Research consistently shows that active learning—engaging students in activities like problem-solving, discussion, and teaching others—leads to better retention and understanding than passive learning methods like lectures. However, this active engagement must be paired with direct instruction, especially for novices, to prevent cognitive overload (Why is the EdTech Industry So Damn Soft?).
Interleaving Practice: Interleaving, or mixing different topics or skills within a study session, forces the brain to continually retrieve and apply information, strengthening neural connections. While it may feel harder for learners, this desirable difficulty improves long-term retention and the ability to transfer knowledge to new contexts (Which Cognitive Psychology Findings Are Solid, That Can Be Used to Help Students Learn Better?).
Connecting It All: The Flywheel of Competence, Confidence, and Motivation
When neuroscience and pedagogy principles are applied in tandem, they create a reinforcing cycle that propels students toward continuous growth and mastery:
Competence: Effective learning techniques, such as deliberate practice and retrieval practice, build competence. As learners master fundamental skills, they achieve automaticity, allowing them to perform basic tasks effortlessly, freeing up mental resources for tackling more advanced problems (Automaticity for Cognitive Efficiency).
Confidence: With growing competence comes confidence. When learners see themselves succeeding—whether it's mastering a math concept or improving in a skill—they are more likely to tackle new challenges with a positive mindset. This confidence feeds into their willingness to engage with difficult tasks (Recreational Mathematics: Why Focus on Projects Over Puzzles).
Motivation: Confidence breeds motivation. As students become more confident in their abilities, they are more driven to continue learning. This motivation reinforces their engagement in deliberate practice, completing the flywheel and leading to greater competence over time. Accountability, whether through structured learning programs or paid educational platforms, also plays a role in keeping learners committed to their goals.
Key points and clarifications from select posts
Recreational Mathematics: Why Focus on Projects Over Puzzles (2 min read)
There’s only so much fun you can have trying to follow another person’s footsteps to arrive at a known solution. There’s only so much confidence you can build from fighting against a problem that someone else has intentionally set up to be well-posed and elegantly solvable if you think about it the right way.
The Situation with AI in STEM Education (11 min read)
The major limitation of LLMs in education is their reliance on student-initiated questions. Effective teachers don't simply answer questions; they guide students through a structured learning process, scaffolding information and addressing knowledge gaps. LLMs, like ChatGPT, primarily respond to prompts, lacking the pedagogical ability to anticipate a student's needs or direct their learning path.
The promise of AI in education overemphasizes the role of "explanation". Scaffolding and learning management are equally important. Justin cautions against prioritizing AI's ability to engage in conversational dialogue over its capacity to deliver well-structured, personalized learning experiences.
Optimized, Individualized Spaced Repetition in Hierarchical Knowledge Structures (22 min read)
Theoretical Maximum Learning Efficiency: In physics, nothing can travel faster than the speed of light. It is the theoretical maximum speed that any physical object can attain. A universal constant. In the context of spaced repetition, there is an analogous concept: theoretical maximum learning efficiency, which posits that in a perfectly encompassed body of knowledge, it's theoretically possible to achieve mastery through continuously learning new, progressively advanced topics without ever explicitly reviewing old material. This idea, while theoretical, underscores the power of leveraging knowledge interconnectedness.
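A toy sketch of why this can work: if practicing an advanced topic also exercises the simpler topics inside it, then one new lesson quietly reviews a whole chain of old ones. The topic names, the `ENCOMPASSES` mapping, and the function below are hypothetical, and the all-or-nothing credit is a simplification. This is not Math Academy's actual graph or algorithm.

```python
# Hypothetical mapping: each topic lists the simpler topics it exercises directly.
ENCOMPASSES = {
    "solving quadratics": ["factoring", "solving linear equations"],
    "factoring": ["multiplication"],
    "solving linear equations": ["arithmetic with fractions"],
    "multiplication": [],
    "arithmetic with fractions": [],
}

def implicitly_reviewed(topic: str, graph: dict[str, list[str]]) -> set[str]:
    """All topics that get practiced as a side effect of practicing `topic`."""
    seen: set[str] = set()
    stack = list(graph.get(topic, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the chain: whatever `current` exercises is exercised too.
            stack.extend(graph.get(current, []))
    return seen

if __name__ == "__main__":
    print(sorted(implicitly_reviewed("solving quadratics", ENCOMPASSES)))
```

In this toy graph, one quadratics lesson touches all four earlier topics, so none of them needs its own explicit review that day.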
Importance of Encompassing Graphs (as opposed to prerequisite graphs): Unlike prerequisite graphs which show learning dependencies, encompassing graphs map how practicing advanced topics reinforces prior knowledge. Constructing these graphs is a laborious, manual process requiring significant domain expertise, highlighting the importance of expert-designed learning pathways.
Talent Development vs Traditional Schooling (12 min read)
Orthogonality of Talent Development and Schooling: Traditional schooling, with its age-based grouping and standardized curricula, often fails to effectively nurture talent. This stark contrast emphasizes the need for specialized approaches outside the traditional classroom setting. Talent development is not only different from schooling, but in many cases completely orthogonal to schooling: "For one portion of our sample, talent development and schooling were almost two separate spheres of their life. ... Usually the student made the adjustments, resolving the conflict by doing all that was a part of schooling and then finding the additional time, energy, and resources for talent development. ... Mathematicians found and worked through special books and engaged in special projects and programs outside of school. Sometimes the schools or particular teachers made minor adjustments to dissipate the conflict. Mathematicians were sometimes excused from a class they were too advanced for and allowed to work on their own in the library. Sometimes they were accelerated one grade as a concession to their outside learning. ... Whether the individual or the school made these adjustments, it was clear that these adjustments minimized conflict but did little to assist in talent development. The individual was able to work at both schooling and talent development, although with minimum interaction. ... Talent development and schooling were isolated from one another. Schooling did not assist in talent development, but in these instances it did not interfere with talent development."
Individualized Instruction in Talent Development: Unlike the group-focused approach of schools, talent development thrives on personalized instruction, tailoring learning tasks to individual needs and ensuring mastery before moving on. This distinction underscores the importance of personalized learning pathways in maximizing potential.
A useful reminder from You Will Never Achieve Your Goals Unless You Transform Yourself Into a Person Who is Capable of Achieving Them:
You want to do something that sets you apart? You’re going to have to work harder than most.
Actually, let’s re-print the entire post:
The #1 confusion that I hear when people ask me about math, ML/AI, startups, etc., is they think there’s a way to achieve outsized success without putting in an outsized amount of work.
You want to do something that sets you apart? You’re going to have to work harder than most. There is no way around it.
You think you can get good at math by watching YouTube videos?
Develop cutting-edge ML/AI by asking ChatGPT to code it up for you?
Put a dent in the universe working 40 hours per week?
If you think any of those things, then you will never achieve your goals because you will never transform yourself into a person who is capable of achieving them.
And guess what? It’s not enough to simply work hard.
To achieve outsized success, it’s critical to not only put in enough time/effort, but also to work productively.
You have to work hard AND work smart.
And furthermore, work in a direction where you have some competitive advantage (or, at least, you’re not at a disadvantage).
Part of this work involves engaging in activities that maximize the likelihood of you getting some lucky breaks.
You have to work to maximize your luck surface area.
I have a friend from my college days who always used to say ridiculous catch-phrases with his personal mix of cheekiness and seriousness. Like if you didn’t go for the 5th set of squats he’d just dismiss you with “I guess I’ll just take you off my list of successful people for today”.
It was a phrase a bunch of us still parrot to this day in a joking way. Oh you didn’t moisturize after your shower? Off the list. Didn’t drink your coffee black? Just go to bed now and try again tomorrow.
The grindset earns its parody. But we don’t mock mediocrity because it’s suffered enough. We apologize for it readily but rarely our own. But we might bend over backward to apologize for others’ mediocrity. The reasons can range from genuine concern to signaling to grift. I don’t want to paint the motivation with a broad brush. Regardless of the motivation, excessive sympathizing without actually getting your hands dirty is just patronizing. If you care, then help.
A lot of education problems are not education problems so much as just family or stability problems. That can range from abuse to just having parents that are consistently crap decision-makers. Public school, as maligned as it often is, can be a refuge. A chance to get inspired by a great teacher or an authority figure whose influence counteracts the brainworms that might come from home life.
If you come from stability, this is hard to see and might even sound offensive. But kids are not possessions. It’s why we have laws to protect them but you’re free to stick your silverware in the microwave if you want. Where the lines of interference lie are legal matters (and by extension political — it comes with living in a representative government…shrug).
My belief is that education experimentation is good because making progress on instruction is a high-leverage activity. The fact that it is not evenly distributed because the spread is rate-limited by non-educational obstacles is not a reasonable objection to innovation. Of course, you can be skeptical or bearish. Hell, the education world is a mass of twisted hot metal. But resignation to extending an uninspiring status quo or accepting low standards is anything but progressive.
[There’s probably some smart-sounding argument that goes “you can’t fix education because you can’t fix society” to which you can only wonder, then what are we even doing here?]
When it comes to teaching and coaching there’s a delicate balance of toughness and love. It’s like parenting to be honest. It’s hard because it often hurts, plus it’s mired in bureaucracy.
On the supply side, great teachers might be scarce because the right mix of tough but fair + smart is just scarce in the population and now we have to choose a subset of those people. If you want great teachers you're asking for a legion of special individuals. Attracting special individuals requires a special effort to recruit, train, support, and enable. We get what we are willing to pay for. I don’t have a full understanding of the frictions that make our spend inefficient, but addressing them is independent of trying to make inroads with technologies that improve instructional efficiency.
🍰Justin has plenty of criticism on technology by the way: Why is the EdTech Industry So Damn Soft? (11 min read)
If you don’t see technology as being more than incrementally useful (at least on a longer-time scale) then you’ve given up.
Because we aren’t going to get much better without it.
(Collectively at least. The resourceful are going to have robot tutors if they can’t afford human ones. When you get down to it, it’s your move either way.)
Stay Groovy
☮️
What you are referring to is what music teachers have known before AI or tech. "The more you practice the more you enjoy" that practice is something you do for proficiency not perfection. Indeed playing an instrument involves all senses and mathematical interpretation which fires up the neurons used for a 2nd language model. It's also been proven by neuroscience that it's through rhythm that we learn language. Synapses firing at millisecond precision in harmony with the body's many functions.
Unfortunately our educational system is mingled with the religion of sports so students who are musically inclined or curious are often ripped away from music programs by "sporty" parents where most extra funding goes. So few schools will hire many music instructors and instead deem one teacher enough for the entire school. Music even at elementary levels can help students excel at all subjects.
> "At scale, we are going to find out just how many kids are capable of finishing Calc BC by grade 8 or publishing a novel in middle school."
Love this.
By the way, I believe this link is broken: 🔗Educational Ideas Inspired By Seymour Papert’s Constructionism (Moontower)