AI Educademy
🏋️
AI Sprout • Beginner • ⏱️ 15 min read

Training AI Models

You now know that neural networks learn by adjusting weights and biases. But how does the full training process actually work? How do you know when a model has learned enough - or too much? In this lesson, we will walk through the complete training journey.

The Training Loop

Training an AI model follows a cycle that repeats over and over:

  1. Predict - Feed data through the model and get a prediction.
  2. Compare - Check how far the prediction is from the correct answer.
  3. Adjust - Update the weights to reduce the error.
  4. Repeat - Do it again with the next batch of data.

This loop runs thousands or even millions of times. Each repetition nudges the model slightly closer to the right answers.
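The four steps above can be sketched as a tiny gradient-descent loop. This is a minimal illustration, not a real framework: the single weight `w`, the toy data, and the learning rate `lr` are all invented for the example, and the whole "model" is just `prediction = w * x`.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x          # the "correct answers": the rule to learn is y = 2x

w = 0.0              # start from a poor initial guess
lr = 0.01            # learning rate: how big each adjustment is

for step in range(1000):
    pred = w * x                          # 1. Predict
    loss = np.mean((pred - y) ** 2)       # 2. Compare (mean squared error)
    grad = np.mean(2 * (pred - y) * x)    # direction that reduces the loss
    w -= lr * grad                        # 3. Adjust
    # 4. Repeat

print(round(w, 3))   # w converges close to 2.0
```

Real frameworks automate the gradient step across millions of weights, but the shape of the loop is exactly this.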

[Figure: a circular diagram of the training loop - Predict, Compare, Adjust, Repeat.]
The training loop is the heartbeat of AI learning - predict, compare, adjust, and repeat.
🤯

Training GPT-4 reportedly cost over $100 million in computing power alone. The training loop ran across thousands of specialised chips for months.

Loss Functions: How Wrong Is the Model?

After each prediction, we need a way to measure how wrong the model was. This measurement is called the loss (or cost), and the formula that calculates it is the loss function.

  • Low loss = the prediction was close to the correct answer.
  • High loss = the prediction was far off.

Think of it like a dartboard. The bullseye is the correct answer. The loss is the distance from where your dart landed to the bullseye. The goal of training is to minimise that distance over time.

Common loss functions include:

  • Mean Squared Error (MSE) - measures the average squared distance between predictions and actual values. Used for predicting numbers.
  • Cross-Entropy Loss - measures how well predicted probabilities match the true categories. Used for classification tasks.
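Both losses are simple to compute by hand on toy numbers. The helpers below (`mse`, `cross_entropy`) and the example values are illustrative, not a library API; real frameworks ship hardened versions of both.

```python
import numpy as np

def mse(pred, actual):
    """Mean Squared Error: average squared distance from the truth."""
    return np.mean((pred - actual) ** 2)

def cross_entropy(probs, true_class):
    """Cross-entropy for one example: -log of the probability
    the model assigned to the correct class."""
    return -np.log(probs[true_class])

# Regression: predicting house prices (in thousands), each off by 10
print(mse(np.array([200.0, 310.0]), np.array([210.0, 300.0])))  # 100.0

# Classification: model says [cat: 0.7, dog: 0.2, bird: 0.1], truth is cat (index 0)
print(round(cross_entropy(np.array([0.7, 0.2, 0.1]), 0), 3))    # 0.357
```

Note the dartboard analogy in action: the closer the predicted probability for the true class is to 1.0, the closer the cross-entropy is to zero.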
🧠 Quiz

What does a loss function measure in AI training?

Lesson 4 of 16

Epochs: How Many Times Through the Data?

One complete pass through the entire training dataset is called an epoch. Training typically involves many epochs - the model sees the same data multiple times, getting slightly better each round.

  • Epoch 1: The model makes many mistakes; loss is high.
  • Epoch 10: The model has improved significantly; loss is dropping.
  • Epoch 50: Improvements slow down; the model is nearing its best.
  • Epoch 200: The model might start memorising - which brings us to our next topic.
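In code, "one epoch" simply means one full pass over the dataset, usually in mini-batches. The sketch below reuses the toy single-weight setup; `fit_epoch`, the data, and the hyperparameters are all made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 3.0 * x                      # the rule to learn is y = 3x
w, lr, batch_size = 0.0, 0.01, 2

def fit_epoch(w):
    """One epoch: visit every mini-batch once, adjusting w after each."""
    for i in range(0, len(x), batch_size):
        xb, yb = x[i:i + batch_size], y[i:i + batch_size]
        grad = np.mean(2 * (w * xb - yb) * xb)
        w -= lr * grad
    return w

for epoch in range(1, 51):
    w = fit_epoch(w)
    if epoch in (1, 10, 50):
        loss = np.mean((w * x - y) ** 2)
        print(f"epoch {epoch}: loss = {loss:.6f}")  # loss shrinks each epoch
```

Printed losses fall rapidly at first and then flatten out, matching the epoch-1 / epoch-10 / epoch-50 pattern described above.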
🤔
Think about it:

Revising for an exam is like running epochs. The first read-through is confusing, but each review builds understanding. However, if you re-read the same notes a hundred times, you might memorise the exact wording without truly understanding the concepts. AI has the same problem.

Overfitting: The Student Who Memorises

Overfitting is one of the most common problems in AI training. It happens when the model learns the training data too well - including its noise and quirks - and fails to perform on new, unseen data.

Imagine a student who memorises every past exam paper word for word. They score perfectly on old papers but struggle when the questions change even slightly. The student has not learned the subject - they have memorised the answers.

Signs of overfitting:

  • Training accuracy is very high (e.g., 99%).
  • Performance on new data is much worse (e.g., 75%).
  • The model has essentially memorised the training examples.
💡

The goal of training is not to score perfectly on data the model has already seen. It is to perform well on data it has never seen before. That is the true test of learning.
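The memorising student can be demonstrated with the simplest possible "model": 1-nearest-neighbour, which literally memorises the training set and answers with the closest remembered example. The data below, including the one noisy label at x = 4, is invented for illustration; the true rule is that points below 7 belong to class 0.

```python
train_x = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0]
train_y = [0,   0,   0,   1,    1,    1,    1]   # x = 4 is mislabelled noise

def predict(x):
    """Return the label of the closest memorised training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

# Perfect on training data: the model memorised every point, noise included.
train_acc = sum(predict(x) == y for x, y in zip(train_x, train_y)) / len(train_x)

# New, unseen data following the true rule (class 0 below 7):
new_x = [0.5, 2.5, 4.5, 5.0, 9.5, 11.5]
new_y = [0,   0,   0,   0,   1,   1]
new_acc = sum(predict(x) == y for x, y in zip(new_x, new_y)) / len(new_x)

print(train_acc, new_acc)   # 1.0 on training data, lower on new data
```

The model scores 100% on what it memorised but misclassifies new points near the noisy label - exactly the accuracy gap listed above.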

Underfitting: The Student Who Doesn't Study

The opposite problem is underfitting. This happens when the model has not learned enough from the data. It performs poorly on both training data and new data.

Causes of underfitting include:

  • The model is too simple for the complexity of the problem.
  • Training stopped too early (not enough epochs).
  • The features in the data are not informative enough.

If overfitting is like memorising past papers, underfitting is like walking into the exam having barely opened the textbook.

🧠 Quiz

A model scores 98% accuracy on training data but only 60% on new data. What is the most likely problem?

Validation and Test Sets

To detect overfitting and underfitting, we split our data into three parts:

| Set | Purpose | When Used |
|-----|---------|-----------|
| Training set | The model learns from this data | During training |
| Validation set | Used to check progress and tune settings | During training |
| Test set | Final evaluation on completely unseen data | After training |

A common split is 70% training, 15% validation, and 15% test. The model never sees the test set until the very end - it is the final exam.
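A minimal sketch of that 70/15/15 split, assuming the data fits in a NumPy array. Shuffling before splitting matters so that each set is representative; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
data = np.arange(1000)            # stand-in for 1000 examples
rng.shuffle(data)                 # shuffle first so the split is random

n = len(data)
n_train = int(0.70 * n)           # 700 examples
n_val = int(0.15 * n)             # 150 examples

train_set = data[:n_train]
val_set = data[n_train:n_train + n_val]
test_set = data[n_train + n_val:]  # remaining 150: untouched until the very end

print(len(train_set), len(val_set), len(test_set))  # 700 150 150
```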

🤔
Think about it:

The validation set is like a practice test you take between study sessions. It tells you how well you are learning without spoiling the real exam. If your practice-test scores start dropping while your scores on the notes you revise keep rising, you know something is wrong.

When to Stop Training

Knowing when to stop is crucial. Train too little and the model underfits. Train too much and it overfits. The sweet spot is where validation loss stops improving.

A technique called early stopping automates this:

  1. Monitor the validation loss after each epoch.
  2. If it has not improved for a set number of epochs (called patience), stop training.
  3. Roll back to the weights from the best epoch.

This prevents the model from going past the point of useful learning and slipping into memorisation.
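The three steps can be sketched directly. The validation-loss history below is made up rather than produced by a real training run, and in a real system "roll back" would mean restoring weights saved at the best epoch.

```python
val_losses = [0.90, 0.70, 0.55, 0.48, 0.47, 0.47, 0.48, 0.49, 0.50, 0.51]
patience = 3   # stop after 3 epochs with no improvement

best_loss = float("inf")
best_epoch = 0
epochs_without_improvement = 0

for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        best_loss = loss                 # new best: remember these weights
        best_epoch = epoch
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"stopping at epoch {epoch}, rolling back to epoch {best_epoch}")
            break
```

With this history, training stops at epoch 7 and rolls back to epoch 4, where validation loss bottomed out at 0.47.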

🧠 Quiz

What is 'early stopping' in AI training?

🤯

Some modern training runs use a technique called learning rate scheduling, which gradually reduces how much the weights change with each step - like taking smaller and more careful steps as you approach the summit of a mountain.
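One simple such schedule is exponential decay: shrink the learning rate by a constant factor each epoch. The starting rate and decay factor below are arbitrary illustrative values, not settings from any real training run.

```python
initial_lr = 0.1   # arbitrary starting learning rate
decay = 0.9        # multiply the rate by 0.9 after every epoch

for epoch in range(5):
    lr = initial_lr * decay ** epoch
    print(f"epoch {epoch}: lr = {lr:.4f}")  # 0.1000, 0.0900, 0.0810, 0.0729, 0.0656
```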

Key Takeaways

  • The training loop repeats: predict → compare → adjust → repeat.
  • A loss function measures how far predictions are from the truth.
  • An epoch is one full pass through the training data.
  • Overfitting means memorising data; underfitting means not learning enough.
  • Data is split into training, validation, and test sets.
  • Early stopping prevents training from going too far.

In the final lesson, we will explore the ethical dimensions of AI - bias, fairness, privacy, and what responsible AI looks like.