Consolidated into: ML Overview — Complete.md (foundations section). This note is preserved as a hands-on worked example.

[[Types of nets]]

[[dump]] Since you are an A-Level student who codes "by hand" and has a background in Physics and Math, the best way to understand this is to build a minimalist "Brain" in about 30 lines of Python.

We are going to solve a simple Astronomy-themed problem: Is this star a "Giant"?

  • Input 1: Temperature (Normalized 0 to 1)

  • Input 2: Luminosity (Normalized 0 to 1)

  • Output: 0 (Dwarf) or 1 (Giant)

The 15-Minute "Hand-Coded" Experiment

Copy this into a Python file. I’ve used numpy only for the matrix math to keep it clean, but the logic is entirely manual.

Python

import numpy as np

# 1. The Data: [Temperature, Luminosity] -> Is it a Giant?
X = np.array([[0.1, 0.1], [0.2, 0.3], [0.8, 0.9], [0.9, 0.8]]) # Inputs
y = np.array([[0], [0], [1], [1]])                            # Labels

# 2. Setup: 2 Inputs -> 3 Hidden Neurons -> 1 Output
np.random.seed(42)
weights1 = np.random.rand(2, 3) # Weights for Layer 1
weights2 = np.random.rand(3, 1) # Weights for Layer 2

# Activation function (Sigmoid) and its derivative
def sigmoid(x): return 1 / (1 + np.exp(-x))
# Note: this expects a value that has ALREADY been through sigmoid,
# so it computes s * (1 - s) rather than differentiating from scratch
def sigmoid_derivative(x): return x * (1 - x)

# 3. The Training Loop (The "Gardening")
for epoch in range(5000):
    # --- FORWARD PASS (The Guess) ---
    layer1 = sigmoid(np.dot(X, weights1))
    output = sigmoid(np.dot(layer1, weights2))

    # --- BACKPROPAGATION (The Blame Game) ---
    # How much did we miss by?
    error = y - output

    # Calculate how to nudge weights2 (Output layer)
    d_weights2 = np.dot(layer1.T, error * sigmoid_derivative(output))

    # Calculate how to nudge weights1 (Hidden layer) using the Chain Rule
    error_layer1 = np.dot(error * sigmoid_derivative(output), weights2.T)
    d_weights1 = np.dot(X.T, error_layer1 * sigmoid_derivative(layer1))

    # Update the weights (Gradient Descent; the implicit learning rate here is 1)
    weights2 += d_weights2
    weights1 += d_weights1

# 4. The Result
print("Predictions after training:")
print(output)

What just happened? (The Physics Intuition)

  1. Initialization: The weights started as random noise. The network had no idea that "high luminosity = Giant."

  2. The Forward Pass: This is just a series of dot products. In physics terms, think of it as a signal propagation. The temperature and luminosity signals travel through the "wires" (weights), getting amplified or dampened.

  3. The Error: We calculated the delta between the "Observed" (the label) and "Calculated" (the prediction).

  4. The Backward Pass (Calculus in action): We used the Chain Rule ($\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$) to find out exactly how much each individual weight contributed to that error.

    • If a weight in Layer 1 was sending a "This is a Giant" signal for a Dwarf star, the math produced a negative gradient, "pulling" that weight down.
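Written out, the nudge the code computes for weights2 is that chain rule applied once (a sketch using the snippet's own variables: $a_1$ = layer1, $\hat{y}$ = output, $e = y - \hat{y}$):

$$\Delta W_2 = a_1^{\top}\big(e \odot \hat{y} \odot (1 - \hat{y})\big)$$

The $\hat{y} \odot (1 - \hat{y})$ factor is sigmoid_derivative(output), $\odot$ is element-wise multiplication, and weights2 += d_weights2 then applies the nudge with an implicit learning rate of 1.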

Why this is different from Go/C# Programming

In your Space Colony sim, you might write:

if star.luminosity > 0.7: type = GIANT

In the code above, you never wrote that rule. The loop ran 5,000 times, and the weights "settled" into a state where they naturally mimic that rule. If you look at weights1 after training, you'll see that some neurons became very sensitive to the luminosity column—they self-assigned that role.
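You can check this yourself. Here is the training run from above condensed into one self-contained snippet (same seed, data, and update rule) that then prints weights1 one input-row at a time:

```python
import numpy as np

np.random.seed(42)
X = np.array([[0.1, 0.1], [0.2, 0.3], [0.8, 0.9], [0.9, 0.8]])
y = np.array([[0], [0], [1], [1]])
w1, w2 = np.random.rand(2, 3), np.random.rand(3, 1)
sig = lambda v: 1 / (1 + np.exp(-v))

for _ in range(5000):
    a1 = sig(X @ w1)            # hidden layer
    out = sig(a1 @ w2)          # output layer
    err = y - out
    d2 = err * out * (1 - out)
    d1 = (d2 @ w2.T) * a1 * (1 - a1)
    w2 += a1.T @ d2
    w1 += X.T @ d1

# Row 0 holds the weights attached to Temperature, row 1 to Luminosity
print("Temperature row of weights1:", w1[0])
print("Luminosity  row of weights1:", w1[1])
```

Large-magnitude entries in a column mark a hidden neuron that has made itself sensitive to that input.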

Next Step

Try changing the input data X to something that doesn't follow a simple rule (like a checkerboard pattern). You'll notice that with only 3 neurons in the hidden layer, it might struggle. If you increase it to 10 neurons, it will suddenly "get it."
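Here is a sketch of that checkerboard experiment. Two tweaks that the original snippet doesn't have: a bias column of 1s (the checkerboard/XOR pattern can't be learned without one) and initial weights centred on zero; the `train` helper name is mine.

```python
import numpy as np

def train(hidden, epochs=20000, seed=0):
    # The four corners of a checkerboard (this is XOR)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    X = np.hstack([X, np.ones((4, 1))])   # bias column of 1s
    rng = np.random.default_rng(seed)
    w1 = rng.uniform(-1, 1, (3, hidden))  # zero-centred init
    w2 = rng.uniform(-1, 1, (hidden, 1))
    sig = lambda v: 1 / (1 + np.exp(-v))
    for _ in range(epochs):
        a1 = sig(X @ w1)
        out = sig(a1 @ w2)
        err = y - out
        d2 = err * out * (1 - out)
        d1 = (d2 @ w2.T) * a1 * (1 - a1)
        w2 += a1.T @ d2
        w1 += X.T @ d1
    return float(np.mean(np.abs(err)))

print("Mean error with  3 hidden neurons:", train(3))
print("Mean error with 10 hidden neurons:", train(10))
```

Run it a few times with different seeds: the wider layer reliably finds the pattern, while the narrow one sometimes gets stuck.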

Would you like me to explain how we use the "loss function" to measure exactly how "wrong" the network is during those 5,000 steps?
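As a taste of that, here is the same training loop with a mean squared error loss logged as it shrinks (a minimal sketch; MSE is one common choice of loss function, not the only one):

```python
import numpy as np

np.random.seed(42)
X = np.array([[0.1, 0.1], [0.2, 0.3], [0.8, 0.9], [0.9, 0.8]])
y = np.array([[0], [0], [1], [1]])
w1, w2 = np.random.rand(2, 3), np.random.rand(3, 1)
sig = lambda v: 1 / (1 + np.exp(-v))

losses = []
for epoch in range(5000):
    a1 = sig(X @ w1)
    out = sig(a1 @ w2)
    err = y - out
    # Mean Squared Error: one number for "how wrong are we overall?"
    loss = float(np.mean(err ** 2))
    losses.append(loss)
    if epoch % 1000 == 0:
        print(f"epoch {epoch:4d}  loss {loss:.4f}")
    d2 = err * out * (1 - out)
    d1 = (d2 @ w2.T) * a1 * (1 - a1)
    w2 += a1.T @ d2
    w1 += X.T @ d1

print(f"final loss {losses[-1]:.4f}")
```

Watching that number fall is exactly what "the network is learning" means.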

Neural Networks from Scratch

A video walkthrough of building a neural network from the ground up helps clarify how the mathematics translates into code.

In the code I gave you, the neurons were definitely there, but they were "hidden" inside the matrix math (np.dot). For a programmer who likes to build things by hand, matrix math can feel like a "black box" because it processes everything at once.

To properly understand a neuron, you have to look at it as an individual object with three specific parts. Think of it like a Logic Gate with a Volume Knob.


Anatomy of a Single Neuron

If we were to write this in a more "manual" way (similar to how you might structure a system in Go or C#), a single neuron looks like this:

  1. Weights (The Importance): Every input has a weight. If a neuron is looking for "High Temperature," the weight for the temperature input will be a high positive number.

  2. Bias (The Threshold): This is a number added at the end. It’s like the neuron saying, "I need at least this much signal before I even care."

  3. Activation Function (The Firing): This decides the final output (usually between 0 and 1).

Hand-Coding a "Neuron" Class

Since you use C# and Go, you probably prefer seeing the "structure" of the data. Here is how that previous matrix math looks when you break it down into a single "Neuron" entity:

Python

import math

class Neuron:
    def __init__(self, input_count):
        # Start every weight at 0.5 and the bias at 0.1 for clarity
        # (a real network would randomize these)
        self.weights = [0.5 for _ in range(input_count)]
        self.bias = 0.1

    def forward(self, inputs):
        # 1. Weighted Sum (Dot Product done by hand)
        total = 0
        for i in range(len(inputs)):
            total += inputs[i] * self.weights[i]

        # 2. Add Bias
        total += self.bias

        # 3. Activation (The "Sigmoid" function)
        # This squashes the number to be between 0 and 1
        return 1 / (1 + math.exp(-total))

# Example: A star with [Temperature: 0.8, Luminosity: 0.9]
my_neuron = Neuron(input_count=2)
print(f"Neuron Output: {my_neuron.forward([0.8, 0.9])}")

How they become a "Network"

In your A-Level Physics, you deal with circuits. A Neural Network is just a series of these neurons wired together:

  1. The Input Layer: This isn't really made of neurons; it’s just your data (Temperature, Luminosity).

  2. The Hidden Layer: A list of Neuron objects. Each one receives all the inputs.

    • Why? Because one neuron might decide to be the "Hot Star Specialist" while another becomes the "Bright Star Specialist."
  3. The Output Layer: One or more neurons that take the outputs of the previous layer and give you the final answer.
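To make that wiring concrete, here is the Neuron class from above stacked into a hand-rolled two-layer network, forward pass only (the `Layer` name and the 0.5/0.1 starting values are just illustrative):

```python
import math

class Neuron:
    def __init__(self, input_count):
        self.weights = [0.5 for _ in range(input_count)]
        self.bias = 0.1

    def forward(self, inputs):
        # Weighted sum + bias, squashed through sigmoid
        total = sum(x * w for x, w in zip(inputs, self.weights)) + self.bias
        return 1 / (1 + math.exp(-total))

class Layer:
    def __init__(self, neuron_count, input_count):
        # Every neuron in the layer receives ALL the inputs
        self.neurons = [Neuron(input_count) for _ in range(neuron_count)]

    def forward(self, inputs):
        return [n.forward(inputs) for n in self.neurons]

# 2 inputs -> 3 hidden neurons -> 1 output neuron
hidden = Layer(neuron_count=3, input_count=2)
output = Layer(neuron_count=1, input_count=3)

star = [0.8, 0.9]  # [Temperature, Luminosity]
prediction = output.forward(hidden.forward(star))
print(f"Network output: {prediction[0]}")
```

Chaining `forward` calls like this is exactly what the np.dot version does in one shot.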


The "Aha!" Moment

The reason we use Matrices (like in my first code snippet) instead of Classes (like the one above) is simply Speed.

  • Doing a for loop over 2,000 neurons in Python or C# is slow.

  • Your GPU (the GTX 1650 you mentioned) is designed to do thousands of "Weighted Sums" at the exact same time. Matrix math is just a way to tell the computer: "Hey, do every neuron in the layer at once."
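You can verify the two are the same computation, and time them, with a toy comparison (the sizes here are arbitrary):

```python
import time
import numpy as np

rng = np.random.default_rng(1)
inputs = rng.random((1000, 2))    # 1,000 stars, 2 features each
weights = rng.random((2, 256))    # one layer of 256 neurons

# The "by hand" way: one weighted sum per neuron, per star
start = time.perf_counter()
loop_result = np.empty((1000, 256))
for i in range(1000):
    for j in range(256):
        loop_result[i, j] = (inputs[i, 0] * weights[0, j]
                             + inputs[i, 1] * weights[1, j])
loop_time = time.perf_counter() - start

# The matrix way: all neurons, all stars, one call
start = time.perf_counter()
matrix_result = inputs @ weights
matrix_time = time.perf_counter() - start

print(f"Same answers: {np.allclose(loop_result, matrix_result)}")
print(f"Loop: {loop_time:.4f}s  Matrix: {matrix_time:.6f}s")
```

Same numbers either way; the matrix call just hands the whole job to optimized native code instead of the Python interpreter.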

Try this for 5 minutes:

Go back to the first code I gave you and look at this line:

weights1 = np.random.rand(2, 3)

That line is literally creating 3 neurons, each with 2 weights.

  • If you change that 3 to a 100, you have just built a layer with 100 neurons.

  • The rest of the math stays exactly the same.

Since you’re into high-performance computing (DOTS/ECS), would you like to know how we actually "flatten" these neurons into memory-efficient arrays so they run fast on a GPU?