Linear Regression Demo

Train a linear regression model on housing data and make predictions. This demo shows how machine learning models can predict house prices based on features like area, bedrooms, age, location, and crime rate.

Step 1: Training Data

View and edit the training data. Each row represents a house with features (area, bedrooms, age, location, crime rate) and its actual price. You can add, edit, or delete rows. If no data is provided, the system will generate synthetic data.
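As a rough illustration of what such rows might look like, the sketch below builds a few synthetic houses in plain Python. The value ranges, the numeric encoding of location, the price rule, and the function name make_synthetic_rows are all assumptions made for this example; the demo's own generator may work differently.

```python
import random

def make_synthetic_rows(n, seed=0):
    """Build n synthetic house rows (hypothetical ranges and price rule)."""
    rng = random.Random(seed)  # seeded so the rows are reproducible
    rows = []
    for _ in range(n):
        row = {
            "area": rng.uniform(500, 3500),   # assumed range, unit left open
            "bedrooms": rng.randint(1, 5),
            "age": rng.randint(0, 60),        # years
            "location": rng.randint(1, 3),    # assumed numeric location score
            "crime_rate": rng.uniform(0, 10),
        }
        # Hypothetical linear price rule plus Gaussian noise.
        row["price"] = (
            50_000
            + 150 * row["area"]
            + 10_000 * row["bedrooms"]
            - 500 * row["age"]
            + 20_000 * row["location"]
            - 3_000 * row["crime_rate"]
            + rng.gauss(0, 10_000)
        )
        rows.append(row)
    return rows

rows = make_synthetic_rows(5)
for row in rows:
    print(row)
```

Each dictionary corresponds to one table row: five feature values plus the price the model is trained to reproduce.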

No training data yet. Click "Train with Default Data" to generate synthetic data, or "Add Row" to create custom data.

⚠️ Train the model first

Please complete Step 1 to train the model before making predictions.

Step 2: Make Predictions

Enter house features below to predict the price using the trained model.

About Linear Regression

Linear regression is a fundamental machine learning algorithm that models the relationship between a dependent variable (house price) and one or more independent variables (features).

Model Equation

y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βᵣxᵣ + ε

y: Observed house price (dependent variable)
β₀: Intercept (base price)
β₁, β₂, ..., βᵣ: Coefficients (weights for each feature)
x₁, x₂, ..., xᵣ: Features (area, bedrooms, age, location, crime rate)
ε: Error term (residual)
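To make the equation concrete, here is a minimal sketch that evaluates it for one house, leaving out the error term ε. The intercept and coefficient values are hypothetical numbers picked for illustration, not values fitted by this demo, and location is assumed to be encoded as a numeric score.

```python
# Hypothetical parameters, chosen only for illustration (not fitted values).
INTERCEPT = 50_000.0  # β₀: base price
COEFFICIENTS = {
    "area": 150.0,          # price added per unit of area
    "bedrooms": 10_000.0,   # price added per bedroom
    "age": -500.0,          # price lost per year of age
    "location": 20_000.0,   # assumes location is a numeric score
    "crime_rate": -3_000.0, # price lost per unit of crime rate
}

def predict_price(features):
    """Return β₀ + Σ βⱼ·xⱼ for one house (the model equation without ε)."""
    weighted_sum = sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)
    return INTERCEPT + weighted_sum

house = {"area": 1200.0, "bedrooms": 3.0, "age": 10.0, "location": 2.0, "crime_rate": 1.5}
price = predict_price(house)
print(price)  # 290500.0
```

Each coefficient reads as the change in predicted price when its feature increases by one unit and the other features stay fixed.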

Cost Function (Mean Squared Error)

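J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

= \frac{1}{2m} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right)^2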

J(θ): Cost function (objective to minimize)
hθ(x⁽ⁱ⁾): Hypothesis function, i.e. the model's prediction for example i
θ: Parameter vector [β₀, β₁, β₂, ..., βᵣ]
θᵀx⁽ⁱ⁾: Dot product of parameters and features (θ₀ + θ₁x₁ + ... + θᵣxᵣ, taking x₀ = 1 so that θ₀ acts as the intercept)
y⁽ⁱ⁾: Actual target value for example i
m: Number of training examples
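The cost can be computed directly from these definitions. The sketch below is a plain-Python version that assumes each feature vector starts with a constant 1 for the intercept; the toy data and the function names predict and cost are made up for this example.

```python
def predict(theta, x):
    """Hypothesis h_theta(x) = theta^T x; x[0] is assumed to be 1 (intercept)."""
    return sum(t * x_j for t, x_j in zip(theta, x))

def cost(theta, X, y):
    """J(theta) = 1/(2m) * sum of squared prediction errors over m examples."""
    m = len(y)
    squared_errors = ((predict(theta, x_i) - y_i) ** 2 for x_i, y_i in zip(X, y))
    return sum(squared_errors) / (2 * m)

# Toy data generated exactly by y = 1 + 2x (leading 1 is the intercept column).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]

cost_perfect = cost([1.0, 2.0], X, y)  # parameters that match the data
cost_zero = cost([0.0, 0.0], X, y)     # all-zero parameters

print(cost_perfect)  # 0.0
print(cost_zero)     # (1 + 9 + 25) / 6, roughly 5.83
```

Parameters that reproduce the data give a cost of zero; any other choice gives a larger value, which is what training tries to drive down.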

How It Works

  1. Training Phase: The algorithm finds the optimal parameters θ that minimize the cost function J(θ) using gradient descent or the normal equation.
  2. Prediction Phase: Once trained, predictions are made using the model equation: ŷ = θᵀx (where ŷ is the predicted value).
  3. This Demo: Uses scikit-learn's LinearRegression, which fits ordinary least squares with a direct (non-iterative) least-squares solver rather than gradient descent. The coefficients it returns minimize the mean squared error, just as the normal equation solution does.
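The two training approaches named in step 1 can be compared on a tiny one-feature problem. The sketch below is plain Python and does not use scikit-learn. The data, learning rate, and step count are assumptions chosen so that gradient descent converges, and the closed-form function is the one-feature special case of the normal equation.

```python
def fit_closed_form(xs, ys):
    """One-feature least squares: slope = Sxy / Sxx, intercept from the means."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def fit_gradient_descent(xs, ys, learning_rate=0.1, steps=5000):
    """Batch gradient descent on J = 1/(2m) * sum((b0 + b1*x - y)^2)."""
    m = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        errors = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        grad_b0 = sum(errors) / m                                # dJ/db0
        grad_b1 = sum(e * x for e, x in zip(errors, xs)) / m     # dJ/db1
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1

# Made-up data: area in thousands of units, price in thousands.
areas = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [150.0, 200.0, 260.0, 300.0, 360.0]

b0_cf, b1_cf = fit_closed_form(areas, prices)
b0_gd, b1_gd = fit_gradient_descent(areas, prices)

# Prediction phase: apply the fitted parameters to a new house.
predicted = b0_cf + b1_cf * 2.2

print(b0_cf, b1_cf)
print(b0_gd, b1_gd)
print(predicted)
```

Both methods should arrive at essentially the same intercept and slope here. The closed form gets there in one pass, while gradient descent needs many small steps and a suitable learning rate.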
