james974 6d ago • 0 views

Simple Linear Regression Sample Code in Python

Hey there! 👋 Ever wondered how to predict something using just a line? Simple Linear Regression in Python is your answer! I remember being so confused by it at first, but once I understood the basics, it became super useful for all sorts of projects. Let's break down the code and see how it works! 📈

1 Answer

✅ Best Answer

📚 What is Simple Linear Regression?

Simple Linear Regression is a statistical method used to model the relationship between a single independent variable (predictor) and a dependent variable (response) by fitting a linear equation to observed data. In simpler terms, it's about finding the best-fit line through a scatterplot of data points.

📜 A Brief History

The method of least squares, the core technique behind linear regression, dates back to the early 19th century. Adrien-Marie Legendre published it first, in 1805; Carl Friedrich Gauss published his own account in 1809 and claimed to have been using the method since 1795. In the 1880s, Sir Francis Galton introduced the term "regression" while studying hereditary traits such as height.

✨ Key Principles of Simple Linear Regression

  • πŸ“Š Linear Relationship: Assumes a linear relationship between the independent and dependent variables. This means the change in the dependent variable is constant for every unit change in the independent variable.
  • πŸ“‰ Least Squares Method: The goal is to minimize the sum of the squares of the differences between the observed values and the values predicted by the regression line. This is done by finding the values for the slope ($b$) and intercept ($a$) that minimize the residual sum of squares (RSS). The equation for the line is: $y = a + bx$, where $y$ is the dependent variable and $x$ is the independent variable.
  • 🚫 Independence of Errors: Assumes that the errors (residuals) are independent of each other. This means that the error for one data point does not influence the error for another data point.
  • 🍎 Homoscedasticity: Assumes that the variance of the errors is constant across all levels of the independent variable. In other words, the spread of the residuals should be roughly the same across the range of $x$ values.
  • 🌱 Normality of Errors: Assumes that the errors are normally distributed. This assumption is important for hypothesis testing and constructing confidence intervals.

🐍 Simple Linear Regression Code Example in Python

Here's a basic example using the statsmodels library in Python.


import numpy as np
import statsmodels.api as sm

# Sample data
x = np.array([1, 2, 3, 4, 5])  # Independent variable
y = np.array([2, 4, 5, 4, 5])  # Dependent variable

# Add a constant to the independent variable (for the intercept)
X = sm.add_constant(x)

# Fit the linear regression model
model = sm.OLS(y, X)
results = model.fit()

# Print the results
print(results.summary())

# Extract the intercept and slope (params follows the column order of X)
intercept = results.params[0]    # a: intercept (the constant column comes first)
coefficient = results.params[1]  # b: slope

print(f"Intercept: {intercept}")
print(f"Coefficient: {coefficient}")

# Making predictions
new_x = np.array([6, 7, 8])
# has_constant="add" forces the column of ones even if new_x holds a single value
new_X = sm.add_constant(new_x, has_constant="add")
predictions = results.predict(new_X)
print(f"Predictions: {predictions}")

⚙️ Explanation of the Code

  • 📦 Import Libraries:
    • numpy for numerical operations (creating arrays).
    • statsmodels.api for statistical modeling, including linear regression.
  • 🔢 Create Sample Data:
    • x: Independent variable (predictor).
    • y: Dependent variable (response).
  • ➕ Add Constant:
    • sm.add_constant(x) adds a column of ones to the independent variable array. This is necessary to estimate the intercept in the linear regression model.
  • 🧪 Fit the Model:
    • sm.OLS(y, X) creates an OLS (Ordinary Least Squares) model.
    • results = model.fit() fits the model to the data using the least squares method.
  • 📈 Print Summary:
    • results.summary() prints a summary of the regression results, including coefficients, standard errors, t-statistics, p-values, and R-squared.
  • 💡 Extract Coefficients:
    • results.params[0] extracts the intercept.
    • results.params[1] extracts the coefficient (slope).
  • 🎯 Make Predictions:
    • Create new independent variable values (new_x).
    • Add a constant to the new independent variable values (new_X).
    • Use results.predict(new_X) to predict the corresponding dependent variable values.

🌍 Real-world Examples

  • 🌑️ Temperature and Ice Cream Sales: Predicting ice cream sales based on temperature. The higher the temperature, the more ice cream you sell!
  • πŸ“° Advertising Spend and Sales: Predicting sales based on advertising expenditure. More advertising usually leads to more sales.
  • ⏳ Years of Experience and Salary: Predicting salary based on years of experience. Generally, more experience correlates with a higher salary.

🎓 Conclusion

Simple Linear Regression is a powerful tool for understanding and predicting relationships between variables. By using Python and libraries like statsmodels, you can easily implement and analyze linear regression models. Understanding the principles and assumptions behind linear regression is crucial for interpreting the results and making informed decisions.
