Beginner · 3 sessions · 6 hours · Python

Week 1: Python Fundamentals for Data Science

Topics covered: Installation, variables, data types, operators, control flow, functions, Git

Learning objectives: By the end of this week you will be able to apply Python fundamentals (variables, data types, operators, control flow, and functions) to real datasets, write executable Python code for each technique, and complete both graded assignments independently.

Session 1: Python Installation, Variables and Data Types

Python is one of the most widely used languages in data science. Install Anaconda Distribution (anaconda.com), which bundles Python 3.11, Jupyter Notebook, and 250+ scientific packages. A variable is a name bound to a value; assignment with = creates it, and no type declaration is needed. Python has four basic scalar types: int (whole numbers), float (decimal numbers), str (text), and bool (True/False). The built-in type() function reports the type of any value.

student_name = 'Amara Okafor'
age = 28
height_m = 1.72
is_enrolled = True

print(type(student_name))  # <class 'str'>
print(type(age))            # <class 'int'>

# Type conversion
year_str = '2024'
year_int = int(year_str)
print(year_int + 1)  # 2025
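
The remaining two scalar types and a few more conversions can be checked the same way. This is a minimal sketch; the names price_str, price, whole, and label are illustrative and not part of the course materials.

```python
# float and bool complete the four scalar types
height_m = 1.72
is_enrolled = True
print(type(height_m))      # <class 'float'>
print(type(is_enrolled))   # <class 'bool'>

# Converting between types (illustrative values)
price_str = '19.99'
price = float(price_str)   # str -> float
whole = int(price)         # float -> int truncates toward zero
label = str(whole)         # int -> str
print(price, whole, label) # 19.99 19 19

# int() rejects text that is not a whole number
try:
    int('19.99')
except ValueError as err:
    print(f'Conversion failed: {err}')
```

Note that int() applied to a float drops the fractional part rather than rounding, so int(19.99) is 19.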

Session 2: Operators and Control Flow

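Python supports 7 arithmetic operators (+, -, *, /, //, %, **). The / operator always returns a float, // performs floor division, % gives the remainder, and ** raises to a power. Comparison operators (==, !=, <, >, <=, >=) return booleans. Logical operators (and, or, not) combine boolean expressions. Conditional execution uses if/elif/else; each branch is a block marked by indentation, conventionally 4 spaces. For loops iterate over any sequence, either a range() of integers or the items of a list directly.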

# Conditional logic - loan risk classification
credit_score = 680
income = 350000  # NGN annual

if credit_score >= 750 and income >= 500000:
    risk_tier = 'Prime'
elif credit_score >= 650 and income >= 250000:
    risk_tier = 'Standard'
elif credit_score >= 550:
    risk_tier = 'Sub-prime'
else:
    risk_tier = 'Declined'

print(f'Loan decision: {risk_tier}')

# For loop with enumerate
datasets = ['titanic', 'iris', 'boston_housing']
for i, name in enumerate(datasets, start=1):
    print(f'{i}. Loading: {name}.csv')
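
The operators and range() described at the start of this session do not appear in the examples above, so here is a short sketch of each. The operands a and b and the list evens are illustrative values chosen for this example.

```python
# Arithmetic operators (illustrative operands)
a, b = 17, 5
print(a + b)    # 22
print(a - b)    # 12
print(a * b)    # 85
print(a / b)    # 3.4  (true division always returns a float)
print(a // b)   # 3    (floor division)
print(a % b)    # 2    (remainder)
print(a ** 2)   # 289  (exponentiation)

# Comparison and logical operators produce booleans
in_range = a > 10 and a < 20
print(in_range)           # True
print(not in_range)       # False
print(a == b or b < 10)   # True

# range(start, stop, step) excludes the stop value
evens = []
for n in range(0, 10, 2):
    evens.append(n)
print(evens)  # [0, 2, 4, 6, 8]
```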

Session 3: Functions and Git Basics

A function is a reusable named block of code. Define one with def, list its parameters in parentheses, and use return to send a value back to the caller. Parameters with default values let callers omit the corresponding arguments. Git tracks changes to your files: git add stages files, git commit saves a snapshot of the staged changes, and git push uploads your commits to a remote repository such as GitHub. Every data science project must be version-controlled from day 1.

def calculate_bmi(weight_kg, height_m):
    """Calculate BMI and return value with WHO classification."""
    bmi = weight_kg / (height_m ** 2)
    if bmi < 18.5:
        category = 'Underweight'
    elif bmi < 25.0:
        category = 'Normal weight'
    elif bmi < 30.0:
        category = 'Overweight'
    else:
        category = 'Obese'
    return {'bmi': round(bmi, 2), 'category': category}

result = calculate_bmi(70, 1.75)
print(result)  # {'bmi': 22.86, 'category': 'Normal weight'}
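
calculate_bmi takes only required parameters, so default parameters are sketched separately here. format_bmi is a hypothetical helper written for this illustration, and its parameter names and default values are assumptions.

```python
def format_bmi(bmi, decimals=1, unit='kg/m^2'):
    """Format a BMI value for display (hypothetical helper)."""
    # decimals and unit have defaults, so callers may leave them out
    return f'{bmi:.{decimals}f} {unit}'

print(format_bmi(22.857))              # 22.9 kg/m^2
print(format_bmi(22.857, decimals=2))  # 22.86 kg/m^2
print(format_bmi(22.857, unit='BMI'))  # 22.9 BMI
```

Passing decimals or unit by keyword makes the call readable and lets you override one default without touching the other.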

Week 1 Assignments

Submit completed notebooks to your GitHub repository before the next session. Feedback is provided within 48 hours.

Assignment 1: Build a BMI calculator that validates its inputs (weight 20-300 kg, height 0.5-2.5 m) and returns the BMI value and WHO classification. Test it with 5 different inputs, then push the notebook to a new GitHub repository.

Assignment 2: Write celsius_to_fahrenheit(temp) and fahrenheit_to_celsius(temp) functions, then print a conversion table from 0 °C to 100 °C in steps of 10 °C.
