
Python AWS Weather Data Pipeline Guide

630 words · 3 mins
Project-Blog Cloud Data Pipeline AWS Python S3 Data Pipeline OpenWeather API automation
Vijay Kumar Singh
DevOps & Cloud Explorer skilled in CI/CD, cloud automation and monitoring. Experienced in building scalable solutions and streamlined workflows.

What I Built

I created a Python script that automates fetching weather data for 5 cities (London, New York, Amsterdam, Delhi, Oslo) from the OpenWeather API, processes it, saves it locally, and uploads it to an AWS S3 bucket (cloud storage). Think of it as a weather data factory:

  1. Fetch raw data from OpenWeather.
  2. Process it to keep only temperature, humidity, and weather conditions.
  3. Save locally as JSON files.
  4. Upload to the cloud (AWS S3) for safekeeping.

Architecture Diagram

(Diagram: OpenWeather API → Python script → local JSON files → AWS S3 bucket)

How I Approached It

I broke the problem into small, manageable tasks and tackled them one by one. Here’s my roadmap:

1. Authentication & Setup

  • Problem: API keys and AWS credentials are sensitive!
  • Solution: Use .env files to store secrets (never hardcode them!).
    import os
    from dotenv import load_dotenv

    # Load secrets from .env
    load_dotenv()
    api_key = os.getenv("API_KEY")
    bucket_name = os.getenv("S3_BUCKET_NAME")

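For reference, the .env file itself is just key-value pairs. The names below match the os.getenv calls above; the values are placeholders:

    API_KEY=your_openweather_api_key
    S3_BUCKET_NAME=your-unique-bucket-name
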
2. Fetch Data from OpenWeather

  • Problem: How to get live weather data?
  • Solution: Use Python’s requests library to call the API.
    import requests

    def fetch_weather_data(api_key, city):
        base_url = "https://api.openweathermap.org/data/2.5/weather"
        params = {"q": city, "appid": api_key}
        response = requests.get(base_url, params=params)
        return response.json()

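A quick sanity check might look like this sketch (it assumes api_key is loaded as in step 1; note that OpenWeather returns temperatures in Kelvin unless you add a units parameter):

    data = fetch_weather_data(api_key, "London")
    print(data["main"]["temp"])  # Kelvin by default; add "units": "metric" to params for Celsius
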
3. Process the Data

  • Problem: The API returns 50+ fields—I only need 4!
  • Solution: Extract relevant data using a dictionary.
    def extract_relevant_data(data):
        return {
            "name": data.get("name"),
            "description": data["weather"][0]["description"],
            "temp": data["main"]["temp"],
            "humidity": data["main"]["humidity"]
        }

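To make the before/after concrete, here is one call end to end (the output values are purely illustrative):

    raw = fetch_weather_data(api_key, "London")
    print(extract_relevant_data(raw))
    # e.g. {'name': 'London', 'description': 'light rain', 'temp': 285.3, 'humidity': 81}
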
4. Save Locally

  • Problem: Organize files by city name.
  • Solution: Create a weather_data folder and save JSON files.
    import json
    import os

    def save_to_local(data, city):
        directory = "weather_data"
        os.makedirs(directory, exist_ok=True)  # Create folder if missing
        file_path = os.path.join(directory, f"{city}_weather.json")
        with open(file_path, "w") as file:
            json.dump(data, file, indent=4)

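After a full run over the five cities, the local folder ends up looking something like this:

    weather_data/
    ├── Amsterdam_weather.json
    ├── Delhi_weather.json
    ├── London_weather.json
    ├── New York_weather.json
    └── Oslo_weather.json
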
5. Upload to AWS S3

  • Problem: Ensure the S3 bucket exists; handle errors.
  • Solution: Check for the bucket, create it if missing, then upload.
    from botocore.exceptions import ClientError

    def bucket_exists(client, bucket_name):
        try:
            client.head_bucket(Bucket=bucket_name)
            return True
        except ClientError as e:
            print(f"Error: {e}")
            return False

    def upload_to_s3(client, bucket_name, file_path, s3_key):
        client.upload_file(file_path, bucket_name, s3_key)

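The main() recipe below also calls a create_bucket helper that isn't shown above. A minimal sketch of it could look like this (the region parameter is my assumption; outside us-east-1, boto3 requires a LocationConstraint):

    def create_bucket(client, bucket_name, region="us-east-1"):
        # us-east-1 is the default region and must NOT receive a LocationConstraint
        if region == "us-east-1":
            client.create_bucket(Bucket=bucket_name)
        else:
            client.create_bucket(
                Bucket=bucket_name,
                CreateBucketConfiguration={"LocationConstraint": region},
            )
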
Why I Chose Procedural Programming (Not OOP)

I structured the code as a series of functions (procedural style) instead of using classes (object-oriented programming). Here’s why:

1. Simplicity

  • The script is linear: Fetch → Process → Save → Upload.
  • Example:
    import boto3

    def main():
        # Step 1: Connect to AWS
        client = boto3.client('s3')
        # Step 2: Check bucket
        if not bucket_exists(client, bucket_name):
            create_bucket(client, bucket_name)
        # Step 3: Process cities
        for city in cities:
            data = fetch_weather_data(...)
            save_to_local(...)
            upload_to_s3(...)

    This reads like a recipe—easy for beginners to follow!

2. Scope

  • The script does one thing: move data from Point A (API) to Point B (S3).
  • No need for complex class hierarchies.

3. Faster Prototyping

  • Functions let me build and test individual parts quickly.

When Would I Use OOP?

If the project grew (e.g., adding a dashboard, user input, or multiple data sources), I’d switch to OOP. Example:

class WeatherPipeline:
    def __init__(self, api_key, bucket_name):
        self.api_key = api_key
        self.bucket_name = bucket_name

    def fetch_data(self, city):
        ...  # fetch logic here

    def upload_to_cloud(self, file_path):
        ...  # upload logic here
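
Usage would then read like this sketch, mirroring the procedural steps above:

pipeline = WeatherPipeline(api_key, bucket_name)
data = pipeline.fetch_data("London")
pipeline.upload_to_cloud("weather_data/London_weather.json")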

Key Challenges & Solutions

  1. Error Handling
    • What if the API is down?
    • Fix: Used try/except blocks to catch failures.
      try:
          response = requests.get(...)
          response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
      except requests.exceptions.RequestException as e:
          print(f"API Error: {e}")
  2. AWS Permissions
    • Fix: Configured IAM roles in AWS to grant S3 access (a minimal policy sketch follows this list).
  3. Data Clutter
    • Fix: Used extract_relevant_data() to keep only what’s needed.
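
For the IAM fix, a minimal policy covering exactly what this script calls (head_bucket, create_bucket, upload_file) might look like the sketch below; the bucket name is a placeholder:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:CreateBucket"],
                "Resource": "arn:aws:s3:::your-unique-bucket-name"
            },
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::your-unique-bucket-name/*"
            }
        ]
    }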

What's Next

  • Schedule the script to run daily (e.g., with AWS Lambda).
  • Add a dashboard to visualize weather trends.
  • Expand cities or integrate more APIs (e.g., weather forecasts).

Lessons for Beginners

  1. Start small. Break projects into tiny tasks.
  2. Secure secrets. Never commit API keys to GitHub!
  3. Embrace functions. They keep code organized and reusable.

Happy coding! 🌟 Whether you’re automating weather data or building the next Netflix, remember: every big project starts with a single line of code.


The full project code and setup script are available on GitHub: Weather-Dashboard
