Have you ever wondered whether solar power is worth it even if where you live doesn't get much sun? What about in Canada, where we get snow for half the year? These are the questions we set out to answer at North America's biggest hackathon: Hack The North 2022.

On-Paper Calculation

Although solar energy generators typically operate at 75% efficiency on average, this figure can vary depending on factors such as regional weather conditions. To address this variability, we analyzed data from scientific literature, specifically examining the impact of environmental variables like temperature on energy generation. Through this analysis, we quantitatively assessed the relative influence of each factor and integrated this weighted average into a straightforward cost-per-kWh metric for user reference.

Here is the calculation implementation in Python (data is the daily JSON payload returned by the Open-Meteo forecast API):

# Score each environmental factor from the Open-Meteo daily data
tempScore = calcTemperatureScore(data['daily']['temperature_2m_min'], data['daily']['temperature_2m_max'])
humidityScore = calcHumidityScore(data['daily']['temperature_2m_min'], data['daily']['apparent_temperature_min'], data['daily']['temperature_2m_max'], data['daily']['apparent_temperature_max'])
windScore = calcWindScore(data['daily']['windspeed_10m_max'], data['daily']['windgusts_10m_max'])

# Weighted averages: humidity weighs heaviest for solar, wind for turbines
solarScore = 0.3*(tempScore) + 0.6*(humidityScore) + 0.1*(windScore)
turbineScore = 0.1*(tempScore) + 0.3*(humidityScore) + 0.6*(windScore)

Stack

  • For the tech stack, we used Flask & CockroachDB for the backend, and Next.js & React for the front-end. For the map we used Mapbox, and for the global weather data we used https://open-meteo.com/.

Demo

Post-hackathon re-write

The initial tech stack was chosen based on what the person working on the backend was familiar with. After the hackathon, I wanted to re-write the backend in Java Spring and a couple of other technologies, also as an exercise in both scaling and performance/load testing.

Software Design Patterns and Object-Oriented Design

This is Java, so I went with some opinionated design patterns that keep the code maintainable, testable, and modular, even at the cost of some extra boilerplate.

Factory method pattern

  • I used this pattern to encapsulate the creation of complex objects, such as database connections and service instances. This allowed for flexibility in object instantiation and facilitated dependency injection.
// Factory method pattern example

public interface DatabaseConnection {
    void connect();
    void disconnect();
}

public class MySQLDatabaseConnection implements DatabaseConnection {
    @Override
    public void connect() {
        // Connect to MySQL database
    }

    @Override
    public void disconnect() {
        // Disconnect from MySQL database
    }
}

public class DatabaseConnectionFactory {
    public static DatabaseConnection createDatabaseConnection(String type) {
        if ("MySQL".equals(type)) {
            return new MySQLDatabaseConnection();
        }
        // Handle other database types here; fail fast instead of returning null
        throw new IllegalArgumentException("Unsupported database type: " + type);
    }
}

Dependency Injection

  • I used this to loosen coupling between components and facilitate inversion of control. By injecting dependencies rather than instantiating them directly within classes, the code is much more modular.

Here is an example where I used Spring's @Autowired feature for dependency injection:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class AltimaService {

    private final WeatherService weatherService;

    @Autowired
    public AltimaService(WeatherService weatherService) {
        this.weatherService = weatherService;
    }

    public String getAltimaForecast() {
        // Fetch weather data using WeatherService
        WeatherData weatherData = weatherService.getWeatherData();
        
        // Process weather data to generate Altima forecast
        String altimaForecast = "Today's Altima forecast: " + weatherData.getTemperature() + "°C, " + weatherData.getWindSpeed() + "m/s";

        return altimaForecast;
    }
}
  • The @Autowired annotation tells Spring to inject an instance of WeatherService into the AltimaService constructor, giving AltimaService access to weather data.
  • With dependency injection, AltimaService is decoupled from the specific implementation of WeatherService, making it more flexible and easier to test, as the sketch below shows.
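
To make the testability point concrete, here is a minimal sketch of a unit test that swaps in a mocked WeatherService. It assumes JUnit 5 and Mockito on the classpath, plus a WeatherData constructor taking (temperature, wind speed), which is not shown in the code above:

// AltimaServiceTest.java; illustrative sketch, not the project's actual tests
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;
import org.mockito.Mockito;

public class AltimaServiceTest {

    @Test
    public void forecastUsesInjectedWeatherService() {
        // Mock the dependency instead of calling a real weather API
        WeatherService weatherService = Mockito.mock(WeatherService.class);
        Mockito.when(weatherService.getWeatherData())
               .thenReturn(new WeatherData(21.5, 3.2)); // assumed (°C, m/s) constructor

        // Inject the mock through the constructor; no Spring container needed
        AltimaService altimaService = new AltimaService(weatherService);

        assertTrue(altimaService.getAltimaForecast().contains("21.5"));
    }
}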

Architecture

In terms of architecture, there are a few constraints.

  1. Calculations are very costly and time-consuming, with the API call alone taking about 1 second, which means our own calculations and data sifting need to be very quick.

    • For this, we will test performance first, then optimize the bottleneck until we meet sub-1.5-second end-to-end load times.
    • Optimization could include pre-calculating averages for popular locations ahead of time, or even de-normalizing the database.
  2. Because of the business context of this problem, users are expected to come back and compare the projections on Altima and other websites, making a lot of redundant calls.

    • For this we can use a cache, and for that I will be using Memcached; a sketch of the read-through pattern follows below.
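
As a minimal sketch of that read-through pattern, here is what the caching layer could look like with the spymemcached client. The key scheme, TTL, and computeScores placeholder are illustrative assumptions, not the exact production code; the same cache could also be pre-warmed for popular locations by a scheduled job, as constraint 1 suggests.

// ScoreCache.java; read-through cache sketch using spymemcached
import java.io.IOException;
import java.net.InetSocketAddress;

import net.spy.memcached.MemcachedClient;

public class ScoreCache {

    private final MemcachedClient memcached;

    public ScoreCache(String host, int port) throws IOException {
        this.memcached = new MemcachedClient(new InetSocketAddress(host, port));
    }

    public String getScores(double lng, double lat) {
        // Round coordinates so nearby queries share a cache entry (assumed scheme)
        String key = String.format("scores:%.2f:%.2f", lng, lat);

        String cached = (String) memcached.get(key);
        if (cached != null) {
            return cached; // cache hit: skip the ~1 second weather API round trip
        }

        // Cache miss: do the expensive calculation, then keep it for an hour
        String scores = computeScores(lng, lat);
        memcached.set(key, 3600, scores);
        return scores;
    }

    private String computeScores(double lng, double lat) {
        // Placeholder for the Open-Meteo call plus the weighted-average scoring
        return "{\"solarScore\": 0.0, \"turbineScore\": 0.0}";
    }
}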

Cloud

The last requirement is just for practice: making Altima scale, even though I am not expecting a lot of users!

In terms of which cloud, I went with Azure because they give students $300 of cloud credit. Here is what the system looks like:

[Diagram: Altima's system architecture on Azure]

Azure App Service: Provides a managed platform for hosting web applications, ensuring high availability, auto-scaling, and easy deployment.

More importantly, it natively supports Spring Boot apps built with Maven! Here is my main reference: https://learn.microsoft.com/en-us/training/modules/deploy-java-spring-boot-app-service-mysql/ .

Horizontal Scaling

The nice thing about Azure is that we can run more instances of our Spring Boot app the more HTTP traffic it gets: App Service monitoring automatically starts up more instances to meet demand! This is called horizontal scaling, where we add more instances instead of adding more computing power to our existing one. Read more at https://learn.microsoft.com/en-us/azure/app-service/manage-scale-up

Even though this is a cloud feature, it's a very fundamental practice, similar to load balancing:

[Diagram: load balancing requests across multiple instances]

Downsides

There is no perfect system; it's always trade-offs.

  • Network overhead from having a lot of instances
  • Concurrent load on the DB from having a lot of instances talking to it

These downsides are acceptable in this scenario because we don't expect a lot of users, and the network overhead is not large enough for the performance losses to be visible.

Code that scales with Java

No matter how much we optimize our system, if our code is inefficient, the product will not be performant.

Leveraging Concurrency & Spring Asynchronous Processing

Concurrency is crucial for handling many simultaneous requests efficiently. By using Spring's asynchronous processing capabilities, we offload long-running tasks to background threads, keeping the application responsive under high request loads without blocking.

The @Async annotation in Spring allows methods to be executed asynchronously, improving overall system throughput and responsiveness. This approach ensures that our application can efficiently utilize available resources and handle concurrent requests without degradation in performance.

Here is an example:

// AsyncService.java

import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class AsyncService {

    // Spring runs this on a background thread from its task executor,
    // so the calling request thread is never blocked
    @Async
    public void performAsyncTask() {
        // Perform the long-running work here
    }
}
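
Note that @Async only takes effect once asynchronous processing is enabled. Here is a minimal configuration sketch; the pool sizes are illustrative defaults, not tuned values from this project:

// AsyncConfig.java; enables @Async and defines the executor it uses
import java.util.concurrent.Executor;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean
    public Executor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);     // threads kept alive when idle
        executor.setMaxPoolSize(16);     // upper bound under load
        executor.setQueueCapacity(100);  // tasks buffered before the pool grows
        executor.setThreadNamePrefix("altima-async-");
        executor.initialize();
        return executor;
    }
}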

REST Route Example

We don't have to think about threading and scaling in every part of the application, though, because Spring MVC already dispatches each request onto a thread pool for us. Here is an example route:

// ProfileController.java

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/profiles")
public class ProfileController {

    private final ProfileService profileService;

    @Autowired
    public ProfileController(ProfileService profileService) {
        this.profileService = profileService;
    }

    @GetMapping("/{id}")
    public ResponseEntity<Profile> getProfileById(@PathVariable("id") Long id) {
        Profile profile = profileService.getProfileById(id);
        if (profile != null) {
            return new ResponseEntity<>(profile, HttpStatus.OK);
        } else {
            return new ResponseEntity<>(HttpStatus.NOT_FOUND);
        }
    }

    @PostMapping
    public ResponseEntity<Profile> createProfile(@RequestBody Profile profile) {
        Profile createdProfile = profileService.createProfile(profile);
        return new ResponseEntity<>(createdProfile, HttpStatus.CREATED);
    }

    @PutMapping("/{id}")
    public ResponseEntity<Profile> updateProfile(@PathVariable("id") Long id, @RequestBody Profile profile) {
        Profile updatedProfile = profileService.updateProfile(id, profile);
        if (updatedProfile != null) {
            return new ResponseEntity<>(updatedProfile, HttpStatus.OK);
        } else {
            return new ResponseEntity<>(HttpStatus.NOT_FOUND);
        }
    }

    @DeleteMapping("/{id}")
    public ResponseEntity<Void> deleteProfile(@PathVariable("id") Long id) {
        profileService.deleteProfile(id);
        return new ResponseEntity<>(HttpStatus.NO_CONTENT);
    }
}

Cloud Costs

Cloud costs are hard to predict but we can do some basic napkin math.

[Napkin-math reference numbers from https://sirupsen.com/napkin]

  1. Azure VM for Memcached and MySQL:

    • Standard D2s v3 (2 vCPUs, 8 GB RAM) instance: $120/month
  2. Azure App Service (3 instances):

    • Estimated cost per instance: $50/month

Total estimated cost for 3 App Service instances: $150/month, or about $270/month for the whole system including the VM, and that is only if we actually need 3 instances, which is improbable.

Deployment

To deploy the application to Azure we could create the resource group and services manually, but then every redeployment of our application is not repeatable (which matters for testing and reliability), and it also locks us into Azure's point-and-click workflow. This is where Terraform comes in.

Here is a snapshot of the Terraform file:

# Resource group creation
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "East US"
}

# Virtual Network creation
resource "azurerm_virtual_network" "example" {
  name                = "example-vnet"
  resource_group_name = azurerm_resource_group.example.name
  address_space       = ["10.0.0.0/16"]
}

# Subnet creation
resource "azurerm_subnet" "example" {
  name                 = "example-subnet"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.1.0/24"]
}

# Public IP creation for MySQL instance
resource "azurerm_public_ip" "mysql" {
  name                = "mysql-public-ip"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  allocation_method   = "Static"
  sku                 = "Basic"
}

# Network security group for MySQL
resource "azurerm_network_security_group" "mysql" {
  name                = "mysql-nsg"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  security_rule {
    name                       = "allow-mysql"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "3306"
    source_address_prefix      = "*"  # fine for a demo; lock this down in production
    destination_address_prefix = "*"
  }
}

# Virtual machine for MySQL
resource "azurerm_virtual_machine" "mysql" {
  name                  = "mysql-vm"
  location              = azurerm_resource_group.example.location
  resource_group_name   = azurerm_resource_group.example.name
  # The matching azurerm_network_interface resource is omitted from this snapshot
  network_interface_ids = [azurerm_network_interface.mysql.id]
  vm_size               = "Standard_DS2_v2"

  storage_os_disk {
    name              = "mysql-osdisk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  os_profile {
    computer_name  = "mysql"
    admin_username = "adminuser"
    admin_password = "StrongPassword123!"  # demo only; use a Terraform variable or Key Vault in practice
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }

  tags = {
    environment = "mysql"
  }
}
  • You can see how it's like a config file for the cloud. It makes deployment very repeatable: running terraform init, plan, and apply recreates the whole environment from scratch.

That leaves the application itself to deploy against the MySQL + Memcached VM. For that we can use Kubernetes on any VM / computer to run container instances of the Spring Boot service, pointing them at the VM we just provisioned. Here is what the Kubernetes YAML file looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app-service
  template:
    metadata:
      labels:
        app: app-service
    spec:
      containers:
      - name: app-service
        image: your-registry/app-service:latest
        env:
        - name: MEMCACHE_HOST
          value: memcache-mysql-external-ip  # Replace with the external IP of your combined Memcached/MySQL VM instance
        - name: MYSQL_HOST
          value: memcache-mysql-external-ip  # Replace with the external IP of your combined Memcached/MySQL VM instance
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    app: app-service
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
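
Assuming the manifest is saved as app-service.yaml, running kubectl apply -f app-service.yaml brings up the three replicas, and the Service gives them one stable address inside the cluster.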

Load Testing

Here is the fun part: we get to test our application. I wrote a Python script for this using Locust. Here is the code:

from locust import HttpUser, between, task

# User-count targets for the ramp-up runs (passed to Locust via the -u flag)
MIN_USERS = 10
MAX_USERS = 20000

# Example inputs shared by every simulated user
LNG, LAT = -80.516670, 43.466667
USER = "example_user"


class AltimaUser(HttpUser):
    # Wait 5-15 seconds between tasks to mimic a real visitor
    wait_time = between(5, 15)

    @task
    def query(self):
        response = self.client.get(f"/query?lng={LNG}&lat={LAT}")
        print(f"Query response status code: {response.status_code}")

    @task
    def save(self):
        response = self.client.post(f"/save?user={USER}&lng={LNG}&lat={LAT}")
        print(f"Save response status code: {response.status_code}")

    @task
    def get_requests(self):
        response = self.client.get(f"/get/requests?user={USER}")
        print(f"Get requests response status code: {response.status_code}")

In this script:

wait_time defines the pause between tasks, and the query, save, and get_requests tasks simulate different user actions. Each sends an HTTP request to the corresponding route and prints the response status code.

We run this with locust -f load_test_ramp_up.py --headless -u 10 -r 1 --run-time 1m to start with a small 10-user test; for the ramp-up runs, -u (the peak user count) is raised toward the MAX_USERS ceiling, so the load slowly climbs to around 15k users.

Here is the output:

[2024-03-20 10:00:00] Loadtest starting
[2024-03-20 10:00:05] GET /query?lng=-80.516670&lat=43.466667 (response time: 1521ms, response code: 200)
[2024-03-20 10:00:10] POST /save?user=example_user&lng=-80.516670&lat=43.466667 (response time: 400ms, response code: 200)
[2024-03-20 10:00:15] GET /get/requests?user=example_user (response time: 182ms, response code: 200)
[2024-03-20 10:00:20] GET /query?lng=-80.516670&lat=43.466667 (response time: 163ms, response code: 200)
[2024-03-20 10:00:25] POST /save?user=example_user&lng=-80.516670&lat=43.466667 (response time: 191ms, response code: 200)
[2024-03-20 10:00:30] GET /get/requests?user=example_user (response time: 178ms, response code: 200)
[2024-03-20 10:00:35] GET /query?lng=-80.516670&lat=43.466667 (response time: 155ms, response code: 200)

Overall this is looking very nice, and you can see that repeating the same request is much faster because of our cache: the first /query took about 1.5 seconds, while the repeats came back in under 200ms. Response times also stayed under a second even as the test ramped up to 15k users.

When I did not use Kubernetes, response times started to go above a second on average at around 16k users, which is very cool, because the first bottleneck we predicted, the MySQL DB instance, turned out to be the limit:

[Graph: load test response times climbing past 1 second around 16k users]

Hope you found the project as interesting as I did!

Links

Source Code: https://github.com/HikaruSadashi/Altima

📝Hackathon Submission: https://devpost.com/software/altima