Using AI to generate synthetic data
Day 9 is all about AI, and my thought here was to come up with some prompts to help me generate synthetic weather data and then process that through the Mapbox Tiling Service to generate a rasterarray tileset. For my choice of AI partner in this challenge, I went with Claude 3.5 Sonnet. I also wanted to stay on the Colorado theme, so here's the prompt that I gave it.
Can you write a script that would generate a netCDF file including the u and v wind components of a synthetic hurricane over the mountains of Colorado? Can you make sure it includes at least 30 timestamps.
The script it generated only needed two Python dependencies, netCDF4 and numpy. I went ahead and created a Poetry project, installed those two dependencies, and then ran the script that Claude gave me. Amazingly, it generated a proper netCDF file with wind vectors over the entire area of Colorado! I processed the generated file through the Mapbox Tiling Service, but the data didn't look quite how I had imagined it, so I prompted Claude to make the data higher resolution, increase the number of timestamps, and move the hurricane so it was more central over the state of Colorado. This is the final script it generated, and it runs without any modifications on my part.
import numpy as np
from netCDF4 import Dataset
from datetime import datetime
import math
# Set up grid parameters
lat_min, lat_max = 37.0, 41.0 # Colorado latitude range
lon_min, lon_max = -109.0, -102.0 # Colorado longitude range
grid_spacing = 0.025 # degrees
num_timestamps = 72 # 72 timestamps (the time variable below stores them as hourly offsets)
# Create coordinate arrays
lats = np.arange(lat_min, lat_max, grid_spacing)
lons = np.arange(lon_min, lon_max, grid_spacing)
times = np.arange(num_timestamps)
# Create output netCDF file
nc_file = Dataset(
    "colorado_hurricane_ultra_high_res_tighter_fewer_timestamps_adjusted.nc",
    "w",
    format="NETCDF4",
)
# Create dimensions
nc_file.createDimension("time", num_timestamps)
nc_file.createDimension("latitude", len(lats))
nc_file.createDimension("longitude", len(lons))
# Create variables
times_var = nc_file.createVariable("time", "f8", ("time",))
lats_var = nc_file.createVariable("latitude", "f4", ("latitude",))
lons_var = nc_file.createVariable("longitude", "f4", ("longitude",))
u_wind = nc_file.createVariable(
    "u_wind",
    "f4",
    (
        "time",
        "latitude",
        "longitude",
    ),
)
v_wind = nc_file.createVariable(
    "v_wind",
    "f4",
    (
        "time",
        "latitude",
        "longitude",
    ),
)
# Add attributes
nc_file.description = "Ultra high-resolution synthetic hurricane wind field over Colorado (tighter structure, fewer timestamps, adjusted track)"
nc_file.history = f'Created {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}'
times_var.units = "hours since 2024-07-01 00:00:00"
times_var.calendar = "gregorian"
lats_var.units = "degrees_north"
lons_var.units = "degrees_east"
u_wind.units = "meters_per_second"
v_wind.units = "meters_per_second"
# Write coordinate data
lats_var[:] = lats
lons_var[:] = lons
times_var[:] = times
# Function to create hurricane wind field
def generate_hurricane_winds(center_lat, center_lon, max_wind_speed, radius_max_winds):
    wind_field_u = np.zeros((len(lats), len(lons)))
    wind_field_v = np.zeros((len(lats), len(lons)))
    for i, lat in enumerate(lats):
        for j, lon in enumerate(lons):
            # Calculate distance from hurricane center
            dx = (lon - center_lon) * 111000 * np.cos(np.radians(lat)) # meters
            dy = (lat - center_lat) * 111000 # meters
            r = np.sqrt(dx**2 + dy**2)
            # Tighter Rankine vortex model
            if r < radius_max_winds:
                v_tangential = max_wind_speed * (r / radius_max_winds) ** 2
            else:
                v_tangential = max_wind_speed * (radius_max_winds / r)
            # Convert tangential wind to u and v components
            if r > 0:
                wind_field_u[i, j] = -v_tangential * dy / r
                wind_field_v[i, j] = v_tangential * dx / r
    return wind_field_u, wind_field_v
# Generate hurricane track (moving northeast across Colorado, starting more southerly)
start_lat = 36.5
start_lon = -107.0
track_angle = 50 # degrees (more northeasterly)
movement_speed = 0.075 # degrees per timestep (faster initial movement)
# Generate wind fields for each timestamp
for t in range(num_timestamps):
    # Update hurricane center position
    center_lat = start_lat + t * movement_speed * np.sin(np.radians(track_angle))
    center_lon = start_lon + t * movement_speed * np.cos(np.radians(track_angle))
    # Vary hurricane intensity over time
    max_wind_speed = 50 + 20 * np.sin(2 * np.pi * t / num_timestamps) # m/s
    radius_max_winds = 30000 + 5000 * np.sin(2 * np.pi * t / num_timestamps) # meters
    # Generate wind field
    u_field, v_field = generate_hurricane_winds(
        center_lat, center_lon, max_wind_speed, radius_max_winds
    )
    # Add terrain-induced modifications (simplified)
    terrain_effect = np.exp(
        -((lats[:, np.newaxis] - 39.5) ** 2) / 0.5
    ) # Approximate effect of mountains
    u_field *= terrain_effect
    v_field *= terrain_effect
    # Write to netCDF file
    u_wind[t, :, :] = u_field
    v_wind[t, :, :] = v_field
# Close the netCDF file
nc_file.close()
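One thing worth noting about the script: generate_hurricane_winds visits every grid cell in a Python loop, which adds up at this resolution (roughly 160 x 280 cells for each of the 72 timestamps). As a rough sketch only, and not part of what Claude generated, the same math could be expressed with NumPy array operations. The function name hurricane_winds_vectorized and the choice to pass lats and lons in as arguments are assumptions made for illustration.

```python
import numpy as np


def hurricane_winds_vectorized(lats, lons, center_lat, center_lon,
                               max_wind_speed, radius_max_winds):
    """Hypothetical vectorized variant of generate_hurricane_winds (sketch)."""
    # 2-D coordinate grids with shape (len(lats), len(lons))
    lat_grid, lon_grid = np.meshgrid(lats, lons, indexing="ij")

    # Distance from the storm center in meters, using the same
    # 111000 meters-per-degree approximation as the original script
    dx = (lon_grid - center_lon) * 111000 * np.cos(np.radians(lat_grid))
    dy = (lat_grid - center_lat) * 111000
    r = np.sqrt(dx**2 + dy**2)

    # Stand-in radius so the divisions below never hit zero at the center
    safe_r = np.where(r > 0, r, 1.0)

    # Same piecewise profile as the script: quadratic ramp inside the
    # radius of maximum winds, 1/r decay outside it
    v_tangential = np.where(
        r < radius_max_winds,
        max_wind_speed * (r / radius_max_winds) ** 2,
        max_wind_speed * (radius_max_winds / safe_r),
    )

    # Counterclockwise rotation; winds stay zero exactly at the center
    u = np.where(r > 0, -v_tangential * dy / safe_r, 0.0)
    v = np.where(r > 0, v_tangential * dx / safe_r, 0.0)
    return u, v


# Example call on the same grid the script uses (center values are arbitrary)
lats = np.arange(37.0, 41.0, 0.025)
lons = np.arange(-109.0, -102.0, 0.025)
u, v = hurricane_winds_vectorized(lats, lons, 39.0, -105.5, 50.0, 30000.0)
```

The returned arrays have the same (latitude, longitude) shape as the looped version, so they could be written into u_wind[t, :, :] and v_wind[t, :, :] the same way.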
With the netCDF file generated, these were the next steps I needed to follow to get the tileset to render like you see above.
- Upload the file as a source into the Mapbox Tiling Service.
- Create a recipe (seen below) that would target the u and v wind components in the file.
- Create a tileset using that recipe and publish it to the Mapbox Tiling Service.
With the tileset generated, it was as simple as referencing it as a source in my map and adding a layer to render it. Here's the recipe I used:
{
  "version": 1,
  "type": "rasterarray",
  "sources": [
    {
      "uri": "mapbox://tileset-source/{YOUR_ACCOUNT_NAME}/{SOURCE_NAME}"
    }
  ],
  "layers": {
    "winds": {
      "tilesize": 256,
      "resampling": "nearest",
      "buffer": 1,
      "source_rules": {
        "filter": [
          ["==", ["get", "NETCDF_VARNAME"], "u_wind"],
          ["==", ["get", "NETCDF_VARNAME"], "v_wind"]
        ],
        "name": ["to-number", ["get", "NETCDF_DIM_time"]],
        "order": "asc"
      }
    }
  }
}
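If you'd rather not hand-edit the placeholders in the recipe, a few lines of Python can fill them in. This is only a sketch: build_wind_recipe is a made-up helper name, and the account and source names passed to it are placeholder values, not real ones. The recipe contents are copied from the JSON above.

```python
import json


def build_wind_recipe(account_name, source_name):
    """Return the rasterarray recipe from this post as a dict.

    Hypothetical helper: the name and signature are for illustration only.
    """
    return {
        "version": 1,
        "type": "rasterarray",
        "sources": [
            {"uri": f"mapbox://tileset-source/{account_name}/{source_name}"}
        ],
        "layers": {
            "winds": {
                "tilesize": 256,
                "resampling": "nearest",
                "buffer": 1,
                "source_rules": {
                    "filter": [
                        ["==", ["get", "NETCDF_VARNAME"], "u_wind"],
                        ["==", ["get", "NETCDF_VARNAME"], "v_wind"],
                    ],
                    "name": ["to-number", ["get", "NETCDF_DIM_time"]],
                    "order": "asc",
                },
            }
        },
    }


# Placeholder values; swap in your own account and tileset source names
recipe = build_wind_recipe("my-account", "colorado-hurricane")
recipe_json = json.dumps(recipe, indent=2)
print(recipe_json)
```

The printed JSON can be saved to a file and used as the recipe when creating the tileset.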
Getting a tileset like this working is explained in more detail within the Visualize wind data guide from Mapbox.
Throwing in the towel on the 30 day challenge
At the start of this challenge I was feeling quite ambitious and stoked, but we're now two weeks into November and I'm slipping on the daily posts. It's been hard to keep up, especially with the newborn. Although I'm ending it early, I'm still proud of all the work I've done in this challenge, and I definitely learned a lot. I might still do a day or two in the remainder of this month if I find the time. Or not, we'll see. In the meantime, happy mapping!