Guide: Optimizing the Step Function

Related issue: #74

Context

Each call to household.step() iterates over the full store list several times. With thousands of households and hundreds of stores, these repeated passes add up. The goal is to consolidate them into fewer passes without changing results.

Before You Start

Steps

1. Audit the current store iterations

In a single call to step(), a household iterates over all stores in each of the following:

  1. calculate_distances() - builds distances_map (all stores)
  2. get_closest_spm() - iterates all stores, filters supermarkets
  3. get_closest_cspm() (called inside get_mfai()) - iterates all stores, filters non-supermarkets
  4. stores_with_1_miles() - iterates all stores via distances_map

That’s at least 4 full passes over the store list per household per step.
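One way to confirm the count empirically is to wrap the store list in an object that records each time iteration over it starts. The sketch below is illustrative only: CountingStores and example_step() are hypothetical names, and the (name, kind, position) store tuples are an assumed shape, not the project's real data model.

```python
class CountingStores:
    """List wrapper that counts how many times iteration over it starts."""

    def __init__(self, stores):
        self._stores = list(stores)
        self.passes = 0

    def __iter__(self):
        self.passes += 1
        return iter(self._stores)


def example_step(stores):
    """Hypothetical stand-in for step(): loops over the stores four times."""
    distances = {name: abs(pos) for name, kind, pos in stores}              # pass 1
    spms = [name for name, kind, pos in stores if kind == "supermarket"]    # pass 2
    others = [name for name, kind, pos in stores if kind != "supermarket"]  # pass 3
    nearby = [name for name, kind, pos in stores if distances[name] <= 1]   # pass 4
    return distances, spms, others, nearby


stores = CountingStores([("A", "supermarket", 0.5), ("B", "convenience", 3.0)])
example_step(stores)
print(stores.passes)  # 4
```

The wrapper only sees loops over the store list itself. Loops over derived collections such as distances_map are not counted and need to be checked by reading the code.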

2. Write a spec BEFORE coding

Post this in the issue comments:

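“I propose consolidating store iteration into a single pass that computes:

  1. distances_map (the distance to every store)
  2. the closest supermarket
  3. the closest non-supermarket store
  4. the stores within 1 mile

This reduces the work from four passes over the store list to one per household. Both are O(S), where S = number of stores, but the constant factor drops from about 4 to 1. The get_closest_spm(), get_closest_cspm(), and stores_with_1_miles() functions either get merged into the single pass or rewritten to read from pre-computed values.”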

3. Write tests for CURRENT behavior first

Before changing anything, capture the current behavior in tests. These tests must still pass, unmodified, after your optimization.
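A minimal sketch of what such characterization tests might look like follows. The Household class here is a hypothetical stand-in so the example runs on its own: only the method names come from this guide, while the constructor, the (name, kind, x, y) store tuples, the return types, and the inclusive 1-mile cutoff are all assumptions. In the real repository the tests would import the actual class and pin whatever it returns today.

```python
import math


class Household:
    """Hypothetical stand-in; only the method names come from the guide."""

    def __init__(self, x, y, stores):
        self.x, self.y = x, y
        self.stores = stores  # assumed shape: (name, kind, x, y), units in miles
        self.distances_map = {}

    def calculate_distances(self):
        self.distances_map = {
            name: math.hypot(sx - self.x, sy - self.y)
            for name, _kind, sx, sy in self.stores
        }

    def get_closest_spm(self):
        spms = [s[0] for s in self.stores if s[1] == "supermarket"]
        return min(spms, key=self.distances_map.get)

    def get_closest_cspm(self):
        others = [s[0] for s in self.stores if s[1] != "supermarket"]
        return min(others, key=self.distances_map.get)

    def stores_with_1_miles(self):
        return sorted(n for n, d in self.distances_map.items() if d <= 1.0)


def make_household():
    stores = [
        ("A", "supermarket", 3.0, 4.0),   # 5.0 miles away
        ("B", "supermarket", 0.0, 0.5),   # 0.5 miles away
        ("C", "convenience", 0.25, 0.0),  # 0.25 miles away
        ("D", "convenience", 6.0, 8.0),   # 10.0 miles away
    ]
    household = Household(0.0, 0.0, stores)
    household.calculate_distances()
    return household


def test_distances_map_covers_every_store():
    assert set(make_household().distances_map) == {"A", "B", "C", "D"}


def test_closest_supermarket():
    assert make_household().get_closest_spm() == "B"


def test_closest_non_supermarket():
    assert make_household().get_closest_cspm() == "C"


def test_stores_within_one_mile():
    assert make_household().stores_with_1_miles() == ["B", "C"]
```

The fixture keeps every store well away from the 1-mile boundary so the tests do not depend on how the real code treats a distance of exactly 1.0.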

4. Implement the optimization

Write a single method (e.g., _compute_store_metrics()) that iterates over the stores once and computes all needed values. Then update step() to call it.
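The single pass could be sketched roughly as below. This is an illustration under assumptions, not the project's implementation: it is written as a free function rather than a method, and the StoreMetrics container, the (name, kind, x, y) store tuples, the "supermarket" kind label, and the inclusive radius check are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoreMetrics:
    """Hypothetical container for everything the single pass produces."""
    distances_map: Dict[str, float] = field(default_factory=dict)
    closest_spm: Optional[str] = None
    closest_cspm: Optional[str] = None
    within_1_mile: List[str] = field(default_factory=list)


def compute_store_metrics(hx, hy, stores, radius=1.0):
    """Visit each store once and update every metric as we go."""
    metrics = StoreMetrics()
    best_spm = best_cspm = math.inf
    for name, kind, sx, sy in stores:
        d = math.hypot(sx - hx, sy - hy)
        metrics.distances_map[name] = d
        if kind == "supermarket":
            if d < best_spm:  # strict '<' keeps the first store on ties
                best_spm, metrics.closest_spm = d, name
        elif d < best_cspm:
            best_cspm, metrics.closest_cspm = d, name
        if d <= radius:
            metrics.within_1_mile.append(name)
    return metrics


stores = [
    ("A", "supermarket", 3.0, 4.0),
    ("B", "supermarket", 0.0, 0.5),
    ("C", "convenience", 0.25, 0.0),
    ("D", "convenience", 6.0, 8.0),
]
metrics = compute_store_metrics(0.0, 0.0, stores)
print(metrics.closest_spm, metrics.closest_cspm, metrics.within_1_mile)
```

Tie-breaking is the detail most likely to change results silently. If the existing functions pick the first store at the minimum distance, the merged loop has to do the same, which is why the comparisons above are strict.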

5. Verify identical results

Rerun the tests you wrote in step 3. They should pass without modification and produce the same results as before.
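Beyond the fixed tests, a randomized side-by-side comparison of the old and new code paths can catch edge cases such as ties or an empty store list. The harness below is a generic sketch: compare_implementations(), the two closest-store functions, and the (name, x, y) store tuples are hypothetical stand-ins for the real pre- and post-optimization code.

```python
import math
import random


def compare_implementations(old_fn, new_fn, cases):
    """Return (case, old_result, new_result) for every input where the two differ."""
    mismatches = []
    for case in cases:
        old_result, new_result = old_fn(*case), new_fn(*case)
        if old_result != new_result:
            mismatches.append((case, old_result, new_result))
    return mismatches


def closest_two_pass(hx, hy, stores):
    """Stand-in for the old path: build every distance, then search them."""
    distances = {name: math.hypot(sx - hx, sy - hy) for name, sx, sy in stores}
    return min(distances, key=distances.get, default=None)


def closest_one_pass(hx, hy, stores):
    """Stand-in for the new path: track the best store while looping once."""
    best_name, best_dist = None, math.inf
    for name, sx, sy in stores:
        d = math.hypot(sx - hx, sy - hy)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name


def random_cases(count, seed=0):
    """Seeded inputs so any mismatch can be reproduced."""
    rng = random.Random(seed)
    cases = []
    for _ in range(count):
        stores = [(f"s{i}", rng.uniform(-5, 5), rng.uniform(-5, 5))
                  for i in range(rng.randint(0, 20))]
        cases.append((rng.uniform(-5, 5), rng.uniform(-5, 5), stores))
    return cases


mismatches = compare_implementations(closest_two_pass, closest_one_pass, random_cases(200))
print(len(mismatches))  # 0
```

Exact equality is the right check here only because both versions compute each distance with the same expression. If the optimization reorders floating-point operations, small rounding differences may appear and would need to be investigated rather than masked with a tolerance.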

LLM Usage

Definition of Done

Stretch Goals