Rendering note: The worked examples (in repo EXAMPLES.md) use Mermaid diagrams and LaTeX math blocks. They render natively on GitHub and in VS Code with the appropriate extensions. If diagrams or equations are not rendering correctly, view the files on GitHub.

A municipal water utility operates unmanned pumping stations across the region. Many of these sites have unreliable or expensive connectivity. When a pump fails unexpectedly, the cost is high: emergency dispatches, water service disruptions, and potential equipment damage that could have been caught early.

The utility wants to deploy small AI-capable devices (NVIDIA Jetson) at each station to monitor pump health using sensor telemetry: vibration, motor temperature, and motor current. The system must detect early signs of problems locally and send only alerts and summaries to the cloud, not raw data streams.

Your dataset captures 30 minutes of pump operation at 50 Hz, including three realistic failure scenarios:

  • Cavitation bursts (360-540 s): vapor bubbles form and collapse in the pump impeller, causing intermittent vibration spikes. Left uncaught, cavitation erodes impeller blades over weeks.
  • Bearing wear ramp (840-1200 s): a gradually worsening bearing increases vibration and current draw. Early detection avoids a catastrophic bearing seizure.
  • Overheat step (1440-1620 s): motor temperature rises suddenly, perhaps from a blocked cooling vent or overload. This can trigger thermal shutdown within minutes.

In this assignment, you will build a normal-only anomaly detector for the PumpWatch system using a tiny autoencoder trained on healthy operation data. You will implement windowed feature extraction, train the model, choose a detection threshold, and produce a plot that shows when your detector flags each failure event.


Objective and Expected Learning Outcomes

By completing this assignment, you will be able to:

  • Extract windowed features (mean and standard deviation) from time-series sensor data.
  • Train a tiny autoencoder on normal-only data using PyTorch.
  • Use reconstruction error (MSE) as an anomaly score.
  • Select a detection threshold using percentiles of training scores.
  • Produce a clear anomaly detection plot with ground truth comparison.
  • Explain detection timing, false positives, and false negatives.

What You Are Given

Your repository includes:

  • data/pumpWatchTelemetry_30min_50Hz.csv – 30 minutes of pump telemetry at 50 Hz
  • data/pumpWatchTelemetry_manifest.json – ground truth anomaly intervals and dataset metadata
  • scripts/detectAnomalies.py – starter code with 4 TODOs
  • requirements.txt – Python dependencies
  • reflection.txt – reflection questions to answer after running

Rules

  • Follow course guidelines (on the course website) about working on the development branch.
  • Keep commits small and meaningful, and ensure your pull request description clearly summarizes the changes and intent.
  • Do not commit .venv or any virtual environment directory.

Setup

Create a virtual environment and install dependencies:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Dataset

The telemetry CSV contains 50 Hz samples over 30 minutes (~90,000 rows) with three sensor channels: vibration, motor temperature proxy, and motor current proxy.

The manifest defines three anomaly events:

| Event | Start (s) | End (s) | Pattern |
| --- | --- | --- | --- |
| Cavitation bursts | 360 | 540 | Intermittent (2 s on, 6 s off) |
| Bearing wear ramp | 840 | 1200 | Gradual drift |
| Overheat step | 1440 | 1620 | Sudden step |

The recommended training window is 0 to 330 seconds (normal-only, before first anomaly).


Step-by-Step Walkthrough

Part A: Feature Extraction (TODO 1)

Open scripts/detectAnomalies.py and implement windowFeatures().

Each window is a (windowSize, 3) array of sensor values. Compute:

  • Mean of each channel (3 values)
  • Standard deviation of each channel (3 values)

This produces a 6-dimensional feature vector per window.

meanVals = window.mean(axis=0)
stdVals = window.std(axis=0) + 1e-6
feats = np.concatenate([meanVals, stdVals], axis=0).astype(np.float32)
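To see how this function is applied across the whole recording, here is a self-contained sketch of a sliding-window loop over a synthetic (N, 3) array. The window size and stride below are illustrative assumptions, not the starter code's values.

```python
import numpy as np

def windowFeatures(window: np.ndarray) -> np.ndarray:
    """Mean and std of each channel, concatenated into a 6D feature vector."""
    meanVals = window.mean(axis=0)
    stdVals = window.std(axis=0) + 1e-6
    return np.concatenate([meanVals, stdVals], axis=0).astype(np.float32)

# Synthetic stand-in for the telemetry: 1000 samples x 3 channels.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3)).astype(np.float32)

windowSize, stride = 100, 50  # illustrative values only
feats = np.stack([
    windowFeatures(data[i:i + windowSize])
    for i in range(0, len(data) - windowSize + 1, stride)
])
print(feats.shape)  # one 6D feature row per window
```

Each row of `feats` is one window's feature vector, which is exactly the matrix the later normalization and training steps consume.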

Concept Diagram

flowchart LR
    A["Sliding Window<br/>(windowSize × 3)<br/>Raw Sensor Data"]
    B["vibration<br/>Mean + Std"]
    C["motorTempProxy<br/>Mean + Std"]
    D["motorCurrentProxy<br/>Mean + Std"]
    E["Feature Vector<br/>(6 values)<br/>[μ1, μ2, μ3, σ1, σ2, σ3]"]

    A --> B
    A --> C
    A --> D

    B --> E
    C --> E
    D --> E

Numerical Example

Assume windowSize = 4 and 3 sensor channels:

  1. vibration
  2. motorTempProxy
  3. motorCurrentProxy

So one window has the shape (4, 3).

Fake Sensor Window

| time step | vibration | motorTempProxy | motorCurrentProxy |
| --- | --- | --- | --- |
| 1 | 1.0 | 10.0 | 100.0 |
| 2 | 2.0 | 12.0 | 110.0 |
| 3 | 3.0 | 11.0 | 90.0 |
| 4 | 4.0 | 13.0 | 95.0 |

As a NumPy array:

window = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 12.0, 110.0],
    [3.0, 11.0,  90.0],
    [4.0, 13.0,  95.0],
], dtype=np.float32)

Step 1: Mean of Each Channel

Compute the mean across the time dimension (axis=0).

  • Vibration mean:
\[\mu_1 = (1 + 2 + 3 + 4) / 4 = 2.5\]
  • Temperature mean:
\[\mu_2 = (10 + 12 + 11 + 13) / 4 = 11.5\]
  • Current mean:
\[\mu_3 = (100 + 110 + 90 + 95) / 4 = 98.75\]
meanVals = [2.5, 11.5, 98.75]

Step 2: Standard Deviation of Each Channel

NumPy uses population standard deviation by default (ddof=0):

\[\sigma = \sqrt{\frac{1}{N} \sum (x_i - \mu)^2}\]
  • Vibration standard deviation:
\[\sigma_1 = \sqrt{1.25} \approx 1.1180\]
  • Temperature standard deviation:
\[\sigma_2 = \sqrt{1.25} \approx 1.1180\]
  • Current standard deviation:
\[\sigma_3 = \sqrt{54.6875} \approx 7.3951\]
stdVals = [1.1180, 1.1180, 7.3951]

Step 3: Final 6D Feature Vector

The final feature vector is the concatenation:

[ mean(vibration), mean(temp), mean(current),
  std(vibration),  std(temp),  std(current) ]

Numerically:

features =
[2.5, 11.5, 98.75, 1.1180, 1.1180, 7.3951]

Corresponding Code

meanVals = window.mean(axis=0)
stdVals = window.std(axis=0) + 1e-6
feats = np.concatenate([meanVals, stdVals], axis=0).astype(np.float32)
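You can check the worked numbers above directly in NumPy:

```python
import numpy as np

window = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 12.0, 110.0],
    [3.0, 11.0,  90.0],
    [4.0, 13.0,  95.0],
], dtype=np.float32)

meanVals = window.mean(axis=0)            # [2.5, 11.5, 98.75]
stdVals = window.std(axis=0) + 1e-6       # ~[1.1180, 1.1180, 7.3951]
feats = np.concatenate([meanVals, stdVals], axis=0)
print(np.round(feats, 4))
```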

Common Mistakes to Avoid

  • Using axis=1 instead of axis=0. axis=0 computes the mean/std across time steps (down each column). axis=1 would compute across the 3 channels per time step, which is not what you want.

  • Forgetting to concatenate mean and std into a single vector. The output must be a 1D array of length 6, not two separate arrays or a (2, 3) matrix.

  • Omitting + 1e-6 on the standard deviation. Without epsilon, a perfectly constant channel produces std = 0, which causes division-by-zero during normalization in TODO 2.


Part B: Training Data Selection and Normalization (TODO 2)

Select normal-only windows from the pre-anomaly period and normalize features using statistics computed from training windows only.

Training windows must satisfy both:

  • tSec <= trainEndSec
  • isAnomaly == 0

Then:

  1. Build trainFeat from those windows only
  2. Compute mu and sigma from trainFeat
  3. Normalize training windows and all windows using:
trainXn = (trainFeat - mu) / (sigma + 1e-6)
allXn   = (allFeat - mu) / (sigma + 1e-6)

Concept Diagram

flowchart LR
    A["All windows + features<br/>(tSec, isAnomaly, 6D feats)"]
    B["Filter: tSec <= trainEndSec"]
    C["Filter: isAnomaly == 0"]
    D["Training feature matrix<br/>(Ntrain × 6)"]
    E["Compute mu, sigma"]
    F["Normalize ALL windows<br/>(training stats only)"]

    A --> B --> C --> D --> E --> F

Tiny Example: Filtering Training Windows

Imagine your per-window DataFrame looks like this:

| windowId | tSec | isAnomaly |
| --- | --- | --- |
| 0 | 10 | 0 |
| 1 | 20 | 0 |
| 2 | 340 | 0 |
| 3 | 380 | 1 |

If trainEndSec = 330, then training rows must satisfy:

  • tSec <= 330 and isAnomaly == 0

So windows 0 and 1 are training windows. Windows 2 and 3 are excluded from training.
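In pandas, that filter is a single boolean mask. This sketch uses the column names from the example above:

```python
import pandas as pd

windows = pd.DataFrame({
    "windowId": [0, 1, 2, 3],
    "tSec": [10, 20, 340, 380],
    "isAnomaly": [0, 0, 0, 1],
})

trainEndSec = 330
# Both conditions must hold: pre-anomaly time AND labeled normal.
trainMask = (windows["tSec"] <= trainEndSec) & (windows["isAnomaly"] == 0)
trainWindows = windows[trainMask]
print(trainWindows["windowId"].tolist())  # [0, 1]
```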


Tiny Example: Normalization Numbers

Suppose training windows have 6D features:

trainFeat =
[ 2.0, 10.0, 100.0, 1.0, 0.5, 5.0 ]
[ 4.0, 14.0, 120.0, 3.0, 1.5, 7.0 ]

Compute per-feature mean and std:

  • mu = [3.0, 12.0, 110.0, 2.0, 1.0, 6.0]
  • sigma = [1.0, 2.0, 10.0, 1.0, 0.5, 1.0]

Normalize a sample window:

x = [4.0, 10.0, 100.0, 1.0, 1.5, 5.0]

xNorm = (x - mu) / sigma
      = [ 1, -1, -1, -1, 1, -1 ]
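The same arithmetic in NumPy:

```python
import numpy as np

trainFeat = np.array([
    [2.0, 10.0, 100.0, 1.0, 0.5, 5.0],
    [4.0, 14.0, 120.0, 3.0, 1.5, 7.0],
])

mu = trainFeat.mean(axis=0)      # [3, 12, 110, 2, 1, 6]
sigma = trainFeat.std(axis=0)    # [1, 2, 10, 1, 0.5, 1]

x = np.array([4.0, 10.0, 100.0, 1.0, 1.5, 5.0])
xNorm = (x - mu) / (sigma + 1e-6)
print(np.round(xNorm, 3))        # [ 1. -1. -1. -1.  1. -1.]
```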

Common Mistakes to Avoid

  • Do not compute normalization stats from the whole dataset. That leaks anomaly information into training.

  • Always add the epsilon when dividing: (sigma + 1e-6).
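The autoencoder itself is defined for you in the starter code. For intuition, a tiny 6 → 3 → 6 PyTorch model of the kind this assignment uses might look like the following sketch; the layer sizes and activation here are illustrative assumptions, not the starter code's exact architecture:

```python
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    """Compress the 6D feature vector to a small bottleneck, then reconstruct it."""
    def __init__(self, inDim: int = 6, hiddenDim: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(inDim, hiddenDim), nn.ReLU())
        self.decoder = nn.Linear(hiddenDim, inDim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Reconstruction error per window = mean squared error over the 6 features.
model = TinyAutoencoder()
x = torch.randn(5, 6)                      # 5 normalized feature windows
mse = ((model(x) - x) ** 2).mean(dim=1)    # one anomaly score per window
print(mse.shape)                           # torch.Size([5])
```

Because the model is trained only on normal windows, it reconstructs normal feature vectors well (low MSE) and anomalous ones poorly (high MSE), which is what makes MSE usable as an anomaly score in Part C.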

Part C: Threshold Selection via Training Percentile (TODO 3)

After training, compute reconstruction MSE for training windows. Use a high percentile of training scores as the anomaly threshold.

threshold = float(np.percentile(trainMse, cfg.percentile))
predicted = (allMse > threshold).astype(np.int32)

Concept Diagram

flowchart LR
    A["Training windows (normal only)"]
    B["Autoencoder reconstructs features"]
    C["Compute MSE per window<br/>(trainScores)"]
    D["threshold = percentile(trainScores, 99.5)"]
    E["Predict anomaly if score > threshold"]

    A --> B --> C --> D --> E

Numerical Example: Percentile Threshold

Suppose your training reconstruction errors are:

trainScores = [0.08, 0.10, 0.11, 0.09, 0.12, 0.10, 0.13, 0.09]

A 99.5th percentile threshold will be near the maximum value, meaning:

  • Most normal windows stay below the threshold
  • Only very unusual behavior gets flagged

In code:

threshold = float(np.percentile(trainMse, cfg.percentile))

Prediction rule:

predicted = (allMse > threshold).astype(np.int32)
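Applied to the example scores above:

```python
import numpy as np

trainScores = np.array([0.08, 0.10, 0.11, 0.09, 0.12, 0.10, 0.13, 0.09])

# A 99.5th-percentile threshold lands just below the training maximum of 0.13.
threshold = float(np.percentile(trainScores, 99.5))
print(threshold)

# Score three hypothetical windows against it: [0.10, 0.13, 0.80] -> [0, 1, 1]
allMse = np.array([0.10, 0.13, 0.80])
predicted = (allMse > threshold).astype(np.int32)
print(predicted)
```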

Why This Works

Because training is normal-only, the training score distribution represents:

“How weird normal can look.”

A high percentile says:

“Flag something only if it is worse than almost all normal windows.”

Common Mistakes to Avoid

  • Do not compute the threshold from all windows. It must come from training windows only.

  • Ensure you use the same MSE definition for training and full scoring.


Part D: Save Outputs and Generate Plot (TODO 4)

Save the per-window anomaly scores to CSV and generate the final plot.

The starter code builds outDf with these columns:

| Column | Description |
| --- | --- |
| tSec | Window center time |
| mse | Reconstruction MSE (anomaly score) |
| threshold | Detection threshold (same value for every row) |
| predictedAnomaly | 0 or 1 |
| isAnomaly | Ground truth label |

Save it with outDf.to_csv(cfg.outputCsv, index=False) and call makePlot(outDf, manifest, cfg.outputPlot).

Concept Diagram

flowchart LR
    A["All windows + normalized features"]
    B["Autoencoder reconstruction"]
    C["Score each window (MSE)"]
    D["predicted = (mse > threshold)"]
    E["Save logs/anomalyScores.csv"]
    F["makePlot() -> logs/anomalyDetection.png"]

    A --> B --> C --> D --> E
    D --> F

What Your Output CSV Should Contain


Example rows:

| tSec | mse | threshold | predictedAnomaly | isAnomaly |
| --- | --- | --- | --- | --- |
| 100.0 | 0.11 | 0.30 | 0 | 0 |
| 361.0 | 0.80 | 0.30 | 1 | 1 |
| 900.0 | 0.35 | 0.30 | 1 | 1 |

Tiny Example: Turning Scores into Predictions

If:

  • mse = 0.42
  • threshold = 0.30

Then:

  • predictedAnomaly = 1 because 0.42 > 0.30

That single rule is the entire anomaly detector.
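Putting the pieces together, here is a minimal sketch of building and saving the output table. The column names match the table documented above; the score values are made up for illustration:

```python
import numpy as np
import pandas as pd

tSec = np.array([100.0, 361.0, 900.0])
mse = np.array([0.11, 0.80, 0.35])
threshold = 0.30
isAnomaly = np.array([0, 1, 1])

outDf = pd.DataFrame({
    "tSec": tSec,
    "mse": mse,
    "threshold": threshold,  # scalar is broadcast to every row
    "predictedAnomaly": (mse > threshold).astype(np.int32),
    "isAnomaly": isAnomaly,
})
# outDf.to_csv("logs/anomalyScores.csv", index=False)  # requires logs/ to exist
print(outDf)
```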

Common Mistakes to Avoid

  • Forgetting to create the output directory (logs/)
  • Saving a CSV without tSec (making plots hard to interpret)
  • Generating the plot but not saving it to the required path
  • Never calling makePlot() in the script

Running

mkdir -p logs

python3 scripts/detectAnomalies.py \
  --csv data/pumpWatchTelemetry_30min_50Hz.csv \
  --manifest data/pumpWatchTelemetry_manifest.json \
  --outputCsv logs/anomalyScores.csv \
  --outputPlot logs/anomalyDetection.png

What Your Plot Must Show

Your anomaly detection plot must include:

  • Anomaly score (reconstruction MSE) over time
  • Threshold line
  • Clearly marked predicted anomaly regions
  • Ground truth anomaly intervals from the manifest

A viewer must be able to answer: what happened, when it started, and when your detector flagged it.


Submission

Verify Required Files

Before committing, verify that your script produced the required outputs:

ls -lh logs/
ls -lh scripts/detectAnomalies.py
ls -lh reflection.txt
ls -lh requirements.txt

Required files:

  • logs/anomalyScores.csv
  • logs/anomalyDetection.png
  • scripts/detectAnomalies.py
  • reflection.txt

On development Branch: Commit and Push

All work for this lab must be committed to your development branch, not directly on main.

  1. Verify you are on the development branch:

git branch

If needed:

git checkout development

  2. Stage your completed files:

git add scripts/detectAnomalies.py
git add reflection.txt
git add logs/anomalyScores.csv logs/anomalyDetection.png

  3. Commit your changes once the lab is complete:

git commit -m "Complete PumpWatch anomaly detection."

  4. Push your development branch to GitHub:

git push origin development

  5. Open a pull request from development to main on GitHub. This pull request is your official submission and must follow the course assignment guidelines.

Do not merge the pull request yourself unless explicitly instructed.


Evaluation Criteria

You receive full credit if:

  1. All 4 TODOs are correctly implemented.
  2. The script runs without errors and produces valid outputs.
  3. Your detection plot clearly shows scores, threshold, predictions, and ground truth.
  4. Reflection answers include specific numbers and timestamps.
  5. Your work is committed correctly, and .venv is not committed.

Optional: Try a Second Dataset

Once your detector works on the primary dataset, you can test it on a second telemetry file that contains different failure types. No code changes are needed — just swap the --csv and --manifest arguments:

python3 scripts/detectAnomalies.py \
  --csv data/pumpWatchTelemetry2_30min_50Hz.csv \
  --manifest data/pumpWatchTelemetry2_manifest.json \
  --outputCsv logs/anomalyScores2.csv \
  --outputPlot logs/anomalyDetection2.png

This dataset has the same format (30 min, 50 Hz, same sensor channels) but simulates three different pump problems:

| Event | Start (s) | End (s) | Pattern |
| --- | --- | --- | --- |
| Impeller imbalance | 360 | 540 | Harmonic vibration ramp-up then sustained |
| Seal degradation | 780 | 1020 | Gradual current drop + vibration/temp rise |
| Dry run | 1320 | 1500 | Sharp temp spike + erratic vibration + current drop |

Your detector retrains on the new dataset’s normal period and evaluates against its anomalies. Compare how well it detects these different failure modes versus the original three — do some patterns produce higher scores? Are any harder to catch?

Additional Resources