# Predictive Maintenance for Lab Robotics on AWS: IoT SiteWise, Greengrass, and the Sensor Pipeline

I laid out why pharma lab automation needs its own predictive-maintenance data layer [in the companion piece](pharma-lab-robotics-automation-maintenance) — the GxP audit-trail burden, the irrecoverable-sample problem, the trap of automating result capture before equipment telemetry. I won't re-run that argument here. This is the AWS architecture I'd actually build for it, and the one service in this stack that surprised me with how well-suited it is: AWS IoT SiteWise.

## Why is AWS IoT SiteWise the right fit for lab-equipment telemetry?

**AWS IoT SiteWise** is AWS's industrial-equipment data modeling service. It is purpose-built to represent physical assets (machines, in the original industrial use case) as structured hierarchies of properties and measurements, and it can compute derived metrics such as rolling averages, trends, and thresholds directly on the ingested data, so you don't need a separate analytics job for every simple aggregation. A generic time-series database will happily store a stream of timestamped numbers; SiteWise goes further by modeling the asset itself. A liquid-handling robot becomes an asset with defined properties (pipetting cycle count, calibration deviation, motor vibration RMS). Derived values like a 7-day rolling average of calibration drift or a cycle-count-since-last-service counter become first-class computed properties on that asset, queryable without you standing up a separate stream-processing job to compute them.

This maps onto lab equipment more directly than it might first appear. A liquid handler, a plate reader, a robotic transport arm — these are exactly the kind of discrete, property-bearing physical assets SiteWise's asset-model abstraction was built to represent, even though the service's marketing has historically leaned industrial/manufacturing. The asset hierarchy also composes naturally with how a lab is actually organized: an asset model for "liquid handler," instantiated once per physical unit across multiple lab sites, rolled up into a site-level or fleet-level view for whoever's managing maintenance across the whole R&D organization rather than one instrument at a time.

```mermaid
graph TD
    INST["Lab instrument<br/>liquid handler, plate reader, robotic arm"] --> GG["AWS IoT Greengrass<br/>edge protocol bridge"]
    GG -->|"normalized telemetry"| SW["AWS IoT SiteWise<br/>asset model + derived metrics"]
    SW --> SM["SageMaker<br/>anomaly detection / RUL model"]
    SM -->|"maintenance alert"| SCHED["Maintenance scheduling"]
    SW --> S3["S3 with Object Lock<br/>immutable audit trail"]
    S3 --> CT["CloudTrail"]
```

Greengrass solves the protocol problem at the edge; SiteWise gives you a clean asset-modeled place to land the data once that's solved; SageMaker is where the actual prediction happens; S3 Object Lock plus CloudTrail is the GxP-relevant immutable trail underneath all of it.

## Where does Greengrass fit, and is it really the same pattern as the ROS 2 bridge?

I wrote years ago about [using Greengrass to bridge ROS 2 topics to AWS IoT Core](ros2-greengrass-iot-core) for mobile robot fleets, and the analogy here is genuinely apt, not a stretch: the core problem is identical. A physical device produces telemetry in its own native format, that format doesn't speak cloud-native protocols, and you need something running on-premises to do the translation before the data can usefully leave the building. For lab instruments, the native format is frequently not ROS 2 and DDS — it's proprietary serial protocols, vendor-specific USB command sets, or in the better cases, SiLA 2 — but the shape of the solution is the same edge-bridge pattern: a Greengrass component running on-site, close to the instruments, that speaks whatever protocol the instrument actually speaks, normalizes the telemetry, and forwards it into AWS IoT Core or directly into SiteWise's ingestion API.
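The forwarding half of that bridge can be sketched as below, assuming the direct-to-SiteWise path. The entry shape follows my understanding of `BatchPutAssetPropertyValue`; the property alias format and the 10-entries-per-call limit are assumptions to verify against the current API reference and quotas.

```python
from itertools import islice
from typing import Iterable, Iterator

# Assumed per-call entry limit for BatchPutAssetPropertyValue.
MAX_ENTRIES_PER_CALL = 10


def to_entry(entry_id: str, alias: str, value: float, epoch_ms: int) -> dict:
    """Wrap one normalized reading as a SiteWise ingestion entry."""
    return {
        "entryId": entry_id,
        "propertyAlias": alias,  # hypothetical alias, e.g. "/lab-a/lh-01/vibration_rms"
        "propertyValues": [{
            "value": {"doubleValue": float(value)},
            "timestamp": {
                "timeInSeconds": epoch_ms // 1000,
                "offsetInNanos": (epoch_ms % 1000) * 1_000_000,
            },
            "quality": "GOOD",
        }],
    }


def batched(entries: Iterable[dict],
            size: int = MAX_ENTRIES_PER_CALL) -> Iterator[list]:
    """Yield entries in chunks small enough for a single API call."""
    it = iter(entries)
    while chunk := list(islice(it, size)):
        yield chunk


# Inside the Greengrass component, each chunk would be sent roughly as:
#   resp = sitewise.batch_put_asset_property_value(entries=chunk)
# and resp["errorEntries"] inspected for per-entry failures.
```

Keeping the payload construction separate from the network call makes the component easy to unit-test on a laptop, without an instrument or an AWS account attached.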

Where lab instrumentation differs from the ROS 2 fleet case is the diversity of protocols you're bridging. A ROS 2 robot fleet is at least internally consistent — every robot in the fleet speaks the same DDS/ROS 2 stack. A lab's instrument fleet is not: you might have a liquid handler with a SiLA 2 interface sitting next to a plate reader that only speaks a proprietary RS-232 protocol from a vendor SDK a decade old, next to a robotic arm with its own vendor-specific control API. Each of those needs its own Greengrass component doing protocol-specific translation, and that's real integration work per instrument type, not a single generic adapter.
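One way to keep that per-instrument work contained is a small adapter registry: one parser per instrument type, all emitting the same normalized record. This is a sketch, not a real driver. The serial frame format, the instrument type keys, and the assumption that a SiLA 2 client upstream has already decoded its response into JSON are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Reading:
    """Normalized telemetry record shared by every adapter."""
    instrument_id: str
    metric: str
    value: float


def parse_plate_reader_serial(instrument_id: str, raw: bytes) -> list[Reading]:
    """Hypothetical RS-232 frame such as b'TEMP=37.2;VIB=0.013\\r\\n'."""
    readings = []
    for field in raw.decode("ascii").strip().split(";"):
        if not field:
            continue
        key, _, val = field.partition("=")
        readings.append(Reading(instrument_id, key.lower(), float(val)))
    return readings


def parse_liquid_handler_json(instrument_id: str, raw: bytes) -> list[Reading]:
    """Assumes an upstream SiLA 2 client already produced a flat JSON object."""
    payload = json.loads(raw)
    return [Reading(instrument_id, k, float(v)) for k, v in payload.items()]


# One entry per instrument type; adding a new instrument means adding a parser.
ADAPTERS: dict[str, Callable[[str, bytes], list[Reading]]] = {
    "plate_reader_rs232": parse_plate_reader_serial,
    "liquid_handler_sila2": parse_liquid_handler_json,
}


def normalize(instrument_type: str, instrument_id: str, raw: bytes) -> list[Reading]:
    """Dispatch a raw frame to the adapter for its instrument type."""
    try:
        adapter = ADAPTERS[instrument_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {instrument_type!r}") from None
    return adapter(instrument_id, raw)
```

The registry does not reduce the integration work per instrument type, but it does keep everything downstream of `normalize` identical regardless of what the instrument speaks.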

## What does SageMaker actually do with this data?

**SageMaker** is where the predictive part of predictive maintenance happens: training and running the actual models on the sensor streams SiteWise has modeled and structured. Two model types cover most of what you need here. Anomaly detection models flag when a telemetry pattern (a vibration signature, a calibration-drift trend) deviates from the equipment's normal operating envelope, without you having to hand-define every threshold in advance. SageMaker's built-in Random Cut Forest algorithm is one option, and a custom model is a reasonable fallback if the built-ins don't capture your instrument's specific failure signature. Remaining-useful-life (RUL) models go further, estimating how much service life is left on a component given its current wear trajectory. That estimate is what actually lets you schedule maintenance around a predicted failure window instead of just reacting to an anomaly alert after the fact.
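To show what "deviates from the normal operating envelope" means mechanically, here is a deliberately simple rolling z-score detector. It is a stand-in for illustration, not a SageMaker built-in and not what I'd ship for a real failure signature; the window size and threshold are arbitrary assumptions.

```python
from collections import deque
from statistics import mean, stdev


def zscore_anomalies(values, window: int = 20, threshold: float = 3.0) -> list[int]:
    """Return indices whose value sits more than `threshold` standard
    deviations from the mean of the preceding `window` normal readings."""
    history = deque(maxlen=window)
    flagged = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                flagged.append(i)
                continue  # keep the outlier out of the baseline
        history.append(v)
    return flagged


# Example: a steady vibration signal with one spike at index 30.
signal = [1.0 + 0.01 * (i % 5) for i in range(30)] + [5.0] + [1.0, 1.01, 1.02]
spikes = zscore_anomalies(signal)
```

The point of the sketch is the shape of the problem: the baseline is learned from the instrument's own recent behavior rather than from a hand-set limit, which is the same property you want from a trained model.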

The practical integration path is to train on historical SiteWise data and then run inference on the ongoing stream, either in batch (daily or weekly scoring) or near-real-time, depending on how fast a given failure mode can progress from detectable to critical. Training assumes you have enough failure history to learn from, which early in a program you often don't. Cold-start is a real problem here, and leaning on manufacturer-published wear specifications as a prior is a reasonable stopgap until you've accumulated your own failure data.
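The cold-start stopgap can be sketched as a fallback rule: use the manufacturer's rated life until there are enough observations to fit a wear trend, then extrapolate that trend to a drift limit. This is a hypothetical heuristic with a plain least-squares line, not a trained RUL model; `drift_limit`, `rated_life_cycles`, and `min_points` are illustrative parameters.

```python
def remaining_cycles(cycles, drift, drift_limit: float,
                     rated_life_cycles: float, min_points: int = 5) -> float:
    """Estimate cycles left before calibration drift reaches drift_limit.

    cycles: cumulative cycle counts at each observation (ascending)
    drift:  calibration drift measured at those cycle counts
    """
    current = cycles[-1]
    # Prior from the manufacturer's published wear specification.
    spec_estimate = max(rated_life_cycles - current, 0.0)
    if len(cycles) < min_points:
        return spec_estimate  # cold start: not enough data for a trend

    # Ordinary least-squares fit of drift against cycle count.
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(drift) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, drift))
    if sxx == 0 or sxy <= 0:
        return spec_estimate  # no measurable upward wear trend yet

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    cycles_at_limit = (drift_limit - intercept) / slope
    return max(cycles_at_limit - current, 0.0)
```

A linear wear trend is a strong assumption and many components do not degrade linearly, so treat this as the placeholder it is until real failure history supports something better.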

| Layer | AWS service | Job |
| --- | --- | --- |
| Protocol bridge | AWS IoT Greengrass | Per-instrument-type translation from proprietary/serial/SiLA 2 to normalized telemetry |
| Asset modeling | AWS IoT SiteWise | Structured asset hierarchy, derived metrics (rolling averages, drift trends) |
| Prediction | SageMaker | Anomaly detection, remaining-useful-life modeling |
| Immutable audit trail | S3 Object Lock, CloudTrail | Tamper-evident record of automated decisions for GxP review |

## How do you satisfy the GxP audit-trail requirement here?

Any automated maintenance decision or anomaly flag that could affect a regulated experiment needs to be reconstructible after the fact: who or what triggered a maintenance hold, what the sensor readings were at the time, what model version made the call. The practical pattern is to land the raw telemetry and the model's inference output in **S3 with Object Lock** enabled, which enforces write-once-read-many retention on each object version so a record can't be quietly edited or deleted after the fact. The retention mode matters. In compliance mode, no user can overwrite or delete a locked version until its retention period expires, including the account root user. Governance mode is weaker, because a principal with the `s3:BypassGovernanceRetention` permission can override it. Pair that with **CloudTrail** logging the API calls that touched the data or the SiteWise asset model, and make sure S3 data events are enabled on the trail, since object-level calls are not logged by default. That gives you the two pieces a GxP audit actually asks for: an immutable record of what the data said, and a trail of who or what accessed or modified anything around it.

**SiteWise does not make the protocol problem disappear — it just gives you somewhere clean to land the data once you've solved it.** The asset-model abstraction genuinely fits industrial and lab equipment well, and it's a meaningfully better fit than a bare time-series database for this specific problem. But a lot of lab instruments still speak serial or vendor-proprietary protocols with no SiLA 2 driver in sight, and no amount of SiteWise's data-modeling elegance reduces the actual engineering hours it takes to write and maintain a Greengrass component that correctly speaks RS-232 to a decade-old plate reader. Budget for that integration work as its own line item — it's usually underestimated because the SiteWise half of the story looks so clean in a diagram.

## What to carry away

AWS IoT SiteWise is a genuinely good fit for lab-equipment telemetry because its asset-model abstraction — properties, derived metrics, hierarchies — matches how lab instruments and their maintenance data are actually organized, better than a generic time-series store would. Greengrass does the same edge-bridge job here that it does for ROS 2 robot fleets: translating a device's native protocol into something the cloud side can ingest, except the protocol diversity across a lab's instrument fleet makes that integration work more varied, not less. SageMaker turns the SiteWise-modeled streams into actual anomaly and remaining-useful-life predictions, and S3 Object Lock plus CloudTrail gives you the immutable, auditable trail GxP review actually requires. The trap to avoid: assuming SiteWise's clean data model means the hard part is solved. The hard part — proprietary and serial protocol integration, one instrument type at a time — is still there, and SiteWise just gives you somewhere good to land the output once you've done it.
