Incoming: Branches, Revisions, and Layers

Announcing: Branches and Revisions!

Pollen Sense has grown a lot over the last few years! As we approach our 100 millionth frame, the way we wrangle all of those frames and tags has had to evolve (stay tuned for exciting updates on that 100 millionth frame!). This project, codenamed LIMB, has been months in the making.

What This Means for You

To save you some reading: this means that we’ll be able to keep track of dataset improvements over time! Either by request or for our own continuous improvement, we re-run historical frames against new AI models to improve data quality. Branches and revisions now let us store and track those data quality improvements!

At launch, there is a single branch and revision: the PS (“Pollen Sense”) branch at revision 1. We are still moving some areas of our data pipeline onto Branches and Revisions, such as a staging area so that AI datasets can be published at the same time, and releasing changes to our geospatial emissions model (GEM) in batches. As a result, Revision 1 will contain some mid-revision changes to both datasets and GEM. Future revisions will not change mid-revision; any such change will trigger a new revision instead.

A few notes

V1 Sensor APIs are being deprecated because both the response format and the request parameters had to change. New data will no longer be available through them after March 11, 2026, and the APIs are scheduled to be removed no later than March 26, 2026.
We are still back-filling all of the nearly 100 million frames into the PS-1 branch/revision. In the interim, the old portal UIs and V1 Sensor APIs can be used to access that historical data. If we have not completed the back-fill by March 26, 2026, the V1 Sensor APIs will be left up.
The Metrics tab in the portal has received some significant upgrades! When viewing historical data, you no longer need to hunt for when a given site had provisioned sensors; the UI will guide you!

For Those That Love the Details

A little on how our AI-powered airborne particulate recognition system works: like humans, AI can struggle to tell similar-looking particulates apart. When a scientist is counting under a microscope, they use the context they’ve built up to know that a certain white blob is going to be a cedar particle because of the time of year, where they are located, etc. Using that same principle, we’ve built a system (codenamed GEM, for Geospatial Emissions Model) that knows which category datasets to include in a given AI model.

These GEM learnings evolve over time, as do the datasets of tags fed to the AI models. When the LIMB project is fully complete, we will be able to track these changes to GEM as well as datasets and much more.

Branches are named processing lines (for example, the core "PS" branch or customer-specific branches). Revisions are versioned releases within a branch, each tied to a concrete processing configuration and effective timestamp. Another concept introduced is Layers.

Layer 10 represents exactly what the AI vision models detected without any modifications.
Layer 20 “layers in” a comparison of each tag against GEM to filter out false positives at the beginning and end of seasons: the AI model must reach a certain confidence threshold for the tag to count. When the confidence falls below the threshold, the tag is instead counted as generic dust.
Layer 30 includes the layering done by Layer 20 and adds in any human-performed corrections.
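
For readers who think in code, here is a minimal Python sketch of how the three layers could stack. It is an illustration only, not our production pipeline: the `Tag` fields, the `"dust"` label, the per-category `thresholds` dictionary (a stand-in for what GEM would supply for a given place and time of year), and the `corrections` mapping are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List

GENERIC_DUST = "dust"  # hypothetical label for the generic dust category


@dataclass(frozen=True)
class Tag:
    tag_id: int        # hypothetical identifier for one detected particle
    category: str      # what the AI vision model called it
    confidence: float  # model confidence in [0, 1]


def layer_10(detections: List[Tag]) -> List[Tag]:
    """Layer 10: exactly what the vision model detected, unmodified."""
    return list(detections)


def layer_20(detections: List[Tag], thresholds: Dict[str, float]) -> List[Tag]:
    """Layer 20: tags below the required confidence count as generic dust.

    `thresholds` is an assumed stand-in for GEM: the confidence each
    category must reach to count. Unlisted categories need no minimum.
    """
    result = []
    for tag in layer_10(detections):
        required = thresholds.get(tag.category, 0.0)
        if tag.confidence < required:
            result.append(replace(tag, category=GENERIC_DUST))
        else:
            result.append(tag)
    return result


def layer_30(
    detections: List[Tag],
    thresholds: Dict[str, float],
    corrections: Dict[int, str],
) -> List[Tag]:
    """Layer 30: Layer 20 output plus human-performed corrections.

    `corrections` maps a tag_id to the category a human reviewer assigned.
    """
    return [
        replace(tag, category=corrections[tag.tag_id])
        if tag.tag_id in corrections
        else tag
        for tag in layer_20(detections, thresholds)
    ]
```

Because each layer is computed from the one beneath it, the raw Layer 10 detections are never lost; in this sketch you can always go back and see what the model originally said.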

This model provides direct benefits:

Reproducibility: You can request the same branch and revision later and get identical results.
Comparability: You can compare revisions to quantify the impact of model changes.
Safety: Updates do not overwrite existing datasets; they add a new revision instead.
Flexibility: Customer-specific branches can coexist with the core processing branch.
Clarity: Every dataset has a clear lineage and version identity.
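
To make the reproducibility, comparability, and safety points concrete, here is a small Python sketch of an append-only store keyed by branch and revision. It is a toy under stated assumptions, not a description of how our storage actually works: `RevisionStore`, `publish`, `get`, and `compare` are hypothetical names, and a "dataset" is reduced to a dictionary of per-category counts.

```python
from typing import Dict, Tuple

Counts = Dict[str, int]  # hypothetical dataset shape: category -> count


class RevisionStore:
    """Toy append-only store: publishing never overwrites a revision."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, int], Counts] = {}

    def publish(self, branch: str, counts: Counts) -> int:
        """Store counts as the next revision of `branch` and return its number."""
        latest = max((rev for b, rev in self._data if b == branch), default=0)
        revision = latest + 1
        self._data[(branch, revision)] = dict(counts)  # copy so callers cannot mutate it
        return revision

    def get(self, branch: str, revision: int) -> Counts:
        """Return a copy of the counts stored for one branch and revision."""
        return dict(self._data[(branch, revision)])


def compare(old: Counts, new: Counts) -> Counts:
    """Per-category change (new minus old), listing only categories that changed."""
    categories = sorted(set(old) | set(new))
    return {
        c: new.get(c, 0) - old.get(c, 0)
        for c in categories
        if new.get(c, 0) != old.get(c, 0)
    }
```

With a store like this, asking for the same branch and revision twice returns the same counts, a reprocessing run shows up as a new revision rather than a rewrite, and `compare(store.get("PS", 1), store.get("PS", 2))` would quantify what the newer processing changed.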

API deprecation and the path forward

The v1 Sensor Owner API endpoints are deprecated. This move is not just a change in routes; it is a shift to a more reliable and transparent data model. The v2 endpoints read from the branched metrics pipeline, which means revisions are tracked and stored, reports straight from vision tagging are available, and so much more! Oh, and we snuck in a couple of frequently requested API endpoints too 😊

Benefits of moving to v2 include:

Stable versioning: Branch and revision parameters let you lock in results.
Future-proof design: New processing improvements appear as new revisions, not silent rewrites.
Higher confidence: You always know which processing version produced the data.
A single source of truth: v2 is aligned with the portal and internal services.
