Every P25 radio transmission carries a Radio ID (RID), a numeric identifier unique to the individual radio unit. If you can associate RIDs with agencies, you know not just that "someone is transmitting on a police talkgroup" but that "this specific radio belongs to the Frederick County Sheriff’s Office." That association is the basis for a proprietary data asset that compounds over time: the more traffic you observe, the more RIDs you can classify, and the more precise the platform’s identity resolution becomes.
The system has two components: a harvester and an analyzer. The harvester is a script that tails OP25’s control channel decoder output and records every (RID, TGID) pair it observes into a PostgreSQL table. Since talkgroup-to-agency mappings are already known from RadioReference, every observation is an implicit vote: a RID that keys up 500 times on Frederick County Sheriff’s Office (FCSO) talkgroups and twice on interop channels has a very high-confidence Sheriff’s Office association.
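The vote-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the harvester itself: the log line format, the `LINE_RE` pattern, the talkgroup IDs, and the sample RID are all hypothetical, and the tally lives in memory rather than in PostgreSQL.

```python
import re
from collections import Counter, defaultdict

# Hypothetical log line format; real OP25 stderr output may look different.
LINE_RE = re.compile(r"tgid=(?P<tgid>\d+)\s+rid=(?P<rid>\d+)")

# Hypothetical talkgroup-to-agency map (in practice sourced from RadioReference).
TGID_AGENCY = {1001: "FCSO", 1002: "FCSO", 9001: "INTEROP"}


def parse_line(line):
    """Return (rid, tgid) if the line carries both fields, else None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return int(match.group("rid")), int(match.group("tgid"))


def tally_votes(lines, tgid_agency=TGID_AGENCY):
    """Count one agency vote per observation of a RID on a known talkgroup."""
    votes = defaultdict(Counter)
    for line in lines:
        pair = parse_line(line)
        if pair is None:
            continue  # not a line we recognise
        rid, tgid = pair
        agency = tgid_agency.get(tgid)
        if agency is not None:
            votes[rid][agency] += 1
    return votes


# Synthetic input mirroring the 500-versus-2 example in the text.
sample = (
    ["voice update tgid=1001 rid=2401017"] * 500
    + ["voice update tgid=9001 rid=2401017"] * 2
    + ["unrelated decoder noise"]
)
votes = tally_votes(sample)
print(votes[2401017].most_common())  # [('FCSO', 500), ('INTEROP', 2)]
```

In a real deployment each parsed pair would be written to the database instead of an in-memory counter, so that the analyzer can run over the full history.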
The analyzer runs against the accumulated data and performs three tasks. First, it assigns each RID its primary agency by majority vote across all observed talkgroup associations, with a confidence score based on vote concentration. Second, it groups RIDs by numeric prefix and measures prefix purity: the fraction of RIDs sharing a prefix that belong to the same agency. Third, it tests different prefix lengths to find the length at which prefixes best separate one agency from another.
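The three analyzer tasks can be sketched as follows. The function names and the toy RIDs are hypothetical, and two details are assumptions rather than facts from the project: confidence is taken to be the winning agency's share of all votes, and the prefix-length comparison simply reports group counts and RID-weighted purity instead of picking a winner automatically.

```python
from collections import Counter, defaultdict


def primary_agency(agency_votes):
    """Majority vote; confidence is the winner's share of all votes (assumed)."""
    agency, top = agency_votes.most_common(1)[0]
    return agency, top / sum(agency_votes.values())


def prefix_purity(rid_agency, length):
    """Map each prefix to (dominant agency, purity, number of RIDs)."""
    groups = defaultdict(Counter)
    for rid, agency in rid_agency.items():
        groups[str(rid)[:length]][agency] += 1
    result = {}
    for prefix, counts in groups.items():
        agency, top = counts.most_common(1)[0]
        total = sum(counts.values())
        result[prefix] = (agency, top / total, total)
    return result


def sweep_lengths(rid_agency, lengths=(3, 4)):
    """Summarise each prefix length: group count, pure groups, weighted purity."""
    summary = {}
    for n in lengths:
        groups = prefix_purity(rid_agency, n)
        total = sum(size for _, _, size in groups.values())
        summary[n] = {
            "groups": len(groups),
            "pure": sum(1 for _, p, _ in groups.values() if p == 1.0),
            "weighted_purity": sum(p * size for _, p, size in groups.values()) / total,
        }
    return summary


# Toy data with hypothetical RIDs; agencies are the primary_agency() results.
rid_agency = {
    2401001: "FCSO", 2401002: "FCSO",
    2413001: "FPD", 2413002: "FPD",
    2407001: "EMS", 2408001: "Fire",
}
agency, confidence = primary_agency(Counter({"FCSO": 500, "INTEROP": 2}))
summary = sweep_lengths(rid_agency)
print(agency, round(confidence, 3))
print(summary)
```

On this toy data, three-digit prefixes give two groups of which only one is pure, while four-digit prefixes give four groups that are all pure. Because longer prefixes can only hold or raise purity, a real selection rule also has to weigh how many groups a given length produces.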
The initial backfill of 1,108 unique RIDs revealed that the four-digit prefix is the sweet spot. At three digits, nearly everything collapses into catch-all bins (240xxx, 241xxx) with 20% purity. At four digits, six pure prefix groups emerge out of twelve total: 2401 maps to FCSO (155 deputies), 2413 to FPD (102 officers), 2411 to DPW (96 units), 2407 to EMS, 2408 to Fire, and 2415 to Transit. The existing hand-seeded three-digit prefix table was partially wrong — the data showed that agency boundaries emerge at the fourth digit, not the third.
The prefix mappings feed directly into the engine’s RID resolver, which now resolves agency identity server-side from RID prefixes rather than relying on talkgroup matching alone. When a RID’s agency differs from the talkgroup’s agency (common on interop or mutual aid channels), the RID’s agency takes precedence in the display label. This is more reliable because a RID is normally assigned to one specific radio, while talkgroups are shared resources.
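The precedence rule can be sketched as a small lookup. The prefix table below uses the six pure four-digit groups reported from the backfill; the function name `resolve_label`, its signature, and the sample RIDs are hypothetical and do not reflect the engine's actual API.

```python
# Pure four-digit prefix groups reported from the initial backfill.
PREFIX_AGENCY = {
    "2401": "FCSO",
    "2413": "FPD",
    "2411": "DPW",
    "2407": "EMS",
    "2408": "Fire",
    "2415": "Transit",
}


def resolve_label(rid, tgid_agency=None, prefix_len=4):
    """Pick the display label: the RID prefix wins, else the talkgroup's agency."""
    rid_agency = PREFIX_AGENCY.get(str(rid)[:prefix_len])
    if rid_agency is not None:
        return rid_agency
    return tgid_agency  # may be None when neither source is known


# A Sheriff's Office radio keying up on a talkgroup mapped to another agency.
print(resolve_label(2401017, tgid_agency="FPD"))  # FCSO
# Unknown prefix: fall back to the talkgroup's agency.
print(resolve_label(2499001, tgid_agency="FPD"))  # FPD
```

Falling back to the talkgroup's agency keeps labels available for radios whose prefix has not yet been classified.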
The harvester runs continuously as a background process. Because it only tails OP25’s existing stderr output, it adds no SDR load. After a few days of passive collection, the analyzer can export updated prefix mappings with statistically validated confidence scores. The longer it runs, the more complete the RID-to-agency mapping becomes: a classic data flywheel, where the product improves simply by being used.