Inside BidMatrix: DeepThink & ApexAd


Oct 1, 2025

BidMatrix’s stack centers on two flagships: DeepThink (training & serving) and ApexAd (intelligent delivery). Both are powered by our data foundation—unified profiles and scenario labels that organize online, offline, and selected third-party signals for learning and activation. At RTB scale, the system ingests heterogeneous signals, learns incrementally to track interest migration, exports models via ONNX, and serves low-latency predictions using ONNX Runtime. A staged model evolution—LR → FM → lightweight DNN—delivered +8pp AUC (offline) and translated to ~+30% revenue uplift (production).

Personalization at the Edge of Latency

RTB is an unforgiving place for ML: distributions drift hourly, latency budgets punish complexity, and sparse, high-cardinality features dominate. Most stacks compromise—either simplify the model to meet p99 or over-engineer features to prop up constrained learners.

BidMatrix refuses that trade-off. DeepThink hydrates sparse signals into learnable structure at scale, and ApexAd executes compact neural predictors inside RTB latency windows. The glue between them is our data foundation (profiles & labels) that keeps personalization grounded in real business scenarios.

1) DeepThink: Training & Inference at RTB Scale

What it is. DeepThink is our self-built ML fabric for ingest, feature construction, distributed training, model export, and real-time serving.

Ingest & features. We unify multi-source signals into sparse/dense tensors:

User history: impressions, dwell, clicks, conversions, recency/frequency
Ad/creative: format, style, brand, vertical, campaign lineage
Context: device/OS, network, geo, time-of-day/weekday, marketplace dynamics

Feature stores are optimized for high-cardinality hashing, frequency capping, and temporal bucketing, preserving long-tail signal without exploding memory.
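
To make the hashing and temporal-bucketing idea concrete, here is a minimal sketch in Python. It is illustrative only: the bucket count, the namespaces, the event fields, and the function names (hash_feature, time_bucket, featurize) are assumptions for this example, not DeepThink's actual feature-store interface.

```python
import hashlib
from datetime import datetime, timezone

NUM_BUCKETS = 2 ** 20  # assumed hash-space size; a real system would tune this per namespace


def hash_feature(namespace: str, value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a (namespace, value) pair to a stable index in [0, num_buckets)."""
    key = f"{namespace}={value}".encode("utf-8")
    # blake2b is stable across processes, unlike Python's built-in hash().
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets


def time_bucket(ts: datetime) -> str:
    """Coarsen a timestamp into a weekday / hour-of-day bucket."""
    return f"{ts.weekday()}_{ts.hour:02d}"


def featurize(event: dict) -> list[int]:
    """Turn a raw event into a sorted list of active sparse feature indices."""
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    pairs = [
        ("device", event["device"]),
        ("geo", event["geo"]),
        ("creative", event["creative_id"]),
        ("time", time_bucket(ts)),
    ]
    # A set drops duplicate indices if two pairs happen to collide.
    return sorted({hash_feature(ns, v) for ns, v in pairs})


# Hypothetical event, for illustration only.
example = {"ts": 1_700_000_000, "device": "android", "geo": "SG", "creative_id": "c42"}
indices = featurize(example)
```

Memory is bounded by NUM_BUCKETS no matter how many distinct raw values appear, which is the property that lets long-tail values keep a slot without unbounded growth. The cost is occasional collisions between unrelated values.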

Training modes.

Single-machine for ablations and candidate architectures
Distributed (parameter server + sharded datasets) scaling to PB-class samples and 10⁸–10¹⁰ effective feature magnitudes
Incremental learning (daily or faster) to chase concept drift
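
The incremental mode can be pictured as warm-started training: each new batch continues from the current weights instead of retraining from scratch. The sketch below uses plain SGD on a sparse logistic regression. The class name IncrementalLR, the learning rate, the L2 strength, and the toy data are all assumptions for illustration; the article does not say which optimizer DeepThink uses.

```python
import math
from collections import defaultdict


def sigmoid(z: float) -> float:
    # Clamp the logit to avoid overflow in exp for extreme values.
    z = max(-35.0, min(35.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class IncrementalLR:
    """Sparse logistic regression that keeps training on new batches."""

    def __init__(self, lr: float = 0.1, l2: float = 1e-6):
        self.lr = lr
        self.l2 = l2
        self.w = defaultdict(float)  # weights keyed by (hashed) feature index
        self.bias = 0.0

    def predict(self, indices: list[int]) -> float:
        return sigmoid(self.bias + sum(self.w.get(i, 0.0) for i in indices))

    def update(self, batch: list[tuple[list[int], int]]) -> None:
        """One SGD pass over a fresh batch, starting from the current weights."""
        for indices, label in batch:
            grad = self.predict(indices) - label  # d(log-loss) / d(logit)
            self.bias -= self.lr * grad
            for i in indices:
                self.w[i] -= self.lr * (grad + self.l2 * self.w[i])


# Toy data (hypothetical): on day 1 feature 3 never converts, on day 2 it does.
model = IncrementalLR()
day1 = [([1, 2], 1), ([3], 0)] * 50
model.update(day1)
p_before = model.predict([3])

day2 = [([3], 1), ([1, 2], 1)] * 50
model.update(day2)
p_after = model.predict([3])  # moves up as the model tracks the shift
```

Because update never resets the weights, a faster refresh cadence means only that smaller batches arrive more often.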

Modeling. Scenario-tunable learners: LR baselines; FM for interaction recovery; lightweight DNN for non-linear structure under strict latency. The production ensemble is compact to protect p99.

Serving. Models export to ONNX and run on ONNX Runtime across an autoscaled cluster with cache-aware placement and request coalescing—yielding low-latency CVR/CTR predictions at high QPS.

Data Foundation for Training

DeepThink consumes a unified data foundation: online, offline, and select third-party signals are stitched into user profiles with business-scenario labels (attributes, behaviors, interests, geo). This structure improves feature quality, stabilizes sparse namespaces, and accelerates convergence for LR/FM/lightweight-DNN training.

2) ApexAd: Feature System, Model Evolution & Delivery

Tri-modal feature system.

User history (views • clicks • purchases)
Real-time context (device • OS • network • geo • time)
Ad features (creative type • stylistic attributes • vertical cues)

Model evolution. LR → FM → lightweight DNN.

LR provides calibrated baselines and cheap inference
FM recovers key pairwise interactions in sparse domains
A compact DNN captures non-linear manifolds without violating RTB latency
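
As a rough illustration of what the FM stage adds over LR, the sketch below scores a set of active binary features with a second-order factorization machine. It uses the standard identity that computes all pairwise interactions in time linear in the number of active features. The parameter values, feature names, and latent dimension are hypothetical and are not taken from a BidMatrix model.

```python
import math
from itertools import combinations


def fm_logit(active, w0, w, v):
    """FM logit for binary features: bias + linear terms + pairwise interactions.

    Uses the O(k * n) identity
      sum_{i<j} <v_i, v_j> = 0.5 * sum_f [(sum_i v_if)^2 - sum_i v_if^2].
    """
    linear = sum(w[i] for i in active)
    k = len(next(iter(v.values())))  # latent dimension
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] for i in active)
        s_sq = sum(v[i][f] ** 2 for i in active)
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_logit_naive(active, w0, w, v):
    """Reference O(k * n^2) version, used only to check the identity."""
    linear = sum(w[i] for i in active)
    pairwise = sum(
        sum(a * b for a, b in zip(v[i], v[j])) for i, j in combinations(active, 2)
    )
    return w0 + linear + pairwise


# Toy parameters (hypothetical values, k = 2 latent factors).
w0 = -2.0
w = {"user:u1": 0.3, "ad:a9": 0.1, "geo:SG": -0.2}
v = {"user:u1": [0.5, -0.1], "ad:a9": [0.4, 0.2], "geo:SG": [-0.3, 0.6]}
active = ["user:u1", "ad:a9", "geo:SG"]

logit = fm_logit(active, w0, w, v)
p_conv = 1.0 / (1.0 + math.exp(-logit))  # predicted conversion probability
```

Setting every latent vector to zero reduces the score to the LR baseline, so the two stages share the same linear part and differ only in the interaction term.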

Incremental updates. Parameters refresh daily (or faster), tracking interest migration and traffic regime shifts.

Cold-start strategy. Transfer learning over proximate cohorts and industry-level embeddings provide non-zero priors for new users/campaigns; performance then converges rapidly as live data accrues.
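
The "non-zero prior that fades as live data accrues" behavior can be illustrated with a much simpler stand-in than transfer learning: shrinking an observed conversion rate toward a cohort-level prior. This is not the embedding-based method described above. The function name, the prior strength, and the numbers are assumptions chosen only to show the convergence pattern.

```python
def smoothed_cvr(conversions: int, clicks: int, prior_cvr: float,
                 prior_strength: float = 50.0) -> float:
    """Blend an observed CVR with a cohort prior.

    prior_strength acts like a count of pseudo-clicks observed at prior_cvr,
    so the prior dominates at zero traffic and fades as real clicks accrue.
    """
    return (conversions + prior_strength * prior_cvr) / (clicks + prior_strength)


cohort_prior = 0.02  # hypothetical CVR of a proximate cohort

brand_new = smoothed_cvr(0, 0, cohort_prior)           # no data: equals the prior
early = smoothed_cvr(3, 40, cohort_prior)              # sparse data: pulled toward the prior
mature = smoothed_cvr(500, 10_000, cohort_prior)       # ample data: close to the observed 5%
```

A new campaign therefore starts with a usable estimate instead of 0/0, and the estimate moves toward the observed rate at a speed set by prior_strength.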

Data Foundation for Delivery

ApexAd activates the same profiles & labels at serve time—powering precision audience building, lookalikes, and real-time personalization. The outcome is earlier pre-locking of high-value users and steadier allocation under volatility.

3) Performance Evidence

Offline: architecture shift to lightweight DNN delivered +8pp AUC over strong LR/FM baselines
Production: earlier locking on high-value users mapped to ~+30% revenue uplift at stable spend
Latency discipline: vectorized ops, cache locality, and ONNX Runtime kernels hold p99 within budget

4) Release History

2024-04 — DeepThink Alpha (single-machine LR)
2024-06 — DeepThink General Release (distributed training + serving; +8pp AUC vs. baseline)
2024-07 — S1: First LR conversion model online
2024-08 — ApexAd intelligent delivery platform
2024-09 — S3 deep conversion model (CVR +15% or more)
2024-10 — S7 deep model (loss-enhanced) (CR/CVR +15%; batch-size normalization; 1-day event latency)
2024-11 — Data Foundation GA (profiles & labels) — supports DeepThink & ApexAd
2025-01 — User-side features integrated into S1/S3/S7 (personalized ad delivery)
2025-01 — S1 upgrade: LR → FM (AUC +9.0; online conversion +30%)
2025-02 — Inference performance optimization (lean structures; 5–6× speedup)
2025-03 — S3/S7 add in-house ad-label features (AUC +1.5; most dims +30%↑; 3 dims CR ×2)
2025-07 — S1 upgrade: FM → Deep Neural Network (AUC +4; deep models platform-wide)
2025-08 — Serving performance tuning (fix I/O bottleneck; 10× faster inference)

Why It Works: A Design That Respects Reality

Statistical leverage at scale. PB-class samples and billion-feature sparsity surface signals the market hasn’t priced in.
Tight serving loop. Predictions must land before bidder timeouts; ONNX-based serving is engineered to that ceiling.
Cadenced adaptation. Incremental updates prevent long-horizon bias as interests migrate.
Scenario-driven data foundation. Profiles & labels are built to feed the learner and drive activation, not to decorate dashboards.

System Deep-Dive (for Practitioners)

Feature hashing & collisions: controlled regimes keep memory bounded while preserving separability for high-value namespaces
Temporal features: recency/seasonality encodings capture periodicity beyond linear reach
Regularization & calibration: L2 + dropout for the DNN head; post-hoc calibration ensures monotone bidding curves
A/B discipline: multi-cell splits isolate model vs. allocation effects; we track incremental revenue per thousand impressions (iRPM), not vanity CTR
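
One common way to get a monotone post-hoc calibration map is isotonic regression, fitted with the pool-adjacent-violators algorithm. The article does not say which calibrator is used, so the sketch below is an assumption about the technique, with hypothetical function names and toy data. Tied scores are not given special treatment here.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step map from score to rate."""
    pairs = sorted(zip(scores, labels))
    # Each block holds [sum_of_labels, count, max_score_in_block].
    blocks = []
    for s, y in pairs:
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values


def calibrate(score, thresholds, values):
    """Look up the first block whose upper score bound covers `score`."""
    idx = min(bisect_left(thresholds, score), len(values) - 1)
    return values[idx]


# Toy held-out data (hypothetical): raw model scores and observed outcomes.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
outcomes = [0, 1, 0, 0, 1, 1]

thresholds, values = fit_isotonic(raw_scores, outcomes)
calibrated = [calibrate(s, thresholds, values) for s in raw_scores]
```

The fitted values are non-decreasing in the raw score by construction, so a bid that is a monotone function of the calibrated probability cannot fall when the raw score rises.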

Security, Privacy & Governance

Data minimization: only features with measurable lift are promoted
Pseudonymization & access control: strict scoping for identity joins
Explainability surfaces: per-feature contribution summaries for audit and troubleshooting
Compliance: aligned with prevailing privacy regimes and partner obligations

Roadmap

Creative embeddings: compact vision encoders for frame-level signals under strict latency
Causal uplift modeling: treatment-effect estimators to de-bias observational feedback loops
Adaptive refresh cadence: auto-tuned increment frequency by segment drift
Cross-device graph strengthening: probabilistic joins with conservative thresholds to protect precision

BidMatrix is built for the real RTB world: brutal latency budgets, shifting distributions, and sparse signals. DeepThink scales the learning and ApexAd delivers, quietly and relentlessly, at the edge of latency. Net effect: we spot value earlier, allocate more steadily in volatile markets, and compound performance as conditions change. How does this translate into acquiring more (and better) users and driving more revenue for advertisers?
