Topics

Deep dives, grouped by theme. Each article is a standalone, internals-first read: start anywhere, or follow a cluster end to end. Prefer chronological order? The full archive has a timeline and tag filter.

OLAP & Analytical Engines

How columnar warehouses and real-time analytics engines actually execute queries — storage, vectorization, and the trade-offs between them.

ClickHouse

The ClickHouse deep-dive series: architecture, schema and query optimization, and real-time ingestion.

Streaming & Real-Time Data

Event streaming and stream processing: logs, exactly-once semantics, watermarks, CDC, and streaming databases.

Kafka & Event Streaming

Kafka internals and the architectures around it — why it's fast, how it scales, and how alternatives differ.

Snowflake

Snowflake from internals to real-time pipelines, Cortex AI, and regulated data-vault builds.

Databricks, Spark & Lakehouse

The Spark execution model, Photon, the Delta log, performance tuning, and the lakehouse platform.

Business Intelligence & Semantic Layers

The engines and practices behind BI — Tableau (VizQL, Hyper), Power BI's VertiPaq, Microsoft Fabric, and semantic-model design.

RAG & Retrieval

Retrieval-augmented generation end to end — retrieval types, vector search internals, GraphRAG, and cloud builds.

AI Agents & LLM Systems

Agents, agent memory, LLM inference and serving, observability, MCP, and the Transformer architecture that started it all.

MLOps

The ML lifecycle in production — experiment tracking and registries, feature stores, and model serving.

Governance & Compliance

Lineage as regulatory proof, auditable AI over sensitive data, and lakehouse governance with Unity Catalog.

NoSQL & Distributed Databases

LSM-tree storage, consistent-hashing rings, inverted indexes, and the CAP trade-offs behind NoSQL stores.

Cloud Data Platforms (AWS, Azure, GCP)

Platform-specific data services, integrations, and real migration war stories across the three major clouds.

State of the Industry

Annual retrospectives on data and AI engineering — what actually changed each year.

Data Architecture & Engineering

Cross-cutting architecture, modeling, and platform-engineering pieces.