Posts

Why I Didn’t Start with Spark, Pinot, or ClickHouse

This question comes up a lot, so it’s worth answering directly: if you’re doing analytics at scale, why not Spark, Pinot, or ClickHouse? They’re powerful tools. They’re also optimized for a different problem.

The Kind of Scale That Matters

There are two kinds of scale:
- Data volume
- Semantic complexity

Most modern analytics stacks optimize for the first. This work is mostly about the second.

Spark: Great for Pipelines, Not Reasoning

Spark excels at:
- Batch processing
- Large transformations
- Schema-on-read workloads

But semantic analytics needs:
- Low latency
- Fine-grained validation
- Interactive feedback

You can build that on Spark — but you’ll spend most of your time:
- Managing jobs
- Handling latency
- Debugging execution graphs

It’s a mismatch for question-driven systems.

Pinot and ClickHouse: Fast, But Opinionated

Pinot and ClickHouse are impressive. They shine when:
- Queries are known in advance
- Dimensions are stable
- Aggrega...

Caching as a Semantic Problem: Redis, Pre-Aggregation, and Mixed Granularity Data

Caching is usually treated as a performance hack. Add Redis. Cache query results. Invalidate aggressively. Hope nothing breaks. That approach works until you introduce semantics. Once users can ask the same question in multiple ways, simple query caching starts to fall apart.

Why Query Caching Breaks Down

Consider these two questions:
- “Sales by category by month”
- “Monthly sales for product categories”

Syntactically different. Semantically identical.

A traditional cache sees:
- Two queries
- Two cache keys
- Two entries

A semantic system should see:
- One intent
- One reusable result

This is where caching stops being technical and starts being conceptual.

The Shift: Cache Meaning, Not Queries

Instead of caching:
- SQL strings
- Serialized result sets

The idea is to cache:
- Semantic tuples
- At defined grains
- With known aggregation rules

For example: (Time:Month, Product:Category, Measure:Sales)

Once that tuple exists, many ques...
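To make the tuple idea concrete, here is a minimal sketch of what a semantic cache key could look like in C#. The SemanticCacheKey record and the Redis key format are illustrative assumptions, not the post’s actual implementation:

```csharp
// A minimal sketch: cache by semantic tuple rather than by SQL string.
// SemanticCacheKey is a hypothetical shape, not code from the post.
public sealed record SemanticCacheKey(
    string TimeGrain,     // e.g. "Month"
    string ProductGrain,  // e.g. "Category"
    string Measure)       // e.g. "Sales"
{
    // Two differently worded questions that resolve to the same tuple
    // produce the same key, so they share one cached result.
    public string ToRedisKey() =>
        $"agg:{Measure}:{TimeGrain}:{ProductGrain}".ToLowerInvariant();
}

public static class SemanticCacheExample
{
    public static void Main()
    {
        // "Sales by category by month" and "Monthly sales for product categories"
        // both normalize to the same key: "agg:sales:month:category".
        var key = new SemanticCacheKey("Month", "Category", "Sales").ToRedisKey();
        System.Console.WriteLine(key);
    }
}
```

The point of the sketch is only that the cache key is derived from meaning (grain and measure), so wording differences never create duplicate entries.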

The Foundation: How I’m Building a Real-World Ordering Demo for Deep EF Core & Lambda Exploration

Before I write a single deep dive about lambda expressions, expression trees, or EF Core translation, I want to be very deliberate about what those examples live inside. This repository will evolve as the series progresses; each post builds on the same codebase: OrderDemo – Lambda & EF Core Deep Dive Demo.

Lambdas don’t exist in isolation. They behave very differently depending on:
- execution context
- data source
- architectural boundaries
- and how queries are composed over time

So instead of starting with syntax, I’m starting with a real application: one that’s complex enough to break if you misuse lambdas, but still small enough to understand.

In this post, I’ll walk you through:
- The demo application I’m building
- The architectural choices I’ve made
- And why each decision matters for the rest of the series
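As a small illustration of why execution context matters, the same lambda body can either run in memory or be handed to a query provider for translation, depending on whether it is compiled to a delegate or captured as an expression tree. The Order type here is hypothetical, not part of the OrderDemo codebase:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public static class LambdaContexts
{
    public static void Main()
    {
        // Compiled to a delegate: executes in memory over any IEnumerable<T>.
        Func<Order, bool> asDelegate = o => o.Total > 100m;

        // Captured as an expression tree: a query provider (such as EF Core)
        // can inspect it and translate it, rather than executing it directly.
        Expression<Func<Order, bool>> asExpression = o => o.Total > 100m;

        var orders = new List<Order> { new() { Id = 1, Total = 150m } };

        var inMemory  = orders.Where(asDelegate).Count();                  // LINQ to Objects
        var queryable = orders.AsQueryable().Where(asExpression).Count();  // provider decides

        Console.WriteLine($"{inMemory} {queryable}");
    }
}
```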

A Secure Blazor Server Azure Deployment Pipeline

I wanted to kick off the year with something practical, modern, and production-ready. This post is based on my own experience deploying my new website, dotnetconsult.tech, to Azure App Service using GitHub Actions.

In this post, I’ll walk through how to set up a secure development and deployment flow using:
- Blazor Server
- GitHub
- Azure App Service
- GitHub Actions with OpenID Connect (OIDC)

This approach avoids stored secrets entirely and reflects how I now set up every new .NET project, whether it’s for a client, a SaaS product, or an internal tool.

A Practical Stack for Semantic Analytics: Azure, C#, Redis, and PostgreSQL

Happy New Year 🎉 Welcome to 2026, and welcome to the first post of the new year.

Part 1 Available Here
Part 2 Available Here
Part 3 Available Here

Up to now, I’ve deliberately avoided talking about technology. Not because it doesn’t matter, but because architecture should follow meaning, not the other way around. That said, once you commit to a semantic-first approach, certain technology choices start to make more sense than others. This post is about the stack I’ve been gravitating toward and why.

Why “Boring” Technology Is Often the Right Choice

Semantic analytics systems are already complex conceptually. Adding novelty at the infrastructure layer tends to:
- Increase cognitive load
- Reduce debuggability
- Make correctness harder to reason about

For this kind of system, I’ve been prioritizing:
- Predictability
- Strong typing
- Explicit boundaries
- Operational clarity

That naturally pushed me toward a stack built around Azure, C#, Redis, and Postg...

Stop Wrapping EF Core in Repositories: Use Specifications + Clean Architecture

The GitHub project which accompanies this article is available here.

Wrapping Entity Framework Core in repositories has become a default in many .NET codebases. But defaults deserve to be challenged. This post shows:
- why repository-over-EF breaks down
- how Clean Architecture + Specification Pattern fixes it
- and how EF Core InMemory tests prove the approach works

The problem with “Repository Pattern over EF Core”

EF Core already gives you:
- DbSet<T> → repository behavior
- DbContext → unit of work

Yet many projects add another repository layer anyway. It usually starts simple:
- Add(order)
- GetById(id)
- Update(order)

Then order-processing requirements arrive:
- “Get open orders for customer”
- “Get orders awaiting payment older than 7 days”
- “Get paged orders sorted by date with items”

Soon you’re staring at this:

OrderRepository
├── GetOpenOrdersForCustomer(...)
├── GetOrdersAwaitingPayment(...)
├── GetOrdersWithItemsAndPayments(...)
├── ...
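To show the alternative direction the post argues for, here is a minimal sketch of the specification idea. The Order entity and OpenOrdersForCustomer type are hypothetical simplifications; the accompanying GitHub project may model this differently:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// A minimal sketch of the specification idea, not the repo's exact types.
public abstract class Specification<T>
{
    public abstract Expression<Func<T, bool>> Criteria { get; }

    // Any IQueryable source (an EF Core DbSet, the InMemory provider, a test list)
    // can apply the same specification, so the query logic lives in one place.
    public IQueryable<T> Apply(IQueryable<T> query) => query.Where(Criteria);
}

public sealed class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public string Status { get; set; } = "Open";
}

// "Get open orders for customer" expressed as data, instead of becoming
// yet another method on an ever-growing OrderRepository.
public sealed class OpenOrdersForCustomer : Specification<Order>
{
    private readonly int _customerId;
    public OpenOrdersForCustomer(int customerId) => _customerId = customerId;

    public override Expression<Func<Order, bool>> Criteria =>
        o => o.Status == "Open" && o.CustomerId == _customerId;
}
```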

From Question to Result: A Mental Model for Semantic Analytics

Part 1: A Thought Experiment: What If Analytics Models Were Semantic, Not Structural?
Part 2: Using a Semantic Model as a Reasoning Layer (Not Just Metadata)

Let’s take a concrete example: “Compare forecast vs actual sales by month for bike categories this year.” Most systems jump straight to execution. I think that’s backwards.

Step 1: Understand the Question

Before touching data, the system identifies:
- Measures: Forecast, Actual Sales
- Time range: This year
- Time level: Month
- Product level: Category
- Filter: Bike-related categories

This is interpretation, not computation.

Step 2: Validate Meaning

Next, the system checks:
- Are forecast and actual comparable?
- Is category a valid rollup for both?
- Is monthly aggregation defined?
- Are defaults available where ambiguity exists?

If something is unclear, the system can explain why.

Step 3: Decide How to Answer

Only now does execution matter:
- Cached aggregates
- Precomputed tuples
- On-th...
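As a rough sketch of what Step 1 might produce before any data is touched, the example question could be interpreted into a small, typed structure. The InterpretedQuestion record and its field names are illustrative assumptions, not code from the series:

```csharp
// A hypothetical shape for the output of Step 1 (interpretation, not computation).
// Names such as InterpretedQuestion and TimeLevel are illustrative only.
public sealed record InterpretedQuestion(
    string[] Measures,   // e.g. ["Forecast", "ActualSales"]
    string TimeRange,    // e.g. "ThisYear"
    string TimeLevel,    // e.g. "Month"
    string ProductLevel, // e.g. "Category"
    string? Filter);     // e.g. "Bike-related categories"

public static class Step1Example
{
    public static InterpretedQuestion Interpret() =>
        // "Compare forecast vs actual sales by month for bike categories this year."
        new(
            Measures: new[] { "Forecast", "ActualSales" },
            TimeRange: "ThisYear",
            TimeLevel: "Month",
            ProductLevel: "Category",
            Filter: "Bike-related categories");
}
```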

Using a Semantic Model as a Reasoning Layer (Not Just Metadata)

Part 1: A Thought Experiment: What If Analytics Models Were Semantic, Not Structural?

Most systems treat semantic models as documentation. But what if they were active participants in query execution? This question came up while thinking about natural language analytics.

Natural Language Is Ambiguous by Default

If someone asks: “Show me sales by product this year”, there are immediate ambiguities:
- Which sales measure?
- At what product level?
- Calendar or fiscal year?
- Gross or net?

Most tools resolve this by:
- Picking defaults
- Or asking the user to clarify

But what if the system could reason about the question?

The Semantic Model as Context

Imagine a model that explicitly defines:
- Valid measures
- Valid rollups
- Default aggregation logic
- Synonyms and aliases

Before executing anything, the system can ask:
- Is this aggregation valid?
- Are these measures comparable?
- Does this level change meaning?

This shifts failure from: “The numbers look...
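To illustrate what “active participant” could mean in practice, here is a minimal sketch of a semantic model that answers validity questions before any query runs. The types, rollup rules, and aliases below are illustrative assumptions, not the post’s implementation:

```csharp
using System;
using System.Collections.Generic;

// A hypothetical sketch of a semantic model used as an active check,
// not just documentation. All rules below are illustrative.
public sealed class SemanticModel
{
    // Which dimension levels each measure can legitimately roll up to.
    private readonly Dictionary<string, HashSet<string>> _validRollups = new()
    {
        ["Sales"]    = new() { "Product:Category", "Time:Month", "Time:Year" },
        ["Forecast"] = new() { "Product:Category", "Time:Month" },
    };

    // Synonyms and aliases resolve to canonical measure names.
    private readonly Dictionary<string, string> _aliases = new(StringComparer.OrdinalIgnoreCase)
    {
        ["revenue"] = "Sales",
        ["actuals"] = "Sales",
    };

    public string ResolveMeasure(string term) =>
        _aliases.TryGetValue(term, out var canonical) ? canonical : term;

    // Asked before any SQL is generated: is this aggregation even meaningful?
    public bool IsValidRollup(string measure, string level) =>
        _validRollups.TryGetValue(measure, out var levels) && levels.Contains(level);
}
```

With something like this in place, “Show me sales by product this year” can be checked for a valid measure and rollup before the system picks defaults or asks the user to clarify.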

A Thought Experiment: What If Analytics Models Were Semantic, Not Structural?

I’ve been thinking a lot about why analytics and forecasting platforms feel harder to use than they should. Not harder to build, harder to think with.

Most modern data stacks are incredibly capable:
- Fast databases
- Columnar storage
- In-memory caches
- ML forecasting libraries

Yet the questions users struggle to answer haven’t changed much:
- “Compared to what?”
- “At what level?”
- “Is this rolled up correctly?”
- “Why does this number look wrong?”

This feels less like a compute problem and more like a meaning problem.

Where Meaning Lives Today

In most systems I’ve worked on, meaning is scattered across:
- Database schemas
- ETL logic
- Cube definitions
- BI metadata
- Tribal knowledge

None of this is explicit. If someone asks: “Can we compare forecast vs actual by category this year?”, the system doesn’t reason about that question. It executes SQL and hopes the result makes sense.

A Different Framing

What if we treated analytics as a sem...