If you work in fintech and process card transactions, you've seen MCC codes. A Merchant Category Code is a four-digit number that tells you what kind of business charged the card. 5411 is a grocery store. 5812 is a restaurant. 7995 is a casino.
Every product that touches transactions eventually needs to turn these codes into something a human can understand. Cashback rules, spend controls, analytics dashboards, compliance reports: they all need the same thing. Given a code, what category is this?
I've built this mapping three times in production. Three separate microservices at the same company, each with its own copy. When one got updated, the others didn't. When the business asked "why is 7995 categorized as Entertainment in the app but Gambling in the compliance report?", nobody had a good answer.
So I extracted it into a NuGet package. Then I looked at what I had actually built, benchmarked it, and rewrote it.
What v1 looked like
The first version used a FrozenDictionary<string, MccCategory>. Seemed reasonable: build once, read forever, immutable.
// v1
private static readonly FrozenDictionary<string, MccCategory> Codes = BuildCodes();
public static MccCategory Categorize(string mccCode)
=> Codes.GetValueOrDefault(mccCode, MccCategory.Other);
It worked. But there were three problems I didn't like.
First: the only entry point takes a string, so every call either allocates one or assumes the caller already has one. In a high-throughput service processing thousands of transactions per second, those allocations add up.
Second: FrozenDictionary hashes strings. But MCC codes are 4-digit numbers. There's a faster structure for this, one that doesn't need hashing at all.
Third: there was no way for a team with non-standard mappings to override anything without forking the library.
What v2 actually does
MCC codes are integers in the range [0, 9999]. That's a fixed-size space. So instead of a dictionary, v2 uses a plain array indexed directly by the numeric code value:
internal const int TableSize = 10_000;
private readonly MccCategory[] _codes; // indexed by MCC integer
private readonly bool[] _occupied; // tracks which slots are in the taxonomy
Lookup is a direct array read:
public MccCategory Categorize(int mccCode)
{
    // The unsigned cast rejects negative and too-large values in one comparison.
    if ((uint)mccCode >= TableSize || !_occupied[mccCode])
        return MccCategory.Uncategorized;
    return _codes[mccCode];
}
No hash function. No bucket traversal. One bounds check, one bool check, one array read. That's it.
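To make the idea concrete, here is a minimal, self-contained sketch of a direct-address table. It is not the package's source: `SketchCategory` and `DirectTable` are hypothetical names, and the two codes loaded into it are only sample data.

```csharp
using System;
using System.Collections.Generic;

// Usage: build a tiny table and query it.
var table = new DirectTable(new Dictionary<int, SketchCategory>
{
    [5411] = SketchCategory.Groceries,
    [5812] = SketchCategory.Dining,
});

Console.WriteLine(table.Categorize(5411)); // Groceries
Console.WriteLine(table.Categorize(1234)); // Uncategorized (slot never filled)
Console.WriteLine(table.Categorize(-1));   // Uncategorized (out of range)

// Hypothetical stand-in for the package's MccCategory enum.
public enum SketchCategory { Uncategorized, Groceries, Dining }

// Hypothetical class: a 10,000-slot table indexed directly by the MCC value.
public sealed class DirectTable
{
    private const int TableSize = 10_000;
    private readonly SketchCategory[] _codes = new SketchCategory[TableSize];
    private readonly bool[] _occupied = new bool[TableSize];

    public DirectTable(IReadOnlyDictionary<int, SketchCategory> source)
    {
        foreach (var pair in source)
        {
            // Keys outside 0..9999 cannot be four-digit MCCs.
            if ((uint)pair.Key >= TableSize)
                throw new ArgumentOutOfRangeException(nameof(source), $"MCC {pair.Key} is outside 0..9999.");
            _codes[pair.Key] = pair.Value;
            _occupied[pair.Key] = true;
        }
    }

    public SketchCategory Categorize(int mccCode)
    {
        // The unsigned cast folds "negative" and "too large" into one comparison.
        if ((uint)mccCode >= TableSize || !_occupied[mccCode])
            return SketchCategory.Uncategorized;
        return _codes[mccCode];
    }
}
```

The separate `_occupied` array is what lets a table tell "this code is not in the taxonomy" apart from "this code is mapped to the enum's zero value".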
Three overloads, none of which allocate
The real-world caller is often processing a raw string from a payment event, but might have an integer from a database column or a span from a buffer. v2 handles all three:
MccLookup.Categorize(5411);   // int: fastest path
MccLookup.Categorize("5411"); // string: still zero-alloc after parse
MccLookup.Categorize(span);   // ReadOnlySpan<char>: allocation-free
The ReadOnlySpan<char> overload uses a custom parser instead of int.TryParse:
private static bool TryParseMcc(ReadOnlySpan<char> s, out int value)
{
    value = 0;
    if (s.Length == 0 || s.Length > 4)
        return false;
    var result = 0;
    for (var i = 0; i < s.Length; i++)
    {
        var c = s[i];
        if (c < '0' || c > '9') return false;
        result = result * 10 + (c - '0');
    }
    value = result;
    return true;
}
Why not int.TryParse? On netstandard2.0, the ReadOnlySpan<char> overload of int.TryParse doesn't exist. This custom parser works identically across all target frameworks and is measurably faster on the hot path. It also rejects anything that is not an ASCII digit, including the leading sign and surrounding whitespace that int.TryParse accepts by default.
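The difference in strictness is easy to see side by side. The sketch below copies the parser into a hypothetical `MccText` class and compares it with `int.TryParse` under `NumberStyles.Integer` and the invariant culture. It assumes a target where `string.AsSpan()` is available (modern .NET, or .NET Standard 2.0 with the System.Memory package).

```csharp
using System;
using System.Globalization;

foreach (var input in new[] { "5411", " 5411", "+541", "-12", "54110" })
{
    var strict = MccText.TryParseMcc(input.AsSpan(), out var mcc);
    var lenient = int.TryParse(input, NumberStyles.Integer, CultureInfo.InvariantCulture, out var parsed);
    Console.WriteLine($"\"{input}\": TryParseMcc={strict}, int.TryParse={lenient} ({parsed})");
}

// Hypothetical wrapper class; the method body is the parser shown above.
public static class MccText
{
    public static bool TryParseMcc(ReadOnlySpan<char> s, out int value)
    {
        value = 0;
        if (s.Length == 0 || s.Length > 4)
            return false;
        var result = 0;
        for (var i = 0; i < s.Length; i++)
        {
            var c = s[i];
            if (c < '0' || c > '9') return false; // digits only: no sign, no whitespace
            result = result * 10 + (c - '0');
        }
        value = result;
        return true;
    }
}
```

Only "5411" passes the strict parser. int.TryParse accepts all five inputs, which would let a padded, signed, or five-digit value slip through as if it were a valid MCC.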
Extensibility via IMccLookup and WithCustomCodes
The built-in taxonomy covers ISO 18245. But teams have non-standard codes: internal test MCCs, network-specific extensions, or just a different opinion about which category 5962 belongs to.
v2 exposes IMccLookup and lets you override without touching the default:
IMccLookup custom = MccLookup.WithCustomCodes(new Dictionary<int, MccCategory>
{
    [9999] = MccCategory.Finance, // add an unknown code
    [7995] = MccCategory.Leisure, // override the built-in mapping
});
// The built-in default is not modified: it's immutable
custom.Categorize(9999); // Finance
MccLookup.Categorize(9999); // Uncategorized, unchanged
WithCustomCodes clones the underlying arrays and applies the overrides to the copy. The original instance is untouched. Thread-safe by design: no locks, no shared mutable state.
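A minimal sketch of that clone-then-override pattern, under the assumption that the table is a pair of arrays as described earlier. `CloneableTable` and `SketchCategory` are hypothetical names, not the package's types.

```csharp
using System;
using System.Collections.Generic;

var baseline = new CloneableTable(new Dictionary<int, SketchCategory> { [7995] = SketchCategory.Gambling });
var custom = baseline.WithCustomCodes(new Dictionary<int, SketchCategory>
{
    [7995] = SketchCategory.Leisure, // override
    [9999] = SketchCategory.Finance, // addition
});

Console.WriteLine(baseline.Categorize(7995)); // Gambling
Console.WriteLine(custom.Categorize(7995));   // Leisure
Console.WriteLine(baseline.Categorize(9999)); // Uncategorized

// Hypothetical stand-in for the package's MccCategory enum.
public enum SketchCategory { Uncategorized, Gambling, Leisure, Finance }

public sealed class CloneableTable
{
    private const int TableSize = 10_000;
    private readonly SketchCategory[] _codes;
    private readonly bool[] _occupied;

    public CloneableTable(IReadOnlyDictionary<int, SketchCategory> codes)
        : this(new SketchCategory[TableSize], new bool[TableSize], codes) { }

    // Takes ownership of the arrays; they are written only here, before the
    // instance is visible to any other thread, and never mutated afterwards.
    private CloneableTable(SketchCategory[] codes, bool[] occupied,
                           IReadOnlyDictionary<int, SketchCategory> entries)
    {
        foreach (var pair in entries)
        {
            if ((uint)pair.Key >= TableSize)
                throw new ArgumentOutOfRangeException(nameof(entries), $"MCC {pair.Key} is outside 0..9999.");
            codes[pair.Key] = pair.Value;
            occupied[pair.Key] = true;
        }
        _codes = codes;
        _occupied = occupied;
    }

    // Copies both arrays, then applies the overrides to the copies only.
    public CloneableTable WithCustomCodes(IReadOnlyDictionary<int, SketchCategory> overrides)
        => new CloneableTable((SketchCategory[])_codes.Clone(), (bool[])_occupied.Clone(), overrides);

    public SketchCategory Categorize(int mccCode)
        => (uint)mccCode < TableSize && _occupied[mccCode] ? _codes[mccCode] : SketchCategory.Uncategorized;
}
```

The cost is a copy of 10,000 slots per customised instance, paid once at construction. After that every instance is read-only, which is why no locking is needed.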
The pre-built category index
GetCodes(MccCategory.Airlines) needs to return all ~351 airline codes. In v1 this meant scanning the entire dictionary. In v2, a secondary index is built at construction time:
private readonly Dictionary<MccCategory, int[]> _codesByCategory;
So GetCodes and GetCodeValues run in O(N), where N is the number of codes in the requested category, instead of scanning all 10,000 table slots on every call.
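One way such an index could be built is a single pass over the table at construction time. This is a sketch under that assumption; `CategoryIndex`, `SketchCategory`, and the sample codes are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Sample table: three airline codes and one grocery code.
var codes = new SketchCategory[10_000];
var occupied = new bool[10_000];
void Add(int mcc, SketchCategory category) { codes[mcc] = category; occupied[mcc] = true; }
Add(3000, SketchCategory.Airlines);
Add(3001, SketchCategory.Airlines);
Add(4511, SketchCategory.Airlines);
Add(5411, SketchCategory.Groceries);

var index = CategoryIndex.Build(codes, occupied);
Console.WriteLine(string.Join(",", index[SketchCategory.Airlines])); // 3000,3001,4511

// Hypothetical stand-in for the package's MccCategory enum.
public enum SketchCategory { Uncategorized, Airlines, Groceries }

public static class CategoryIndex
{
    // One pass over the table; each occupied slot is appended to its category's bucket.
    public static Dictionary<SketchCategory, int[]> Build(SketchCategory[] codes, bool[] occupied)
    {
        var buckets = new Dictionary<SketchCategory, List<int>>();
        for (var mcc = 0; mcc < codes.Length; mcc++)
        {
            if (!occupied[mcc]) continue;
            if (!buckets.TryGetValue(codes[mcc], out var list))
                buckets[codes[mcc]] = list = new List<int>();
            list.Add(mcc); // ascending scan, so each bucket ends up sorted
        }

        // Freeze the buckets into arrays so lookups return a ready-made result.
        var index = new Dictionary<SketchCategory, int[]>(buckets.Count);
        foreach (var pair in buckets)
            index[pair.Key] = pair.Value.ToArray();
        return index;
    }
}
```

The full scan happens once per instance. A caller that must not be able to mutate the stored array would need a copy or a read-only wrapper on the way out; the sketch leaves that choice open.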
What the package gives you
// Most common: categorize by string
var byString = MccLookup.Categorize("5411"); // Supermarkets
// By integer (no string involved)
var byInt = MccLookup.Categorize(5411); // Supermarkets
// Distinguish known from unknown
if (MccLookup.TryGetCategory("5812", out var c))
    Console.WriteLine(c); // FoodAndDining
// Reverse lookup
var codes = MccLookup.GetCodes(MccCategory.Airlines); // "3000".."3350" + 2 more
var ints = MccLookup.GetCodeValues(MccCategory.Airlines);
// Custom overrides
IMccLookup lookup = MccLookup.WithCustomCodes(overrides);
No dependencies. No DI registration. Targets .NET 6.0+ and .NET Standard 2.0.
Numbers
- 27 categories, ~900 MCC codes (ISO 18245 + network-specific)
- Array-based O(1) lookup, zero allocations on hot path
- ReadOnlySpan<char> overload for buffer-friendly callers
- WithCustomCodes for teams with non-standard mappings
- .NET 6.0, .NET Standard 2.0, MIT license
The package is on NuGet and the source is on GitHub.
The v1 was fine. The v2 is what I'd actually want to use in a service that handles thousands of transactions per second. Sometimes you need to ship something to understand what it should have been.
This article was originally published by DEV Community and written by KitKeen.