Beetroot v1.5.1 — ML Code Detection and Search That Actually Works
Beetroot now uses VS Code's TensorFlow.js model to detect 54 programming languages. Plus a completely rewritten search engine — 'local v' returns 9 results instead of 98.
Two things have been bugging me about Beetroot for a while. First: you copy a Rust struct, open the preview, and it highlights as C. Or worse — plain text. The code detection was a pile of regex patterns that knew about five languages and guessed wrong on everything else.
Second: searching "local v" returned 98 results out of 1,100 clipboard entries. Searching "port 80" matched "import" because "port" is a substring of "import". The fuzzy search library was doing exactly what fuzzy search does — just on the wrong kind of data.
v1.5.1 fixes both.
At a glance:
- ML language detection — 54 programming languages, same model as VS Code
- Rewritten search — "local v" → 9 results instead of 98
- Unicode word boundaries — "port" no longer matches inside "import"
- Fragment preview — long clips show where the match is, not the first 100 characters
- Bug fixes — source app tracking, object literal detection
Your Rust code shouldn't highlight as C
The old detection was a function called looksLikeCode() — a list of keywords like function, const, class, def. If your code had one of those, it got highlighted. If not — plain text. This worked for JavaScript and Python. It failed for everything else.
Rust without fn main? Plain text. Go interfaces? Sometimes misidentified. Swift? Plain text. Ruby? Plain text. PHP without <?php at the top? You guessed it.
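To make the failure mode concrete, here is a minimal sketch of what a keyword-list check of this kind looks like. It is a reconstruction, not the original function: only the four keywords named above come from the old code, and the whole-word regex test is an assumption.

```typescript
// Hypothetical reconstruction of a keyword-based check. Only these four
// keywords are taken from the description above.
const CODE_KEYWORDS = ["function", "const", "class", "def"];

function looksLikeCode(text: string): boolean {
  // A clip counts as code if any keyword appears as a whole word.
  return CODE_KEYWORDS.some((kw) => new RegExp("\\b" + kw + "\\b").test(text));
}

const jsSnippet = "const total = items.length;";
const pythonSnippet = "def greet(name):\n    return name";
const rustSnippet = "struct Point { x: f32, y: f32 }";
const rubySnippet = "puts [1, 2, 3].sum";

console.log(looksLikeCode(jsSnippet));     // keyword "const" is present
console.log(looksLikeCode(pythonSnippet)); // keyword "def" is present
console.log(looksLikeCode(rustSnippet));   // no listed keyword
console.log(looksLikeCode(rubySnippet));   // no listed keyword
```

Any language whose syntax doesn't happen to share one of the listed words falls through to plain text, which is the behavior described above.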
The new detection uses @vscode/vscode-languagedetection, the same TensorFlow.js model that VS Code uses when you open an untitled file and it guesses what language you're writing in. It was trained on millions of GitHub files and identifies 54 programming languages.
| Language | Before | After |
|---|---|---|
| Rust | C or plain text | Rust |
| Go | misidentified | Go |
| Swift | plain text | Swift |
| Ruby | plain text | Ruby |
| PHP (no <?php) | plain text | PHP |
| { key: value } | JSON | JavaScript |
The model is under 1 MB and runs entirely locally in the app: no cloud calls, no API keys. The first run takes ~200ms to load the model; after that, each detection takes 10–50ms, and results are cached.
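The caching part can be sketched as a thin wrapper around whatever detector function you plug in. This is an illustration under assumptions, not Beetroot's actual code: `createCachedDetector` and `MIN_CONFIDENCE` are hypothetical names, and the commented wiring at the bottom follows the `ModelOperations` / `runModel()` usage shown in the package's README.

```typescript
// Shape of one guess. The library's runModel() resolves to an array of
// objects with these two fields.
interface LanguageGuess {
  languageId: string;
  confidence: number;
}

type Detector = (content: string) => Promise<LanguageGuess[]>;

// Hypothetical threshold: below this the clip is treated as plain text.
const MIN_CONFIDENCE = 0.2;

function createCachedDetector(detect: Detector) {
  // Keyed by the full clip text; null means "detected as plain text".
  const cache = new Map<string, string | null>();

  return async function detectLanguage(content: string): Promise<string | null> {
    const cached = cache.get(content);
    if (cached !== undefined) return cached;

    const guesses = await detect(content);
    // Pick the highest-confidence guess without relying on result order.
    let best: LanguageGuess | undefined;
    for (const guess of guesses) {
      if (!best || guess.confidence > best.confidence) best = guess;
    }
    const languageId = best && best.confidence >= MIN_CONFIDENCE ? best.languageId : null;
    cache.set(content, languageId);
    return languageId;
  };
}

// Wiring in the real model would look roughly like this (not executed here):
//   import { ModelOperations } from "@vscode/vscode-languagedetection";
//   const model = new ModelOperations();
//   const detectLanguage = createCachedDetector((text) => model.runModel(text));
```

Passing the detector in as a parameter keeps the cache logic testable without loading the model. A real app would probably also want to cap the cache size, which this sketch leaves out.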

One thing I didn't expect: { name: "foo", count: 42 }. Is that JSON or JavaScript? The old regex said JSON — it has braces and colons. The ML model recognizes JavaScript object syntax. Small thing, but it means your JS snippets finally get the right colors.
Search that doesn't return everything
I wrote a whole article about the search rewrite — eight iterations, every wrong turn. The short version:
Fuse.js is a great fuzzy search library. For short strings — file names, contacts, menu items. Clipboard entries are different: code blocks, stack traces, URLs, often 200+ characters. On a string that long, the letters of your query are statistically likely to appear somewhere, scattered across the text. Fuse.js counts that as a match. That's why "local v" returned 98 results.
The fix: a 5-phase scoring system. Exact substring matches in your clipboard content rank highest. Word-boundary matches rank next. Window titles and app names rank lower. Fuzzy matching (for typos) ranks last. Everything is deduplicated — each item keeps only its best score.
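A rough sketch of tiered scoring with best-score deduplication follows. It is a simplification under assumptions, not the shipped engine: the numeric weights, the `Clip` shape, and the one-edit fuzzy rule are all hypothetical, it models only the four tiers named above, and it assumes the exact tier requires the phrase to begin at a word start.

```typescript
interface Clip {
  id: number;
  content: string;
  windowTitle: string;
  appName: string;
}

const WORD_CHAR = /[\p{L}\p{N}]/u;

function words(text: string): string[] {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter((w) => w.length > 0);
}

// True when `phrase` occurs in `text` at a position that begins a word.
function phraseAtWordStart(text: string, phrase: string): boolean {
  for (let i = text.indexOf(phrase); i !== -1; i = text.indexOf(phrase, i + 1)) {
    if (i === 0 || !WORD_CHAR.test(text[i - 1])) return true;
  }
  return false;
}

// True when a and b differ by at most one inserted, deleted, or replaced character.
function withinOneEdit(a: string, b: string): boolean {
  if (Math.abs(a.length - b.length) > 1) return false;
  let i = 0, j = 0, edits = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { i++; j++; continue; }
    if (++edits > 1) return false;
    if (a.length > b.length) i++;
    else if (b.length > a.length) j++;
    else { i++; j++; }
  }
  return edits + (a.length - i) + (b.length - j) <= 1;
}

// Every query word must satisfy `test` against at least one word of the text.
function everyWord(query: string, text: string, test: (q: string, w: string) => boolean): boolean {
  const textWords = words(text);
  const queryWords = words(query);
  return queryWords.length > 0 && queryWords.every((q) => textWords.some((w) => test(q, w)));
}

// Typo tolerance only for longer words; short ones must match as a prefix.
const fuzzyWord = (q: string, w: string) => (q.length >= 4 ? withinOneEdit(q, w) : w.startsWith(q));

interface Phase {
  score: number; // hypothetical weight; only the ordering comes from the post
  matches: (query: string, clip: Clip) => boolean;
}

const PHASES: Phase[] = [
  { score: 100, matches: (q, c) => phraseAtWordStart(c.content.toLowerCase(), q) },
  { score: 75, matches: (q, c) => everyWord(q, c.content, (qw, w) => w.startsWith(qw)) },
  { score: 50, matches: (q, c) => phraseAtWordStart(`${c.windowTitle} ${c.appName}`.toLowerCase(), q) },
  { score: 25, matches: (q, c) => everyWord(q, c.content, fuzzyWord) },
];

function search(query: string, clips: Clip[]): { id: number; score: number }[] {
  const q = query.trim().toLowerCase();
  if (q.length === 0) return [];
  const best = new Map<number, number>(); // clip id -> best score so far
  for (const phase of PHASES) {
    for (const clip of clips) {
      if (!phase.matches(q, clip)) continue;
      best.set(clip.id, Math.max(best.get(clip.id) ?? 0, phase.score));
    }
  }
  return Array.from(best, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}
```

The `Map` is what does the deduplication: a clip that matches in several phases appears once, with its highest score. The key difference from scattered-letter matching is that a clip with no phase hit never enters the result list at all.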
| Query | Before | After |
|---|---|---|
| local v | 98 | 9 |
| port 80 | 13 | 1–2 |
| lm st | 71 | 1–2 |
| timeout | ~40 | ~8 |
"port" no longer matches inside "import" because the search now uses Unicode-aware word boundaries. It knows that the "port" in "import" isn't a word start — and ignores it. Same logic works for Cyrillic, camelCase, and underscores.
And for long clips: the preview now shows where the match is — shifting the visible window to the matching fragment — instead of always showing the first 100 characters where the match might not be visible at all.
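The word-boundary check and the shifted preview can be sketched together. This is one way to do it, not necessarily how Beetroot does: the function names are hypothetical, and the rule that a lowercase-to-uppercase transition starts a word is my reading of the camelCase remark above.

```typescript
const LETTER_OR_DIGIT = /[\p{L}\p{N}]/u;
const LOWERCASE = /\p{Ll}/u;
const UPPERCASE = /\p{Lu}/u;

// A position starts a word if it is the start of the text, follows a
// non-alphanumeric character (space, punctuation, "_"), or is a camelCase hump.
function isWordStart(text: string, index: number): boolean {
  if (index === 0) return true;
  const prev = text[index - 1];
  if (!LETTER_OR_DIGIT.test(prev)) return true;
  return LOWERCASE.test(prev) && UPPERCASE.test(text[index]);
}

// Index of the first case-insensitive occurrence of `query` that begins at a
// word start, or -1. Caveat: assumes lowercasing does not change string length.
function findWordStartMatch(text: string, query: string): number {
  const haystack = text.toLowerCase();
  const needle = query.toLowerCase();
  if (needle.length === 0) return -1;
  for (let i = haystack.indexOf(needle); i !== -1; i = haystack.indexOf(needle, i + 1)) {
    if (isWordStart(text, i)) return i;
  }
  return -1;
}

// Returns roughly `width` characters of `text`, shifted so the match is visible.
function previewFragment(text: string, matchIndex: number, matchLength: number, width = 100): string {
  if (text.length <= width) return text;
  // No match, or the match already fits in the head of the clip: show the head.
  if (matchIndex < 0 || matchIndex + matchLength <= width) return text.slice(0, width) + "…";
  // Otherwise centre the match inside the window, clamped to the text bounds.
  const slack = Math.max(0, width - matchLength);
  const centred = Math.max(0, matchIndex - Math.floor(slack / 2));
  const start = Math.min(centred, text.length - width);
  const end = start + width;
  return (start > 0 ? "…" : "") + text.slice(start, end) + (end < text.length ? "…" : "");
}

const longClip = "x ".repeat(100) + "port 80 is open" + " y".repeat(100);
const matchAt = findWordStartMatch(longClip, "port 80");
console.log(previewFragment(longClip, matchAt, "port 80".length));
```

Because the character classes are Unicode properties rather than `[a-zA-Z]`, the same check rejects "порт" inside "импорт" and accepts it after a space, with no per-script special cases.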
Bug fixes
Source app tracking — after waking from sleep, all new clips would show Beetroot as the source app instead of the actual window you copied from. The clipboard monitor was re-firing the same content as a "new" event, resetting the source. Fixed with content deduplication.
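The idea behind the fix can be sketched as a small guard in front of the clipboard handler. This is a hypothetical illustration, not the actual patch: `ClipEvent` and `ClipboardDeduplicator` are made-up names, and comparing against only the most recent content is an assumption.

```typescript
interface ClipEvent {
  content: string;
  sourceApp: string;
}

class ClipboardDeduplicator {
  private lastContent: string | null = null;

  // Returns true when the event carries new content and should be stored.
  // A re-fired event with unchanged content is dropped, so it cannot
  // overwrite the source app recorded for the original copy.
  accept(event: ClipEvent): boolean {
    if (event.content === this.lastContent) return false;
    this.lastContent = event.content;
    return true;
  }
}

const dedup = new ClipboardDeduplicator();
const original = dedup.accept({ content: "SELECT 1;", sourceApp: "Editor" });
const refired = dedup.accept({ content: "SELECT 1;", sourceApp: "Beetroot" });
const fresh = dedup.accept({ content: "SELECT 2;", sourceApp: "Terminal" });
console.log(original, refired, fresh);
```

One trade-off of this approach: copying the exact same text twice in a row on purpose is also treated as a duplicate.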
JS/TS object literals — { name: "foo", count: 42 } wasn't recognized as code at all (no function or class keyword). Now detected properly and highlighted.
How to update
Beetroot will offer to update automatically. Or download v1.5.1 from GitHub.