Beetroot v1.5.1 — ML Code Detection and Search That Actually Works

Beetroot now uses VS Code's TensorFlow.js model to detect 54 programming languages. Plus a completely rewritten search engine — 'local v' returns 9 results instead of 98.

Two things have been bugging me about Beetroot for a while. First: you copy a Rust struct, open the preview, and it highlights as C. Or worse — plain text. The code detection was a pile of regex patterns that knew about five languages and guessed wrong on everything else.

Second: searching "local v" returned 98 results out of 1,100 clipboard entries. Searching "port 80" matched "import" because "port" is a substring of "import". The fuzzy search library was doing exactly what fuzzy search does — just on the wrong kind of data.

v1.5.1 fixes both.

At a glance:

  • ML language detection — 54 programming languages, same model as VS Code
  • Rewritten search — "local v" → 9 results instead of 98
  • Unicode word boundaries — "port" no longer matches inside "import"
  • Fragment preview — long clips show where the match is, not the first 100 characters
  • Bug fixes — source app tracking, object literal detection

Your Rust code shouldn't highlight as C

The old detection was a function called looksLikeCode() — a list of keywords like function, const, class, def. If your code had one of those, it got highlighted. If not — plain text. This worked for JavaScript and Python. It failed for everything else.

Rust without fn main? Plain text. Go interfaces? Sometimes misidentified. Swift? Plain text. Ruby? Plain text. PHP without <?php at the top? You guessed it.

The new detection uses @vscode/vscode-languagedetection, the same TensorFlow.js model that VS Code uses when you open an untitled file and it guesses what language you're writing in. It was trained on millions of GitHub files and identifies 54 programming languages.

| Language | Before | After |
|---|---|---|
| Rust | C or plain text | Rust |
| Go | misidentified | Go |
| Swift | plain text | Swift |
| Ruby | plain text | Ruby |
| PHP (no <?php) | plain text | PHP |
| { key: value } | JSON | JavaScript |

The model is under 1 MB and runs entirely locally in the app: no cloud calls, no API keys. The first run takes ~200ms to load the model; after that, each detection takes 10–50ms, and results are cached.
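To make the "load once, cache afterwards" behavior concrete, here is a minimal sketch of a caching wrapper around a detector. It is not Beetroot's actual code: `CachedLanguageDetector`, the confidence threshold, and the cache size are assumptions for illustration. The detector is injected as a plain async function returning language ids with confidences; in a real app it would wrap the ML model.

```typescript
// Assumed result shape: candidate languages with confidence scores.
interface LanguageGuess {
  languageId: string;
  confidence: number;
}

// Any async function that scores a snippet. Injected here so the caching
// logic can be shown (and tested) independently of the model itself.
type Detector = (content: string) => Promise<LanguageGuess[]>;

class CachedLanguageDetector {
  private cache = new Map<string, string | null>();
  private detect: Detector;
  private minConfidence: number; // hypothetical threshold
  private maxEntries: number;    // hypothetical cache size

  constructor(detect: Detector, minConfidence = 0.2, maxEntries = 500) {
    this.detect = detect;
    this.minConfidence = minConfidence;
    this.maxEntries = maxEntries;
  }

  // Returns the most likely language id, or null when no guess clears the threshold.
  async detectLanguage(content: string): Promise<string | null> {
    const cached = this.cache.get(content);
    if (cached !== undefined) return cached;

    const guesses = await this.detect(content);
    let best: LanguageGuess | null = null;
    for (const guess of guesses) {
      if (best === null || guess.confidence > best.confidence) best = guess;
    }
    const result =
      best !== null && best.confidence >= this.minConfidence ? best.languageId : null;

    // Evict the oldest entry (Map keeps insertion order) to bound memory use.
    if (this.cache.size >= this.maxEntries) {
      const oldest = this.cache.keys().next().value;
      if (oldest !== undefined) this.cache.delete(oldest);
    }
    this.cache.set(content, result);
    return result;
  }
}
```

Returning null below the threshold gives the caller an explicit "fall back to plain text" signal instead of a low-confidence guess.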

[Image: Beetroot preview showing Rust code with correct syntax highlighting and a "rust" language label detected by the ML model]

One thing I didn't expect: { name: "foo", count: 42 }. Is that JSON or JavaScript? The old regex said JSON — it has braces and colons. The ML model recognizes JavaScript object syntax. Small thing, but it means your JS snippets finally get the right colors.

Search that doesn't return everything

I wrote a whole article about the search rewrite — eight iterations, every wrong turn. The short version:

Fuse.js is a great fuzzy search library. For short strings — file names, contacts, menu items. Clipboard entries are different: code blocks, stack traces, URLs, often 200+ characters. On a string that long, the letters of your query are statistically likely to appear somewhere, scattered across the text. Fuse.js counts that as a match. That's why "local v" returned 98 results.

The fix: a 5-phase scoring system. Exact substring matches in your clipboard content rank highest. Word-boundary matches rank next. Window titles and app names rank lower. Fuzzy matching (for typos) ranks last. Everything is deduplicated — each item keeps only its best score.
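A rough sketch of the tiered idea, covering only the four kinds of match named above. Everything concrete here is an assumption for illustration rather than Beetroot's implementation: the `Clip` shape, the numeric weights, treating an "exact" match as the whole query appearing at a word start, and the one-edit typo rule.

```typescript
interface Clip {
  id: number;
  content: string;     // clipboard text
  windowTitle: string; // hypothetical metadata fields
  appName: string;
}

// Hypothetical tier weights: a higher score ranks earlier.
const SCORE = { exactContent: 400, wordStartContent: 300, metadata: 200, fuzzy: 100 } as const;

const NON_WORD = new RegExp("[^\\p{L}\\p{N}]+", "u");
const WORD_CHAR = new RegExp("[\\p{L}\\p{N}]", "u");

// Lowercase words, split on anything that is not a letter or digit.
function words(text: string): string[] {
  return text.toLowerCase().split(NON_WORD).filter((w) => w.length > 0);
}

// True when `phrase` occurs in `haystack` at a position not preceded by a letter or digit.
function phraseAtWordStart(haystack: string, phrase: string): boolean {
  let from = 0;
  for (;;) {
    const i = haystack.indexOf(phrase, from);
    if (i === -1) return false;
    if (i === 0 || !WORD_CHAR.test(haystack.charAt(i - 1))) return true;
    from = i + 1;
  }
}

// True when a and b differ by at most one substitution, insertion, or deletion.
function withinOneEdit(a: string, b: string): boolean {
  if (Math.abs(a.length - b.length) > 1) return false;
  let i = 0, j = 0, edits = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { i++; j++; continue; }
    if (++edits > 1) return false;
    if (a.length > b.length) i++;
    else if (a.length < b.length) j++;
    else { i++; j++; }
  }
  return edits + (a.length - i) + (b.length - j) <= 1;
}

// Runs every phase and keeps only the best score, so a clip is never counted twice.
function scoreClip(query: string, clip: Clip): number {
  const q = query.trim().toLowerCase();
  const terms = words(q);
  if (terms.length === 0) return 0;
  const contentWords = words(clip.content);
  const metaWords = words(clip.windowTitle + " " + clip.appName);
  let best = 0;

  if (phraseAtWordStart(clip.content.toLowerCase(), q)) best = Math.max(best, SCORE.exactContent);
  if (terms.every((t) => contentWords.some((w) => w.startsWith(t)))) best = Math.max(best, SCORE.wordStartContent);
  if (terms.every((t) => metaWords.some((w) => w.startsWith(t)))) best = Math.max(best, SCORE.metadata);
  // Typo tolerance only for longer terms, where a one-letter slip is plausible.
  if (terms.every((t) => t.length >= 4 && contentWords.some((w) => withinOneEdit(t, w)))) best = Math.max(best, SCORE.fuzzy);

  return best;
}

// Clips with no match in any phase are dropped; the rest are sorted best-first.
function search(query: string, clips: Clip[]): Clip[] {
  return clips
    .map((clip) => ({ clip, score: scoreClip(query, clip) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.clip);
}
```

The key difference from scattered-letter fuzzy matching is that a clip scoring zero in every phase is excluded outright, which is what keeps the result count small on long entries.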

QueryBeforeAfter
local v989
port 80131–2
lm st711–2
timeout~40~8

"port" no longer matches inside "import" because the search now uses Unicode-aware word boundaries. It knows that the "port" in "import" isn't a word start — and ignores it. Same logic works for Cyrillic, camelCase, and underscores.
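One way to express that kind of boundary check, as a sketch under assumptions: the function names are hypothetical, and the rule set (start of string, after a non-alphanumeric character such as a space or underscore, or at a lowercase-to-uppercase camelCase hump) is an interpretation of the behavior described above. Unicode property escapes make the same rules apply to Cyrillic and other scripts.

```typescript
const LETTER_OR_DIGIT = new RegExp("[\\p{L}\\p{N}]", "u");
const LOWERCASE = new RegExp("\\p{Ll}", "u");
const UPPERCASE = new RegExp("\\p{Lu}", "u");

// True when position i begins a word: start of string, after a character that is
// not a letter or digit (space, underscore, punctuation), or at a camelCase hump.
function isWordStart(text: string, i: number): boolean {
  if (i === 0) return true;
  const prev = text.charAt(i - 1);
  const curr = text.charAt(i);
  if (!LETTER_OR_DIGIT.test(prev)) return true;
  return LOWERCASE.test(prev) && UPPERCASE.test(curr);
}

// Case-insensitive search that only accepts matches beginning at a word start.
function matchesAtWordStart(text: string, term: string): boolean {
  const needle = term.toLowerCase();
  if (needle.length === 0) return false;
  for (let i = 0; i + term.length <= text.length; i++) {
    if (text.slice(i, i + term.length).toLowerCase() === needle && isWordStart(text, i)) {
      return true;
    }
  }
  return false;
}
```

With these rules, "port" is rejected inside "import" but accepted in "port 80", "serverPort", and "server_port".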

And for long clips: the preview now shows where the match is — shifting the visible window to the matching fragment — instead of always showing the first 100 characters where the match might not be visible at all.
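The window shift can be sketched in a few lines. This is an illustrative version, not the app's code: `previewFragment` and its parameters are hypothetical, and centering the match inside a 100-character window is an assumption.

```typescript
// Returns up to `width` characters of `text`, positioned so the match is visible.
// An ellipsis marks each side where text was cut off.
function previewFragment(
  text: string,
  matchStart: number,
  matchLength: number,
  width = 100,
): string {
  if (text.length <= width) return text;

  // Match already fits in the default window: keep showing the start of the clip.
  if (matchStart + matchLength <= width) return text.slice(0, width) + "…";

  // Otherwise center the window on the match, clamped so it never runs past the end.
  const lead = Math.max(0, Math.floor((width - matchLength) / 2));
  const start = Math.min(Math.max(0, matchStart - lead), text.length - width);
  const end = start + width;
  return (start > 0 ? "…" : "") + text.slice(start, end) + (end < text.length ? "…" : "");
}
```

For a 600-character stack trace whose match sits at character 300, this returns the 100 characters around the match instead of the first 100.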

Bug fixes

Source app tracking — after waking from sleep, all new clips would show Beetroot as the source app instead of the actual window you copied from. The clipboard monitor was re-firing the same content as a "new" event, resetting the source. Fixed with content deduplication.
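The idea behind the fix, in sketch form: drop any clipboard event whose content repeats the previous clip, so a re-fired event cannot overwrite the recorded source. `ClipEvent` and `ClipboardDeduplicator` are hypothetical names, and comparing against only the last clip is an assumption; the real monitor may track more state.

```typescript
interface ClipEvent {
  content: string;
  sourceApp: string;
}

class ClipboardDeduplicator {
  private lastContent: string | null = null;

  // Returns the event if it carries new content, or null if it repeats the previous clip.
  accept(event: ClipEvent): ClipEvent | null {
    if (event.content === this.lastContent) return null;
    this.lastContent = event.content;
    return event;
  }
}
```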

JS/TS object literals: { name: "foo", count: 42 } wasn't recognized as code at all (no function or class keyword). Now it is detected properly and highlighted.

How to update

Beetroot will offer to update automatically. Or download v1.5.1 from GitHub.

Max Nardit

@mnardit
