Robotics and Reinforcement Learning tinkerer.
brandonrohrer.org
Wrangler of algorithms.
Eater of bread. Sipper of whisky.
Reports to a Shih Tzu.
Brandon Rohrer
Of course all of this bumping up against limitations is just setup for the next installment where the underdog model gets upgrades and overcomes impossible odds to achieve less mediocre performance.
Sometimes AI scraper bots need a rest. You can give them a peaceful little playground to romp around in. Just drop this URL as a tiny hyperlink somewhere on your blog.
Reviews are flooding in. Bots like this tarpit so much that they never leave.
scienceispoetry.net
Another limitation that surfaced was that the training dataset was laughably undersized. Even doubling it made barely a dent in performance measures.
Representing a 2nd-order model with my 20,000-element dictionary required an 8 trillion-element array. The array size scales as O(n^3) in the dictionary size n. I had to pare the dictionary down to a token alphabet of 1,000 to get it running. With 1/20th the dictionary size, performance went down, even with the higher-order model.
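For a sense of where the 8 trillion comes from, here is a back-of-the-envelope sketch. It assumes a dense array with one entry per (token, token, next token) combination; the function name is made up for illustration and is not from the blog's code.

```python
def transition_table_size(vocab_size: int, order: int) -> int:
    """Entries in a dense transition array for a Markov model of the given order.

    Assumption: one axis per context token plus one axis for the next token,
    so the array has vocab_size ** (order + 1) entries.
    """
    return vocab_size ** (order + 1)


full_size = transition_table_size(20_000, order=2)   # 20,000-token dictionary
pared_size = transition_table_size(1_000, order=2)   # 1,000-token alphabet

print(f"{full_size:,} entries")   # 8,000,000,000,000 entries
print(f"{pared_size:,} entries")  # 1,000,000,000 entries
```

Cutting the dictionary by a factor of 20 shrinks the array by a factor of 20^3 = 8,000, which is what makes the pared-down version fit on a laptop.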
I don't often brag about my company, but today I make an exception. Our CEO, folks:
youtu.be/36jxNeV5L1Q?...
(part of the ongoing Reckless Ben / Bricks and Minifigs / American Fork PD drama)
The earliest Large Language Models were Markov models (a.k.a. Markov chains). They are like LLMs with a context window of 1 token. As you might guess, they are OK at some things but not awesome.
I'm building an Artisanal Language Model, something like an LLM but locally sourced. Home grown. It trains and serves entirely from my laptop.
The task it's built for is proofreading English prose.
Its training corpus (as of the current version) is 10 classic sci-fi novels.
I have two evals in place currently: spelling and capitalization.
A first-order Markov model performs much better than the random baseline (yay!) but has a lot of room for improvement. Watch this space for future iterations.
Table shows model performance as (precision %) recall %.
Details in the blog: brandonrohrer.org/alms_fomm.html
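A minimal sketch of what "context window of 1 token" means in practice. This is a toy version under my own assumptions (whitespace tokens, plain counting, hypothetical function names), not the implementation from the blog post.

```python
from collections import Counter, defaultdict


def train_first_order(tokens):
    """Count how often each token follows each single preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probability(counts, prev, nxt):
    """Estimate P(nxt | prev) from the counts; 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


tokens = "the cat sat on the mat".split()
model = train_first_order(tokens)

p_cat = next_token_probability(model, "the", "cat")  # "the" is followed by "cat" once, "mat" once
p_dog = next_token_probability(model, "the", "dog")  # never observed
print(p_cat, p_dog)  # 0.5 0.0
```

A proofreader built on this could flag tokens whose probability given the previous token is very low, though how the actual model turns probabilities into corrections is covered in the blog, not here.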
Old school language models, part 2: Second-order Markov Models
This is the next stop in my Artisanal Language Model development tour. If a first-order Markov model has a context length of 1, then a second-order model has a context length of 2.
brandonrohrer.org/alms_somm.html
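To make the jump from context length 1 to 2 concrete, here is a toy sketch. It keys the counts on pairs of tokens and stores them sparsely in a dictionary, which is my own simplification for illustration; the function names are hypothetical and the blog's version may be organized differently.

```python
from collections import Counter, defaultdict


def train_second_order(tokens):
    """Count how often each token follows each pair of preceding tokens."""
    counts = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        counts[(a, b)][c] += 1
    return counts


def most_likely_next(counts, context):
    """Return the most frequent follower of a two-token context, or None if unseen."""
    followers = counts.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


tokens = "the cat sat on the mat and the cat sat".split()
model2 = train_second_order(tokens)

guess = most_likely_next(model2, ("the", "cat"))
print(guess)  # sat
```

The only change from a first-order model is that the lookup key is a pair instead of a single token. The cost is that the number of possible contexts grows with the square of the dictionary size.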