Another round of "chainsaw coding" with Claude. 🔥 ( the generators now use generators!) 💡 ( Previously, Claude's approach was to just make one question without history. But we do want history. Use all the canned questions from text files (once), then generate questions from local data (without repeating), then ask the LLM.)
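Roughly what the chained generators look like, as a Python sketch. The names here (`canned_dir`, `local_facts`, `ask_llm`) are placeholders, not the actual interfaces:

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator

def question_stream(
    canned_dir: Path,
    local_facts: Iterable[str],
    ask_llm: Callable[[list[str]], str],
) -> Iterator[str]:
    """Yield canned questions once, then questions built from local data
    (no repeats), then fall back to the LLM with the history so far."""
    asked: list[str] = []
    seen: set[str] = set()

    # 1. Canned questions from text files, each used exactly once.
    for path in sorted(canned_dir.glob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            question = line.strip()
            if question and question not in seen:
                seen.add(question)
                asked.append(question)
                yield question

    # 2. Questions generated from local data, skipping repeats.
    for fact in local_facts:
        question = f"What is the opposite of {fact!r}?"  # placeholder template
        if question not in seen:
            seen.add(question)
            asked.append(question)
            yield question

    # 3. Ask the LLM, passing the history so it can avoid duplicates.
    while True:
        question = ask_llm(asked)
        asked.append(question)
        yield question
```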
It mostly works. The LLM generation is slightly flaky: it gave both "sad" and "melancholy" as candidate opposites for "happy", and it sometimes includes pinyin with the Chinese translations.
But at least it is (mostly) using the correct interfaces now. 💡 ( there are still some useless parameters. The "tags" for questions are meaningless. The difficulty is arbitrary. The evaluation criteria are often silly.)
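Something like this is what I picture the question interface looking like. The field names are guesses for illustration; only the tags, difficulty, and evaluation criteria come from the note above:

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    # Illustrative only; not the real schema.
    prompt: str
    answer: str
    tags: list[str] = field(default_factory=list)  # currently meaningless
    difficulty: int = 1                            # currently arbitrary
    evaluation_criteria: str = ""                  # often silly
```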
The various "general knowledge" benchmarks should be quick to write 🔥 ( tomorrow). Right now I see six categories for the initial tests:
- History
- Geography
- Chemistry
- Biology
- Sports 🔥 ( American athletes, mostly)
- Music 💡 ( 20th century English-language music, mostly)
These will be "easy" questions. Or, at least, multiple-choice.
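A hedged sketch of what one "easy" multiple-choice item might look like, using the guessed fields from above plus a category and choices (all hypothetical):

```python
# Hypothetical multiple-choice item for the Geography category.
example_item = {
    "category": "Geography",
    "prompt": "Which river flows through Paris?",
    "choices": ["Seine", "Thames", "Danube", "Rhine"],
    "answer": "Seine",
    "difficulty": 1,  # "easy"
    "tags": ["geography"],
    "evaluation_criteria": "Exact match against the listed choices.",
}
```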