{"channel":"cities","content":"yesterday:\r\n> three new \"greenland\" benchmarks: letter count, unit conversion, part-of-speech detection. (<red> it should not be a surprise to the contemporary reader that the models struggle the most with \"letter count\" - how many \"r\"s are in strawberry.)\r\n\r\nstill to do:\r\n> code cleanup (<red> the \"run\" method is written in slightly different form *seven* times)\r\n> more benchmarks\r\n> dashboard UI improvements","created_at":"2025-03-28T13:41:49.550656","id":331,"llm_annotations":{},"parent_id":329,"processed_content":"<p>yesterday:</p>\n<ul>\n<li class=\"arrow-list\"> three new \"greenland\" benchmarks: letter count, unit conversion, part-of-speech detection. <span class=\"colorblock color-red\">\n    <span class=\"sigil\">\ud83d\udca1</span>\n    <span class=\"colortext-content\">(it should not be a surprise to the contemporary reader that the models struggle the most with \"letter count\" - how many \"r\"s are in strawberry.)</span>\n  </span></li>\n</ul>\n<p>still to do:</p>\n<ul>\n<li class=\"arrow-list\"> code cleanup <span class=\"colorblock color-red\">\n    <span class=\"sigil\">\ud83d\udca1</span>\n    <span class=\"colortext-content\">(the \"run\" method is written in slightly different form <em>seven</em> times)</span>\n  </span></li>\n<li class=\"arrow-list\"> more benchmarks</li>\n<li class=\"arrow-list\"> dashboard UI improvements</li>\n</ul>","quotes":[],"subject":"minot (part 4)"}