{"channel":"cities","content":"In Lithuanian, the term for \"Lithuanian language\" is << Lietuvi\u0173 kalba >>, for \"Lithuanian man\" is << lietuvis >>, and for \"Lithuanian woman\" is << lietuv\u0117 >>.\r\n\r\nThese are all the same word in English, << Lithuanian >>.  Here, we have separate files (<xantham> or \"grammatical form\") for \"nationality\" and \"language\".\r\n\r\n<red> with 200 nationalities and 400 languages, it is fine to have a \"special case\" for each.  I am expecting 100-200 \"special cases\", including 10-20 that are catch-alls.\r\n\r\n----\r\n\r\nThe thought of having \"Derivative Forms\" shared between languages is, unfortunately, impossible.\r\n\r\nSo, the \"lemma\" form (infinitive, etc.) will have to be translated.  This, still, is difficult.  Is it << la piscine >> or just << piscine >>?  Or, for that matter, \"pool\" or \"swimming pool\" (or << pool (swimming) >>).\r\n\r\nBut, then, there will be separate derivative forms for \"walks\", \"walked\", etc., all of which are monolingual.  For << marchons >>, << marchez >>, etc. it will be a separate set.\r\n\r\n----\r\n\r\nFor conjunctions, prepositions, etc., I still want to call them \"grammatical words\".\r\n\r\nBecause a word-for-word \"translation\" is too perilous to attempt.\r\n\r\n--MORE--\r\n\r\nIn practical terms, for verbs, Greenland output will have a Python dict like:\r\n\r\n<<<\r\nV0023 = {\r\n\"guid\": \"V0023\",\r\n\"english\": \"to eat\",\r\n\"lithuanian\": \"valgyti\",\r\n\"english_forms\": {\"first-person-singular-present\": \"I eat\", ...},\r\n\"lithuanian_forms\": {\"first-person-singular-present\": \"A\u0161 valgau\", ...}, ...\r\n>>>\r\n\r\nThis is similar enough to the current Trakaido format:\r\n\r\n<<< \r\n\"valgyti\": {\r\n    \"english\": \"to eat\",\r\n    \"present_tense\": {\r\n      \"1s\": {\"english\": \"I eat\", \"lithuanian\": \"a\u0161 valgau\"},\r\n      \"2s\": {\"english\": \"you(s.) 
eat\", \"lithuanian\": \"tu valgai\"},\r\n      \"3s-m\": {\"english\": \"he eats\", \"lithuanian\": \"jis valgo\"}, ...\r\n>>>\r\n\r\nFor both, it will request \"valgyti / first-person-singular-present\" as a \"flashcard\".\r\n\r\n----\r\n\r\nThe new approach will simplify some of the \"more animals\" style groups.\r\n\r\nAll the animals will be in a single \"dictionary\" file.  The first 12 will be in \"Animals 1\", the next 18 in \"Animals 2\", etc.\r\n\r\n<red> this will *probably* come with a reworking of the \"corpus\" system.  We mostly want \"levels\" now anyway ... and \"decoy sets\" can be configured separately.","created_at":"2025-07-24T21:52:37.965355","id":636,"llm_annotations":{},"parent_id":635,"processed_content":"<p>In Lithuanian, the term for \"Lithuanian language\" is <span class=\"literal-text\">Lietuvi\u0173 kalba</span>, for \"Lithuanian man\" is <span class=\"literal-text\">lietuvis</span>, and for \"Lithuanian woman\" is <span class=\"literal-text\">lietuv\u0117</span>.\r</p>\n<p>These are all the same word in English, <span class=\"literal-text\">Lithuanian</span>.  Here, we have separate files <span class=\"colorblock color-xantham\"><span class=\"sigil\">\ud83d\udd25</span><span class=\"colortext-content\"> or \"grammatical form\"</span></span> for \"nationality\" and \"language\".\r</p>\n<p><span class=\"colorblock color-red\"><span class=\"sigil\">\ud83d\udca1</span><span class=\"colortext-content\"> with 200 nationalities and 400 languages, it is fine to have a \"special case\" for each.  I am expecting 100-200 \"special cases\", including 10-20 that are catch-alls.\r</span></span></p>\n<hr class=\"section-break\" />\n<p>The thought of having \"Derivative Forms\" shared between languages is, unfortunately, impossible.\r</p>\n<p>So, the \"lemma\" form (infinitive, etc.) will have to be translated.  This, still, is difficult.  Is it <span class=\"literal-text\">la piscine</span> or just <span class=\"literal-text\">piscine</span>?  
Or, for that matter, \"pool\" or \"swimming pool\" (or <span class=\"literal-text\">pool (swimming)</span>).\r</p>\n<p>But, then, there will be separate derivative forms for \"walks\", \"walked\", etc., all of which are monolingual.  For <span class=\"literal-text\">marchons</span>, <span class=\"literal-text\">marchez</span>, etc. it will be a separate set.\r</p>\n<hr class=\"section-break\" />\n<p>For conjunctions, prepositions, etc., I still want to call them \"grammatical words\".\r</p>\n<p>Because a word-for-word \"translation\" is too perilous to attempt.\r</p>\n<div class=\"content-sigil\" aria-label=\"Extended content begins here\">&#9135;&#9135;&#9135;&#9135;&#9135;</div>\n<p>In practical terms, for verbs, Greenland output will have a Python dict like:\r</p>\n<div class=\"mlq\"><button type=\"button\" class=\"mlq-collapse\" aria-label=\"Toggle visibility\"><span class=\"mlq-collapse-icon\">-</span></button><div class=\"mlq-content\"><p>V0023 = {\r</p>\n<p>\"guid\": \"V0023\",\r</p>\n<p>\"english\": \"to eat\",\r</p>\n<p>\"lithuanian\": \"valgyti\",\r</p>\n<p>\"english_forms\": {\"first-person-singular-present\": \"I eat\", ...},\r</p>\n<p>\"lithuanian_forms\": {\"first-person-singular-present\": \"A\u0161 valgau\", ...}, ...\r</p></div></div>\n<p>This is similar enough to the current Trakaido format:\r</p>\n<div class=\"mlq\"><button type=\"button\" class=\"mlq-collapse\" aria-label=\"Toggle visibility\"><span class=\"mlq-collapse-icon\">-</span></button><div class=\"mlq-content\"><p>\"valgyti\": {\r</p>\n<p>    \"english\": \"to eat\",\r</p>\n<p>    \"present_tense\": {\r</p>\n<p>      \"1s\": {\"english\": \"I eat\", \"lithuanian\": \"a\u0161 valgau\"},\r</p>\n<p>      \"2s\": {\"english\": \"you(s.) 
eat\", \"lithuanian\": \"tu valgai\"},\r</p>\n<p>      \"3s-m\": {\"english\": \"he eats\", \"lithuanian\": \"jis valgo\"}, ...\r</p></div></div>\n<p>For both, it will request \"valgyti / first-person-singular-present\" as a \"flashcard\".\r</p>\n<hr class=\"section-break\" />\n<p>The new approach will simplify some of the \"more animals\" style groups.\r</p>\n<p>All the animals will be in a single \"dictionary\" file.  The first 12 will be in \"Animals 1\", the next 18 in \"Animals 2\", etc.\r</p>\n<p><span class=\"colorblock color-red\"><span class=\"sigil\">\ud83d\udca1</span><span class=\"colortext-content\"> this will <em>probably</em> come with a reworking of the \"corpus\" system.  We mostly want \"levels\" now anyway ... and \"decoy sets\" can be configured separately.</span></span></p>","quotes":[],"subject":"lake andes, part 2"}
