LeetCode 820 單詞的壓縮編碼

Mondo 科技更新 2024-01-30

給定乙個單詞列表，我們將列表編碼為索引字串 s 和索引列表 a。

例如，如果此列表為 ["time", "me", "bell"]，我們可以將其表示為 s ="time#bell#"和索引 = [0， 2， 5]。

對於每個索引，我們可以從索引在字串 s 中的位置開始讀取字串，直到"#"以恢復我們之前的單詞列表。

那麼，成功編碼給定單詞列表的最小字串長度是多少？

示例：輸入：words = [.]"time", "me", "bell"]

輸出： 10

描述： s ="time#bell#" ， indexes = [0, 2, 5] 。

提示：1 < = 單詞length <= 2000

1 <= words[i].length <= 7

每個單詞都是小寫字母。

看完問題後，阿里位元組發現，如果列表中的每個單詞都顛倒過來，那就是字尾樹問題。例如：["time", "me", "bell"]相反的順序後跟 ["emit", "em", "lleb"我們要求的結果僅此而已"emit"+ 的長度"llem"+ 的長度"##"（em 和 emit 有乙個共同的字首，只需計算乙個）。

因此，直觀的想法是使用字首樹+反向插入的形式來模擬字尾樹。

下面的**看起來很複雜，但我將這個模板用於許多主題，並且我可以對細節進行一些調整。

字首樹的主要 API 如下：

insert(word)：插入單詞search(word)：查詢單詞是否存在startwith(word)：找出是否有以word為字首的單詞，其中startwith是字首樹的核心用法，其名稱字首樹由此而來。您可以從 208 個問題開始，在嘗試其他問題之前熟悉字首樹。

字首樹如下所示：

如圖所示，每個節點儲存乙個字元，然後新增乙個控制資訊來指示是否為單詞的結尾，實際使用過程可能略有不同，但變化不大。

這個問題需要考慮邊緣情況，例如，這個列表是["time", "time", "me", "bell"在這種情況下，對於重複元素，這裡我使用雜湊集來刪除重複資料。

字首樹已去除重複資料。

class trie: def __init__(self): """ initialize your data structure here. """ self.trie = {}def insert(self, word): """ inserts a word into the trie. :type word: str :rtype: void """ curr = self.trie for w in word: if w not in curr: curr[w] = {}curr = curr[w] curr['#'] = 1 def search(self, word): """ returns if the word is in the trie. :type word: str :rtype: bool """ curr = self.trie for w in word: curr = curr[w] # len(curr) == 1 means we meet '## when we search 'em'(which reversed from 'me') # the result is len(curr) >1 # cause the curr look like } return len(curr) == 1class solution: def minimumlengthencoding(self, words: list[str]) int: trie = trie() cnt = 0 words = set(words) for word in words: trie.insert(word[::1]) for word in words: if trie.search(word[::1]):cnt += len(word) +1 return cnt

複雜性分析時間複雜度：$o（n）$，其中n為字長列表中的總字元數，如["time", "me"]，即 4 + 2 = 6。空間複雜度：$o（n）$，其中 n 是字長列表中的字元總數，例如 ["time", "me"]，即 4 + 2 = 6。

LeetCode 820 單詞的壓縮編碼

相關問題答案

如何快速記憶單詞及其含義

一種記憶單詞的有效方法

一種記憶單詞的有效方法

654個單詞的邏輯推理上下文在記憶組中滾動擴充套件記憶

英文單詞deadline的中文字義，如何搭配和如何使用，給大家一些例句的用法