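【Question title】: Getting a list of words from a Trie
【Posted】: 2011-02-17 04:09:11
【Question description】:

Using the code below, instead of checking whether a matching word exists in the Trie, I would like to return a list of all the words that start with the prefix the user entered. Can someone point me in the right direction? I can't get it to work at all...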

public boolean search(String s)
{
    Node current = root;
    System.out.println("\nSearching for string: "+s);

    while(current != null)
    {
        for(int i=0;i<s.length();i++)
        {               
            if(current.child[(int)(s.charAt(i)-'a')] == null)
            {
                System.out.println("Cannot find string: "+s);
                return false;
            }
            else
            {
                current = current.child[(int)(s.charAt(i)-'a')];
                System.out.println("Found character: "+ current.content);
            }
        }
        // If we are here, the string exists.
        // But to ensure unwanted substrings are not found:

        if (current.marker == true)
        {
            System.out.println("Found string: "+s);
            return true;
        }
        else
        {
            System.out.println("Cannot find string: "+s +"(only present as a substring)");
            return false;
        }
    }

    return false; 
}

}

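【Discussion】:

  • It would help if you defined which part isn't working; consider stating the input and the expected output, and then show us the actual output.
  • It's not that the code above doesn't work; it does its job fine, which is to report whether a string is found in the trie. What I'd like is, for example, to let the user type "th" and have the method above (modified) return all the words in the trie that start with "th".
  • @Woot4Moo: No, trie
  • Nothing like an ordered tree. Pronounced "try"?
  • That's right, it's pronounced "try", but it comes from the word re"trie"val.

Tags: java trie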


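【Solution 1】:

I ran into this problem while trying to build a text auto-complete module. I solved it by building a Trie in which each node holds a reference to its parent as well as to its children. First, I search for the node that the input prefix leads to. Then I run a traversal that explores every node of the subtree rooted at that prefix node. Whenever a node marked as the end of a word is encountered (the isLeaf flag in the code, which is set on the last character of every inserted word, not only on structural leaves), a word starting with the input prefix has been found. From that node I walk up through the parents until I reach the root of the subtree, pushing each node's key onto a stack along the way. Finally, I take the prefix and append to it by popping the stack, and I save each resulting word in an ArrayList. At the end of the traversal I have all the words that start with the input prefix. Here is the code with a usage example: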

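import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.Stack;

class TrieNode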
{
    char c;
    TrieNode parent;
    HashMap<Character, TrieNode> children = new HashMap<Character, TrieNode>();
    boolean isLeaf;

    public TrieNode() {}
    public TrieNode(char c){this.c = c;}
}

-

public class Trie
{
    private TrieNode root;
    ArrayList<String> words; 
    TrieNode prefixRoot;
    String curPrefix;

    public Trie()
    {
        root = new TrieNode();
        words  = new ArrayList<String>();
    }

    // Inserts a word into the trie.
    public void insert(String word) 
    {
        HashMap<Character, TrieNode> children = root.children;

        TrieNode crntparent;

        crntparent = root;

        //cur children parent = root

        for(int i=0; i<word.length(); i++)
        {
            char c = word.charAt(i);

            TrieNode t;
            if(children.containsKey(c)){ t = children.get(c);}
            else
            {
            t = new TrieNode(c);
            t.parent = crntparent;
            children.put(c, t);
            }

            children = t.children;
            crntparent = t;

            //set leaf node
            if(i==word.length()-1)
                t.isLeaf = true;    
        }
    }

    // Returns if the word is in the trie.
    public boolean search(String word)
    {
        TrieNode t = searchNode(word);
        if(t != null && t.isLeaf){return true;}
        else{return false;}
    }

    // Returns if there is any word in the trie
    // that starts with the given prefix.
    public boolean startsWith(String prefix) 
    {
        if(searchNode(prefix) == null) {return false;}
        else{return true;}
    }

    public TrieNode searchNode(String str)
    {
        Map<Character, TrieNode> children = root.children; 
        TrieNode t = null;
        for(int i=0; i<str.length(); i++)
        {
            char c = str.charAt(i);
            if(children.containsKey(c))
            {
                t = children.get(c);
                children = t.children;
            }
            else{return null;}
        }

        prefixRoot = t;
        curPrefix = str;
        words.clear();
        return t;
    }


    ///////////////////////////


  void wordsFinderTraversal(TrieNode node, int offset) 
  {
        //  print(node, offset);

        if(node.isLeaf==true)
        {
          //println("leaf node found");

          TrieNode altair;
          altair = node;

          Stack<String> hstack = new Stack<String>(); 

          while(altair != prefixRoot)
          {
            //println(altair.c);
            hstack.push( Character.toString(altair.c) );
            altair = altair.parent;
          }

          String wrd = curPrefix;

          while(hstack.empty()==false)
          {
            wrd = wrd + hstack.pop();
          }

          //println(wrd);
          words.add(wrd);

        }

         Set<Character> kset = node.children.keySet();
         //println(node.c); println(node.isLeaf);println(kset);
         Iterator itr = kset.iterator();
         ArrayList<Character> aloc = new ArrayList<Character>();

       while(itr.hasNext())
       {
        Character ch = (Character)itr.next();  
        aloc.add(ch);
        //println(ch);
       } 

     // here you can play with the order of the children

       for( int i=0;i<aloc.size();i++)
       {
        wordsFinderTraversal(node.children.get(aloc.get(i)), offset + 2);
       } 

  }


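 void displayFoundWords()
 {
   // println(...) only exists in Processing; plain Java needs System.out.println
   System.out.println("_______________");
  for(int i=0;i<words.size();i++)
  {
    System.out.println(words.get(i));
  } 
  System.out.println("________________");

 }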



}//

Example

Trie prefixTree;

prefixTree = new Trie();  

  prefixTree.insert("GOING");
  prefixTree.insert("GONG");
  prefixTree.insert("PAKISTAN");
  prefixTree.insert("SHANGHAI");
  prefixTree.insert("GONDAL");
  prefixTree.insert("GODAY");
  prefixTree.insert("GODZILLA");

  if( prefixTree.startsWith("GO")==true)
  {
    TrieNode tn = prefixTree.searchNode("GO");
    prefixTree.wordsFinderTraversal(tn,0);
    prefixTree.displayFoundWords(); 

  }

  if( prefixTree.startsWith("GOD")==true)
  {
    TrieNode tn = prefixTree.searchNode("GOD");
    prefixTree.wordsFinderTraversal(tn,0);
    prefixTree.displayFoundWords(); 

  }

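【Discussion】:

    【Solution 2】:

    The simplest solution is to use depth-first search.

    Walk down the trie, matching the input letter by letter. Once you have no more letters to match, everything under that node is a string you want. Recursively explore that whole subtrie, building up the string as you descend to its nodes.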

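    【Discussion】:

    • How do I build the string while recursively exploring all the nodes? I'm new to Java, so my first thought was to modify an object by reference, but that isn't possible in Java. The only other way I can think of is to use a global variable.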
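On the question in the comment above: a Java method cannot rebind the caller's variable, but it can mutate an object it was handed, so no global is needed. The following is only a minimal sketch under assumptions; the class and method names (PrefixCollector, wordsWithPrefix, collect) are hypothetical and do not come from any answer in this thread. It passes one StringBuilder and one List down the recursion and undoes each appended character on the way back up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: all names here are illustrative, not taken from the thread.
class PrefixCollector {
    static class Node {
        // TreeMap keeps children sorted, so results come out alphabetically.
        Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.isWord = true;
    }

    List<String> wordsWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        Node cur = root;
        for (char c : prefix.toCharArray()) {
            cur = cur.children.get(c);
            if (cur == null) return out; // no stored word has this prefix
        }
        collect(cur, new StringBuilder(prefix), out);
        return out;
    }

    // 'path' and 'out' are shared objects: every recursive call mutates the same instances.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isWord) out.add(path.toString());
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());            // step down one character
            collect(e.getValue(), path, out);
            path.setLength(path.length() - 1);  // step back up (backtrack)
        }
    }
}
```

After inserting "the", "then", "this" and "tea", calling wordsWithPrefix("th") on this sketch returns the three words starting with "th" and leaves out "tea".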
    【Solution 3】:

    Once the Trie is built, you can run a DFS starting from the node where the prefix was found:

    Here node is a Trie node, word is the word found so far, and res is the list of words.
    
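    END = "$"  # assumed end-of-word key; adapt it to however your Trie marks word ends

    def dfs(self, node, word, res):
        # Assumes each node is a dict mapping a character to its child dict.
        # When the current node ends a word, add the word built so far to the list.
        # Do not return here: a longer word may continue below this node.
        if END in node:
            res.append(word)
        # Go deeper, DFS fashion, appending each child's character to the current word.
        for ch, child in node.items():
            if ch != END:
                self.dfs(child, word + ch, res)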
    

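    【Discussion】:

      【Solution 4】:

      In my opinion this is easier to solve recursively. It would go something like this:

      1. Write a recursive function Print that prints all the nodes in the tree rooted at the node you pass as a parameter. The wiki tells you how to do this (look at the part on sorting).
      2. Find the last character of the prefix, and the node labeled with that character, by walking down from the root of the trie. Call the Print function with that node as the argument. Then just make sure you also output the prefix before each word, since this will give you all the words without their prefix.

      If you don't care much about efficiency, you can simply run Print on the main root node and print only the words that start with the prefix you are interested in. This is easier to implement but slower.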
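The simpler but slower variant from the last paragraph can be sketched as follows. This is an illustration under assumptions, not the answerer's code: the node layout (a 26-slot child array plus a marker flag, lowercase 'a' to 'z' only) is borrowed from the question, and the names FilterAllWords, allWords and wordsWithPrefixSlow are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the Node layout mirrors the question's child[]/marker fields.
class FilterAllWords {
    static class Node {
        Node[] child = new Node[26]; // assumes lowercase 'a'..'z' only
        boolean marker;              // true when a word ends at this node
    }

    static void insert(Node root, String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = word.charAt(i) - 'a';
            if (cur.child[idx] == null) cur.child[idx] = new Node();
            cur = cur.child[idx];
        }
        cur.marker = true;
    }

    // Collects every word stored below 'node'; 'soFar' is the text spelled on the way down.
    static void allWords(Node node, String soFar, List<String> out) {
        if (node.marker) out.add(soFar);
        for (int i = 0; i < node.child.length; i++) {
            if (node.child[i] != null) {
                allWords(node.child[i], soFar + (char) ('a' + i), out);
            }
        }
    }

    // Simple but slow: walk the whole trie, then keep only the matching words.
    static List<String> wordsWithPrefixSlow(Node root, String prefix) {
        List<String> all = new ArrayList<>();
        allWords(root, "", all);
        List<String> matches = new ArrayList<>();
        for (String w : all) {
            if (w.startsWith(prefix)) matches.add(w);
        }
        return matches;
    }
}
```

The cost is a visit to every node in the trie on each query, regardless of the prefix; starting allWords from the prefix node instead (step 2 above) avoids that.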

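      【Discussion】:

        【Solution 5】:

        You need to traverse the subtree starting from the node you found for the prefix.

        Start the same way, i.e. find the right node. Then, instead of checking its marker, traverse that subtree (i.e. visit all of its descendants; DFS is a good way to do it), keeping track of the substring used to get from the first node to the "current" node.

        If the current node is marked as a word, output* the prefix + the substring reached.

        * Or add it to a list or something.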
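The traversal described above does not have to be recursive. Below is a hedged sketch of an iterative DFS with an explicit stack, where each stack entry stores a node together with the substring used to reach it from the prefix node. The node layout (child array and marker) follows the question; the names IterativePrefixSearch and Frame are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; only the child[]/marker layout comes from the question.
class IterativePrefixSearch {
    static class Node {
        Node[] child = new Node[26]; // assumes lowercase 'a'..'z' only
        boolean marker;              // true when a word ends at this node
    }

    // Pairs a node with the substring used to reach it from the prefix node.
    static class Frame {
        final Node node;
        final String suffix;
        Frame(Node node, String suffix) { this.node = node; this.suffix = suffix; }
    }

    static void insert(Node root, String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = word.charAt(i) - 'a';
            if (cur.child[idx] == null) cur.child[idx] = new Node();
            cur = cur.child[idx];
        }
        cur.marker = true;
    }

    static List<String> wordsWithPrefix(Node root, String prefix) {
        List<String> out = new ArrayList<>();
        Node cur = root;
        // Same walk as the question's search(): follow the prefix letter by letter.
        for (int i = 0; i < prefix.length(); i++) {
            cur = cur.child[prefix.charAt(i) - 'a'];
            if (cur == null) return out; // prefix not present
        }
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(cur, ""));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.node.marker) out.add(prefix + f.suffix);
            // Push children in reverse so that the 'a' branch is popped first.
            for (int i = f.node.child.length - 1; i >= 0; i--) {
                if (f.node.child[i] != null) {
                    stack.push(new Frame(f.node.child[i], f.suffix + (char) ('a' + i)));
                }
            }
        }
        return out;
    }
}
```

An explicit stack trades a little bookkeeping for independence from the call-stack depth, which only matters for very long words; for typical dictionaries the recursive form is just as usable.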

        【Discussion】:

          【Solution 6】:

          I built a trie once for one of ITA's puzzles:

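          import java.util.ArrayList;
          import java.util.Collections;
          import java.util.LinkedList;
          import java.util.List;

          public class WordTree {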
          
          
          class Node {
          
              private final char ch;
          
              /**
               * Flag indicates that this node is the end of the string.
               */
              private boolean end;
          
              private LinkedList<Node> children;
          
              public Node(char ch) {
                  this.ch = ch;
              }
          
              public void addChild(Node node) {
                  if (children == null) {
                      children = new LinkedList<Node>();
                  }
                  children.add(node);
              }
          
              public Node getNode(char ch) {
                  if (children == null) {
                      return null;
                  }
                  for (Node child : children) {
                      if (child.getChar() == ch) {
                          return child;
                      }
                  }
                  return null;
              }
          
              public char getChar() {
                  return ch;
              }
          
              public List<Node> getChildren() {
                  if (this.children == null) {
                      return Collections.emptyList();
                  }
                  return children;
              }
          
              public boolean isEnd() {
                  return end;
              }
          
              public void setEnd(boolean end) {
                  this.end = end;
              }
          }
          
          
          Node root = new Node(' ');
          
          public WordTree() {
          }
          
          /**
           * Searches for strings that match the prefix.
           *
           * @param prefix - prefix
           * @return - list of strings that match the prefix, or an empty list if no matches are found.
           */
          public List<String> getWordsForPrefix(String prefix) {
              if (prefix.length() == 0) {
                  return Collections.emptyList();
              }
              Node node = getNodeForPrefix(root, prefix);
              if (node == null) {
                  return Collections.emptyList();
              }
              List<LinkedList<Character>> chars = collectChars(node);
              List<String> words = new ArrayList<String>(chars.size());
              for (LinkedList<Character> charList : chars) {
                  words.add(combine(prefix.substring(0, prefix.length() - 1), charList));
              }
              return words;
          }
          
          
          private String combine(String prefix, List<Character> charList) {
              StringBuilder sb = new StringBuilder(prefix);
              for (Character character : charList) {
                  sb.append(character);
              }
              return sb.toString();
          }
          
          
          private Node getNodeForPrefix(Node node, String prefix) {
              if (prefix.length() == 0) {
                  return node;
              }
              Node next = node.getNode(prefix.charAt(0));
              if (next == null) {
                  return null;
              }
              return getNodeForPrefix(next, prefix.substring(1, prefix.length()));
          }
          
          
          private List<LinkedList<Character>> collectChars(Node node) {
              List<LinkedList<Character>> chars = new ArrayList<LinkedList<Character>>();
          
              if (node.getChildren().size() == 0) {
                  chars.add(new LinkedList<Character>(Collections.singletonList(node.getChar())));
              } else {
                  if (node.isEnd()) {
                      chars.add(new LinkedList<Character>(
                      Collections.singletonList(node.getChar())));
                  }
                  List<Node> children = node.getChildren();
                  for (Node child : children) {
                      List<LinkedList<Character>> childList = collectChars(child);
                      for (LinkedList<Character> characters : childList) {
                          characters.push(node.getChar());
                          chars.add(characters);
                      }
                  }
              }
              return chars;
          }
          
          
          public void addWord(String word) {
              addWord(root, word);
          }
          
          private void addWord(Node parent, String word) {
              if (word.trim().length() == 0) {
                  return;
              }
              Node child = parent.getNode(word.charAt(0));
              if (child == null) {
                  child = new Node(word.charAt(0));
                  parent.addChild(child);
              }
              if (word.length() == 1) {
                  child.setEnd(true);
              } else {
                  addWord(child, word.substring(1, word.length()));
              }
          }
          
          
          public static void main(String[] args) {
              WordTree tree = new WordTree();
              tree.addWord("world");
              tree.addWord("work");
              tree.addWord("wolf");
              tree.addWord("life");
              tree.addWord("love");
              System.out.println(tree.getWordsForPrefix("wo"));
          }
          

          }

          【Discussion】:

            【Solution 7】:

            You need to use a List:
            List<String> myList = new ArrayList<String>();
            if(matchingStringFound)
            myList.add(stringToAdd);

            【Discussion】:

              【Solution 8】:

              After your for loop, add a call to printAllStringsInTrie(current, s);

              void printAllStringsInTrie(Node t, String prefix) {
                // 'marker' is the end-of-word flag on the question's Node
                if (t.marker) System.out.println(prefix);
                for (int i = 0; i < t.child.length; i++) {
                  if (t.child[i] != null) {
                    // cast back to char: 'a' + i on its own is an int and would append a number
                    printAllStringsInTrie(t.child[i], prefix + (char) ('a' + i));
                  }
                }
              }
              

              【Discussion】:

                【Solution 9】:

                The recursive code below can be used when your TrieNode looks like this. This code works fine.

                TrieNode(char c)
                {
                
                        this.con=c;
                        this.isEnd=false;
                        list=new ArrayList<TrieNode>();
                        count=0;
                
                }
                
                //--------------------------------------------------
                
                public void Print(TrieNode root1, ArrayList<Character> path)
                {
                
                      if(root1==null)
                          return;
                
                      if(root1.isEnd==true)
                      {
                          //print the entire path
                          ListIterator<Character> itr1=path.listIterator();
                          while(itr1.hasNext())
                          {
                              System.out.print(itr1.next());
                          }
                          System.out.println();
                          // no return here: a longer word may continue below this node
                      }
                
                      // visit every child, extending the path on the way down
                      // and removing the last character again on the way back up
                      ListIterator<TrieNode> itr=root1.list.listIterator();
                      while(itr.hasNext())
                      {
                          TrieNode child=itr.next();
                          path.add(child.con);
                          Print(child,path);
                          path.remove(path.size()-1);
                      }
                }

                【Discussion】:

                  【Solution 10】:

                  A simple recursive DFS algorithm can be used to find all the words for a given prefix.

                  Example Trie node:

                  static class TrieNode {
                      Map<Character, TrieNode> children = new HashMap<>();
                      boolean isWord = false;
                  }
                  

                  Method to find all the words for a given prefix:

                  static List<String> findAllWordsForPrefix(String prefix, TrieNode root) {
                      List<String> words = new ArrayList<>();
                      TrieNode current = root;
                      for(Character c: prefix.toCharArray()) {
                          TrieNode nextNode = current.children.get(c);
                          if(nextNode == null) return words;
                          current = nextNode;
                      }
                      if(!current.children.isEmpty()) {
                          findAllWordsForPrefixRecursively(prefix, current, words);
                      } else {
                          if(current.isWord) words.add(prefix);
                      }
                      return words;
                  }
                  
                  static void findAllWordsForPrefixRecursively(String prefix, TrieNode node, List<String> words) {
                      if(node.isWord) words.add(prefix);
                      if(node.children.isEmpty()) {
                          return;
                      }
                      for(Character c: node.children.keySet()) {
                          findAllWordsForPrefixRecursively(prefix + c, node.children.get(c), words);
                      }
                  }
                  

                  The complete code can be found here: TrieDataStructure Example

                  【Discussion】:
