
29 Counting Anagrams

2016-01-23 20:33

Preface

Some of the figures and ideas in this post come from 剑指offer or Programming Pearls (编程珠玑).

Problem description

Approach

Anagram: if one word can be rearranged into another word by swapping the positions of its characters, the two words are anagrams of each other.

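Idea: every group of anagrams shares one property: its words are made of the same characters, each occurring the same number of times. That is our way in. We define a label for each word: the string obtained by sorting all of the word's characters.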

For example, the label of "dfs" is "dfs", and the label of "sfd" is also "dfs".
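As a minimal sketch of the label described above (the class name `AnagramLabel` and the helper `label` are hypothetical names chosen for this illustration), sorting the characters of a word gives the same string for every word in an anagram group:

```java
import java.util.Arrays;

class AnagramLabel {

    // Hypothetical helper: the label of a word is its characters in sorted order.
    static String label(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(label("dfs")); // prints "dfs"
        System.out.println(label("sfd")); // prints "dfs"
    }
}
```

Two words are anagrams of each other exactly when their labels are equal, so the label can serve directly as a map key.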

We then iterate over the words in the input and put words with the same label into the same container. See the code for the details.
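The grouping step can be sketched on a small in-memory list before reading anything from a file. This is only an illustration under assumptions: the class `AnagramGroups`, the method `group`, and the sample words are made up for this sketch, and it uses `Map.computeIfAbsent` (Java 8+) with a `TreeSet` so that each group prints in a stable order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AnagramGroups {

    // Hypothetical helper: the label is the word's characters in sorted order.
    static String label(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Put every word into the set that belongs to its label.
    static Map<String, Set<String>> group(List<String> words) {
        Map<String, Set<String>> dict = new HashMap<>();
        for (String word : words) {
            // Create the set the first time a label is seen, then add the word.
            dict.computeIfAbsent(label(word), key -> new TreeSet<>()).add(word);
        }
        return dict;
    }

    public static void main(String[] args) {
        // Sample input, made up for this sketch.
        List<String> words = Arrays.asList("pots", "stop", "tops", "dog", "god", "cat");
        group(words).forEach((key, members) -> System.out.println(key + " -> " + members));
    }
}
```

Because the map is a `HashMap`, the order in which the groups are printed is not specified; only the contents of each group are.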

Reference code

[code]/**
 * file name : Test27IsAnagram.java
 * created at : 9:12:40 PM Jun 11, 2015
 * created by 970655147
 */

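package com.hx.test05;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;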

public class Test27IsAnagram {

    // Find the anagram groups in the input file
    public static void main(String []args) {

        String path = System.getProperty("user.dir") + "\\tmp\\Test27IsAnagram.txt";

        try {

            resolveAnagram(path);

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    // Initial capacity of each anagram set
    static int INITIAL_SIZE = 4;

    // Find the anagrams: compute a label for every word, then add the word to dict.
    // The key of each entry is the word's characters in sorted order;
    // the value is the set of words that share that key.
    public static void resolveAnagram(String path) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(path)) );
        Map<String, HashSet<String>> dict = new HashMap<String, HashSet<String>>();

        try {
            String line = null;
            while((line = br.readLine()) != null) {
                // skip blank lines
                if(line.trim().length() == 0) {
                    continue ;
                }

                line = prepare(line);
                String[] words = line.split("\\s+");
                for(String word : words) {
                    // split yields an empty first element when the line starts with whitespace
                    if(word.length() == 0) {
                        continue ;
                    }
                    putWordIntoDict(word, dict);
                }
            }
        } finally {
            if(br != null) {
                br.close();
            }
        }

        // print each label together with its anagram set
        // (System.out is used here because the original Log helper is not part of this listing)
        for(Map.Entry<String, HashSet<String>> entry : dict.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue() );
        }
    }

    // Put word into dict:
    // compute the label of word first, then add word to the set stored under that label
    private static void putWordIntoDict(String word, Map<String, HashSet<String>> dict) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        String label = new String(chars);
        HashSet<String> tuple = dict.get(label);
        if(tuple == null) {
            tuple = new HashSet<String>(INITIAL_SIZE);
            dict.put(label, tuple);
        }
        tuple.add(word);
    }

    // Constants
    static char COMMA = ',';
    static char DOT = '.';
    static char QUES = '?';
    static char INV_REF = '\'';
    static char LINK = '-';
    static char SPACE = ' ';
    static char A = 'A', Z = 'Z';
    // Setting bit 5 (0x20) turns an ASCII uppercase letter into its lowercase form
    static int CAPITAL_MASK = 1 << 5;

    // Preprocess line: replace ",", "." and "?" with spaces, and convert uppercase letters to lowercase
    private static String prepare(String line) {
        char[] chars = line.toCharArray();
        for(int i=0; i<chars.length; i++) {
            if(chars[i] == COMMA || chars[i] == DOT || chars[i] == QUES) {
                chars[i] = SPACE;
            }
//          if(chars[i] == INV_REF) {
//              chars[i] = LINK;
//          }
            if(chars[i] >= A && chars[i] <= Z) {
                chars[i] |= CAPITAL_MASK;
            }
        }

        return new String(chars);
    }

}

Screenshot of the result

Summary

This example shows how important it is to analyze the properties of the data when designing an algorithm.

Below is an image from Programming Pearls (编程珠玑) that analyzes, for this same problem, the approach of permuting the letters of each word.

Note: my knowledge is limited, so there are bound to be some bugs; please point them out!
