Core Java Techniques: Groovy Programming, Counting Word Frequencies
Published: 2013/2/28 14:33:32  Source: 中华考试网  Editor: jack

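  Fields such as search engines and speech recognition often need to count how frequently each word occurs. The Groovy implementation below prints the six most frequent words in a text together with their occurrence counts: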

  def content = """
  The Java Collections API is the basis for all the nice support that Groovy gives you
  through lists and maps. In fact, Groovy not only uses the same abstractions, it
  even works on the very same classes that make up the Java Collections API.
  """

  // Split on whitespace; punctuation stays attached, so "API" and "API." are distinct tokens
  def words = content.tokenize()

  // Map each word to its number of occurrences
  def wordFrequency = [:]
  words.each {
      wordFrequency[it] = wordFrequency.get(it, 0) + 1
  }

  // Sort the distinct words in place by ascending frequency
  def wordList = wordFrequency.keySet().toList()
  wordList.sort { wordFrequency[it] }

  // Walk the last six entries backwards, i.e. the six most frequent words
  def result = ''
  wordList[-1..-6].each {
      result += it.padLeft(12) + " : " + wordFrequency[it] + "\n"
  }
  println result

  Output (the relative order of words with equal counts may differ depending on the environment):

           the : 5
        Groovy : 2
          that : 2
   Collections : 2
          Java : 2
          same : 2

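  If the text to be processed is more complex, for example when punctuation or letter case must be normalized, regular expressions can do the job. Incidentally, Groovy supports regular expressions at the language level!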
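
  As a rough illustration of the regex suggestion, the sketch below reuses the same sample text but extracts words with a pattern instead of tokenize(). It assumes that a "word" is simply a run of ASCII letters and that case should be ignored; both are choices made for this example rather than anything the original article specifies. The slashy string /[a-z]+/ and the findAll method are standard Groovy, and the variable names ranked and top are introduced here for illustration only.

```groovy
def content = """
The Java Collections API is the basis for all the nice support that Groovy gives you
through lists and maps. In fact, Groovy not only uses the same abstractions, it
even works on the very same classes that make up the Java Collections API.
"""

// Assumption: a word is a run of letters; lower-casing merges "The" and "the"
def words = content.toLowerCase().findAll(/[a-z]+/)

// Same counting idiom as in the article
def wordFrequency = [:]
words.each {
    wordFrequency[it] = wordFrequency.get(it, 0) + 1
}

// Sort the map entries by descending count and keep the first six
def ranked = wordFrequency.sort { a, b -> b.value <=> a.value }
def top = ranked.take(6)

top.each { word, count ->
    println "${word.padLeft(12)} : ${count}"
}
```

  With this normalization "The" and "the" are counted together, and "API" and "API." collapse into a single entry, so the counts differ slightly from the whitespace-only version above.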
