Apache Lucene 4.4 Released (Search Engine Framework)

Published: 2013-07-24 09:02:50 | Source: 红联 | Author: empast

Apache Lucene 4.4 has been released, with many bug fixes, optimizations, and enhancements. Download:

http://lucene.apache.org/core/mirrors-core-latest-redir.html

Notable improvements:

* A new Replicator module for replicating index revisions.
* A new AnalyzingInfixSuggester.
* A new PatternCaptureGroupTokenFilter: emits multiple tokens, one for each capture group in one or more Java regexes.
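
The capture-group idea can be illustrated with plain `java.util.regex`, without any Lucene classes. This is only a sketch of the concept under stated assumptions: `CaptureGroupTokensSketch` and `captureGroupTokens` are hypothetical names, and the real filter operates on a token stream and may differ in details such as ordering and handling of the original token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the Lucene PatternCaptureGroupTokenFilter itself.
final class CaptureGroupTokensSketch {

    /** Returns one token per non-empty capture group, across all patterns and matches. */
    static List<String> captureGroupTokens(String input, Pattern... patterns) {
        List<String> tokens = new ArrayList<>();
        for (Pattern pattern : patterns) {
            Matcher matcher = pattern.matcher(input);
            while (matcher.find()) {
                // Group 0 is the whole match; capture groups start at 1.
                for (int g = 1; g <= matcher.groupCount(); g++) {
                    String group = matcher.group(g);
                    if (group != null && !group.isEmpty()) {
                        tokens.add(group);
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern emailParts = Pattern.compile("(\\w+)@(\\w+)\\.(\\w+)");
        System.out.println(captureGroupTokens("john@example.com", emailParts));
        // prints [john, example, com]
    }
}
```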

* New Lucene Facet module features:
  * Added dynamic numeric range faceting, which uses no taxonomy index (see http://blog.mikemccandless.com/2013/05/dynamic-faceting-with-lucene.html).
  * Arbitrary Query instances are now allowed for per-dimension drill-down on DrillDownQuery and DrillSideways, to support future dynamic faceting.
  * New FacetResult.mergeHierarchies: merges multiple FacetResult instances of the same dimension into a single one with the reconstructed hierarchy.
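
Dynamic numeric range faceting amounts to counting, at query time, how many matching documents have a value inside each requested range, with no precomputed taxonomy. The sketch below shows that counting step in plain Java. It assumes the per-document values are already available as a `long[]`; `RangeFacetSketch`, `Range`, and `countRanges` are hypothetical names, not the Lucene facet API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of query-time range counting; not the Lucene facet API.
final class RangeFacetSketch {

    /** A labeled half-open range [minInclusive, maxExclusive). */
    static final class Range {
        final String label;
        final long minInclusive;
        final long maxExclusive;

        Range(String label, long minInclusive, long maxExclusive) {
            this.label = label;
            this.minInclusive = minInclusive;
            this.maxExclusive = maxExclusive;
        }

        boolean accepts(long value) {
            return value >= minInclusive && value < maxExclusive;
        }
    }

    /** Counts how many values fall into each range; ranges may overlap. */
    static Map<String, Integer> countRanges(long[] matchingValues, Range... ranges) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Range range : ranges) {
            counts.put(range.label, 0);
        }
        for (long value : matchingValues) {
            for (Range range : ranges) {
                if (range.accepts(value)) {
                    counts.merge(range.label, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] prices = {5, 12, 48, 90, 130};
        System.out.println(countRanges(prices,
                new Range("under 10", 0, 10),
                new Range("under 50", 0, 50),
                new Range("under 100", 0, 100)));
        // prints {under 10=1, under 50=3, under 100=4}
    }
}
```

Because the ranges are supplied with the query, they can change from one request to the next (for example "past hour" versus "past day") without reindexing.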

* FST's Builder can now handle more than 2.1 billion "tail nodes" while building a minimal FST.

* FieldCache Ints and Longs now use bit-packing to save memory. String fields have more efficient compression if there are many unique terms.
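
To show why bit-packing saves memory, here is a minimal standalone sketch: if every value fits in a few bits, many values can share one 64-bit word. It assumes non-negative `int` values and at most 31 bits per value; `BitPackingSketch` and its methods are hypothetical and are not Lucene's own PackedInts code.

```java
// Hypothetical illustration of bit-packing; not Lucene's PackedInts implementation.
final class BitPackingSketch {

    /** Number of bits needed to represent any value in [0, maxValue]. */
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Packs non-negative values into 64-bit blocks, bitsPerValue bits each. */
    static long[] pack(int[] values, int bitsPerValue) {
        long totalBits = (long) values.length * bitsPerValue;
        long[] blocks = new long[(int) ((totalBits + 63) / 64)];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int offset = (int) (bitPos & 63);
            long value = values[i];
            blocks[block] |= value << offset;
            // Bits that do not fit in this block spill into the next one.
            int spill = offset + bitsPerValue - 64;
            if (spill > 0) {
                blocks[block + 1] |= value >>> (bitsPerValue - spill);
            }
        }
        return blocks;
    }

    /** Reads back the value stored at the given index. */
    static int get(long[] blocks, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long value = blocks[block] >>> offset;
        int spill = offset + bitsPerValue - 64;
        if (spill > 0) {
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return (int) (value & ((1L << bitsPerValue) - 1));
    }

    public static void main(String[] args) {
        int[] values = new int[100];
        for (int i = 0; i < values.length; i++) {
            values[i] = i % 8; // every value fits in 3 bits
        }
        int bits = bitsRequired(7);
        long[] packed = pack(values, bits);
        // 100 values * 3 bits = 300 bits, which fit in 5 longs (40 bytes)
        // instead of 100 ints (400 bytes).
        System.out.println(bits + " bits per value, " + packed.length + " longs");
        // prints 3 bits per value, 5 longs
    }
}
```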

* Improved compression for NumericDocValues for dates and fields with very small numbers of unique values.
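
One technique that fits date-like values is to subtract the minimum and divide by the greatest common divisor, so that timestamps stored at day granularity shrink to small integers. The sketch below illustrates that idea only. Whether it matches what Lucene does internally is an assumption; `GcdCompressionSketch`, `Encoded`, `encode`, and `decode` are hypothetical names, and the code assumes a non-empty input whose differences from the minimum do not overflow a `long`.

```java
// Hypothetical illustration of min/GCD encoding; not Lucene's docvalues codec.
final class GcdCompressionSketch {

    /** Encoded form: value[i] == min + quotients[i] * gcd. */
    static final class Encoded {
        final long min;
        final long gcd;
        final long[] quotients;

        Encoded(long min, long gcd, long[] quotients) {
            this.min = min;
            this.gcd = gcd;
            this.quotients = quotients;
        }
    }

    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /** Assumes values is non-empty and (value - min) does not overflow. */
    static Encoded encode(long[] values) {
        long min = values[0];
        for (long v : values) {
            min = Math.min(min, v);
        }
        long g = 0;
        for (long v : values) {
            g = gcd(g, v - min);
        }
        if (g == 0) {
            g = 1; // all values are equal
        }
        long[] quotients = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            quotients[i] = (values[i] - min) / g;
        }
        return new Encoded(min, g, quotients);
    }

    static long decode(Encoded encoded, int index) {
        return encoded.min + encoded.quotients[index] * encoded.gcd;
    }

    public static void main(String[] args) {
        // Millisecond timestamps at midnight UTC: 2013-07-24, one day later, three days later.
        long[] dates = {1374624000000L, 1374710400000L, 1374883200000L};
        Encoded e = encode(dates);
        System.out.println(e.gcd + " " + java.util.Arrays.toString(e.quotients));
        // prints 86400000 [0, 1, 3]
    }
}
```

The quotients here need only two bits each, which is where the saving would come from once they are bit-packed.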

* New IndexWriter.hasUncommittedChanges(): returns true if there are changes that have not been committed.

* multiValuedSeparator in PostingsHighlighter is now configurable, for cases where you want a different logical separator between field values.

* NorwegianLightStemFilter and NorwegianMinimalStemFilter have been extended to handle "nynorsk".

* New ScandinavianFoldingFilter and ScandinavianNormalizationFilter.

* Easier compressed norms: Lucene42NormsFormat now takes an overhead parameter, allowing for values other than PackedInts.FASTEST.

* Analyzer now has an additional tokenStream(String fieldName, String text) method, so wrapping by StringReader for common use is no longer needed.

* New SimpleMergedSegmentWarmer: just ensures that data structures (terms, norms, docvalues, etc.) are initialized.

* IndexWriter flushes segments to the compound file format by default.

* Various bugfixes and optimizations since the 4.3.1 release.

For the complete list, see CHANGES.txt in the downloaded archive.

Via: 开源中国社区 (OSChina community)