Introduction to The Solr Enterprise Search Server
2007-04-02 16:31
Solr in a Nutshell
Solr is a standalone enterprise search server with a web-services like API. You put documents in it (called "indexing") via XML over HTTP. You query it via HTTP GET and receive XML results.
Advanced Full-Text Search Capabilities
Optimized for High Volume Web Traffic
Standards Based Open Interfaces - XML and HTTP
Comprehensive HTML Administration Interfaces
Scalability - Efficient Replication to other Solr Search Servers
Flexible and Adaptable with XML configuration
Extensible Plugin Architecture
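The index-then-query round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the base URL `http://localhost:8983/solr`, the `/update` and `/select` paths, and the helper names `build_add_xml`, `build_select_url`, and `post_xml` are assumptions made for the example and may differ in a given installation.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed base URL of a local Solr instance; adjust for your deployment.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_xml(doc):
    """Serialize a dict of field name -> value into an <add><doc> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10):
    """Build the HTTP GET URL for a query (paths here are assumptions)."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    return f"{SOLR_BASE}/select?{params}"


def post_xml(body):
    """POST an XML message to the update handler and return the raw response."""
    req = urllib.request.Request(
        f"{SOLR_BASE}/update",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Building the messages needs no server; sending them does.
    print(build_add_xml({"id": "doc1", "title": "Hello Solr"}))
    print(build_select_url("title:hello"))
```

With a server running, one would call `post_xml(build_add_xml(...))`, then `post_xml("<commit/>")` to make the document searchable, and finally fetch the URL returned by `build_select_url`.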
Solr Uses the Lucene Search Library and Extends it!
A Real Data Schema, with Dynamic Fields, Unique Keys
Powerful Extensions to the Lucene Query Language
Support for Dynamic Result Grouping and Filtering
Advanced, Configurable Text Analysis
Highly Configurable and User Extensible Caching
Performance Optimizations
External Configuration via XML
An Administration Interface
Monitorable Logging
Fast Incremental Updates and Snapshot Distribution
Detailed Features
Schema
Defines the field types and fields of documents
Can drive more intelligent processing
Declarative Lucene Analyzer specification
Dynamic Fields enable on-the-fly addition of fields
CopyField functionality allows indexing a single field multiple ways, or combining multiple fields into a single searchable field
Explicit types eliminate the need for guessing types of fields
External file-based configuration of stopword lists, synonym lists, and protected word lists
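To make the dynamic field and CopyField ideas concrete, here is a small Python model of how a schema might resolve a field name to a type and fan a source field out to a combined search field. It is only a sketch of the concept, not Solr's implementation: the field names, the patterns (`*_i`, `*_s`, `*_t`), and the helpers `resolve_type` and `apply_copy_fields` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical schema: explicit fields, glob-style dynamic fields, copy rules.
FIELDS = {"id": "string", "title": "text"}
DYNAMIC_FIELDS = [("*_i", "int"), ("*_s", "string"), ("*_t", "text")]
COPY_FIELDS = [("title", "all_text_t"), ("author_s", "all_text_t")]  # (source, dest)


def resolve_type(name):
    """Return the type of a field: explicit fields win, then dynamic patterns."""
    if name in FIELDS:
        return FIELDS[name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    raise KeyError(f"no field or dynamic field matches {name!r}")


def apply_copy_fields(doc):
    """Return a multi-valued copy of doc with copy rules applied."""
    out = {name: [value] for name, value in doc.items()}
    for source, dest in COPY_FIELDS:
        if source in doc:
            out.setdefault(dest, []).append(doc[source])
    return out


if __name__ == "__main__":
    print(resolve_type("price_i"))
    print(apply_copy_fields({"id": "1", "title": "Solr", "author_s": "Yonik"}))
```

Because types are declared rather than guessed, a field such as `price_i` is known to be an integer the first time it appears, without any change to the schema.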
Query
HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby)
Highlighted context snippets
Faceted Searching based on field values and explicit queries
Sort specifications added to query language
Constant scoring range and prefix queries - no idf, coord, or lengthNorm factors, and no restriction on the number of terms the query matches.
Function Query - influence the score by a function of a field's numeric value or ordinal
Performance Optimizations
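The query features above are all driven by request parameters on the HTTP interface. The sketch below assembles such a request in Python. The parameter names (`wt`, `sort`, `facet`, `facet.field`, `facet.query`, `hl`, `hl.fl`) follow commonly documented Solr request parameters, but which of them a given release accepts is an assumption here (some early releases expected the sort specification inside `q` rather than in a separate parameter). The function name `build_query_url` and the field names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base, q, facet_fields=(), facet_queries=(),
                    highlight_fields=(), sort=None, wt="json"):
    """Assemble a select URL with optional sorting, faceting and highlighting."""
    params = [("q", q), ("wt", wt)]  # wt selects the response format
    if sort:
        params.append(("sort", sort))
    if facet_fields or facet_queries:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]    # facet on field values
        params += [("facet.query", fq) for fq in facet_queries]  # facet on explicit queries
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    url = build_query_url(
        "http://localhost:8983/solr",  # assumed local instance
        "ipod",
        facet_fields=["cat", "inStock"],
        facet_queries=["price:[0 TO 100]"],
        highlight_fields=["name", "features"],
        sort="price desc",
    )
    print(url)
    print(parse_qs(urlsplit(url).query))  # decoded view of the parameters
```

Repeating `facet.field` once per field, rather than joining the names, keeps each facet independent in the request.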
Core
Pluggable query handlers and extensible XML data format
Document uniqueness enforcement based on unique key field
Batches updates and deletes for high performance
User configurable commands triggered on index changes
Searcher concurrency control
Correct handling of numeric types for both sorting and range queries
Ability to control where docs with the sort field missing will be placed
Support for dynamic grouping of search results
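Two of the points above, numeric-aware sorting and the placement of documents that lack the sort field, are easy to show in miniature. The Python sketch below models the behaviour only; it is not how Solr implements it, and the function `sort_docs` and its `missing_last` flag are hypothetical names (Solr's schema exposes comparable options such as `sortMissingLast` and `sortMissingFirst` on field types).

```python
def sort_docs(docs, field, reverse=False, missing_last=True):
    """Sort dicts by a numeric field, placing docs without it first or last."""
    present = [d for d in docs if d.get(field) is not None]
    missing = [d for d in docs if d.get(field) is None]
    # Compare as numbers: as strings, "10" would sort before "9".
    present.sort(key=lambda d: float(d[field]), reverse=reverse)
    return present + missing if missing_last else missing + present


if __name__ == "__main__":
    docs = [
        {"id": "a", "price": "10"},
        {"id": "b"},               # no price at all
        {"id": "c", "price": "9"},
    ]
    print([d["id"] for d in sort_docs(docs, "price")])
    print([d["id"] for d in sort_docs(docs, "price", missing_last=False)])
```

Keeping the missing documents out of the comparison means their position does not flip when the sort direction is reversed.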
Caching
Configurable Query Result, Filter, and Document cache instances
Pluggable Cache implementations
Cache warming in background
When a new searcher is opened, configurable searches are run against it in order to warm it up to avoid slow first hits. During warming, the current searcher handles live requests.
Autowarming in background
The most recently accessed items in the caches of the current searcher are re-populated in the new searcher, enabling high cache hit rates across index/searcher changes.
Fast/small filter implementation
User level caching with autowarming support
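The autowarming idea can be sketched with a small LRU cache in Python. This is a conceptual model under stated assumptions, not Solr's cache code: the class `LRUCache`, the function `autowarm`, and the `regenerate` callback (which stands in for re-running a cached query against the new searcher) are all hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache; the newest entries sit at the end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def most_recent_keys(self, count):
        """Return up to `count` most recently used keys, oldest of them first."""
        keys = list(self._items)
        return keys[-count:] if count > 0 else []


def autowarm(old_cache, new_cache, regenerate, count):
    """Re-populate new_cache with the hottest keys of old_cache.

    Values are recomputed via `regenerate`, since results cached against
    the old index may no longer be valid for the new one.
    """
    for key in old_cache.most_recent_keys(count):
        new_cache.put(key, regenerate(key))


if __name__ == "__main__":
    old = LRUCache(3)
    for q in ("a", "b", "c"):
        old.put(q, f"old results for {q}")
    old.get("a")  # "a" becomes the most recently used entry
    new = LRUCache(3)
    autowarm(old, new, lambda q: f"new results for {q}", count=2)
    print(new.get("a"), "|", new.get("c"), "|", new.get("b"))
```

In this model the old cache keeps serving requests while `autowarm` fills the new one, which mirrors the point that the current searcher handles live traffic during warming.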
Replication
Efficient distribution of index parts that have changed via rsync transport
Pull strategy allows for easy addition of searchers
Configurable distribution interval allows tradeoff between timeliness and cache utilization
Admin Interface
Comprehensive statistics on cache utilization, updates, and queries
Text analysis debugger, showing result of every stage in an analyzer
Web Query Interface with debugging output
Parsed query output
Lucene explain() document score detailing
Score explanations for documents outside of the requested range, to debug why a given document wasn't ranked higher