August 26, 2015
Quick Set-Up Guide for Applications in Solr-5
Solr is a standalone enterprise search server from the Apache Lucene project. Documents are sent to it for indexing as JSON, XML, CSV, or binary over HTTP. Queries are made through a REST-like HTTP API, and results can be returned in formats such as JSON, XML, CSV, and binary.
Solr offers an easy-to-use platform on which applications can be installed and run in a few minutes.
Installing Solr
Prerequisites
Java 7 or greater should be installed.
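One way to confirm this prerequisite is to inspect the version string printed by "java -version". The sketch below is not part of the Solr distribution; the helper name java_major is hypothetical, and it assumes the common version-string formats ("1.7.0_80" for the older scheme, "11.0.2" for the newer one).

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from a version string.
# "1.7.0_80" -> 7, "1.8.0_45" -> 8, "11.0.2" -> 11
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;   # legacy "1.x" numbering
    *)   echo "${1%%[!0-9]*}" ;;          # newer numbering: leading digits
  esac
}

# Read the version from the installed JVM, if any.
# Note that "java -version" writes to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  if [ -n "$raw" ] && [ "$(java_major "$raw")" -ge 7 ]; then
    echo "Java $raw is new enough for Solr 5"
  else
    echo "Java 7 or greater is required (found: ${raw:-unknown})"
  fi
else
  echo "java not found on PATH"
fi
```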
Downloading Solr-5
Download and unpack the latest Solr release from the Apache download mirrors.
Unzipping
Begin by unzipping the Solr archive. Below is an example using a UNIX shell:
/:$ ls solr*
solr-5.1.0.zip
/:$ unzip -q solr-5.1.0.zip
/:$ cd solr-5.1.0/
Starting Solr
To launch Solr, run the "bin/solr start" command below:
/solr-5.1.0:$ bin/solr start
Waiting to see Solr listening on port 8983 [/]
Started Solr server on port 8983 (pid=3208). Happy searching with SOLR!
To confirm that Solr is running, load the Solr Admin UI in a web browser at http://localhost:8983/solr/. This is the main starting point for administering Solr.
To start Solr with a different port, use the code below:
/solr-5.1.0:$ bin/solr start -p 8984
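After starting Solr on a custom port, a script may want to wait until the server actually answers before indexing anything. The following is a minimal sketch under stated assumptions: the helper names solr_base_url and wait_for_solr are hypothetical, the host defaults to localhost, and the check simply requests the system info endpoint with curl. Running "bin/solr status" is another way to see which Solr instances are up.

```shell
#!/bin/sh
# Hypothetical helper: base URL of a Solr instance for a given port and host.
solr_base_url() {
  echo "http://${2:-localhost}:${1:-8983}/solr"
}

# Hypothetical helper: poll once per second until Solr answers,
# giving up after a number of attempts (default 30).
wait_for_solr() {
  port=${1:-8983}
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    if curl -sf -o /dev/null "$(solr_base_url "$port")/admin/info/system?wt=json"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

solr_base_url 8984   # prints http://localhost:8984/solr
```

A possible use is "bin/solr start -p 8984 && wait_for_solr 8984", which returns a non-zero status if the server never responds.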
Indexing Data
Once the Solr server is running, it needs data to index. Solr supports indexing structured content in a variety of incoming formats such as XML, JSON, and CSV, with SolrXML being the predominant format. XML files can be posted using the “bin/post” command, which sends an HTTP POST request to the Solr update endpoint.
Content posted with “bin/post” is indexed by Solr automatically:
bin/post -c [coreName] [path]
For example:
bin/post -c gettingstarted example/exampledocs/*.xml
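For readers who have not seen SolrXML, the sketch below writes a very small document file that could be passed to “bin/post”. The field names ("id", "name"), their values, the file location, and the update_url helper are all illustrative assumptions; the fields must exist in, or be accepted by, the schema of the target core.

```shell
#!/bin/sh
# Write a minimal SolrXML file; the fields and values are made up for illustration.
doc_file="${TMPDIR:-/tmp}/sample-doc.xml"
cat > "$doc_file" <<'EOF'
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="name">james</field>
  </doc>
</add>
EOF

# Hypothetical helper: update endpoint of a core on a default local install.
update_url() {
  echo "http://localhost:8983/solr/$1/update?commit=true"
}

# With a running Solr, either of these would index the file:
#   bin/post -c gettingstarted "$doc_file"
#   curl "$(update_url gettingstarted)" -H 'Content-Type: text/xml' --data-binary @"$doc_file"
echo "wrote $doc_file"
```

The curl form is roughly what “bin/post” does for XML input: it sends the file body in an HTTP POST to the core's update handler and asks for a commit.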
More Indexing Techniques include:
- Import records from a database using the Data Import Handler (DIH)
- Use SolrJ from JVM-based languages or other Solr clients to programmatically create documents to send to Solr
- Use the Admin UI core-specific Documents tab to paste in a document to be indexed, or select Document Builder from the Document Type drop-down to build a document one field at a time. Click the Submit Document button below the form to index the document
Core
Create one or more cores to allow for individualized indexing and searching. Because each core has its own configuration and index, data with different structures can be indexed on the same server.
The following command can be used to create a core, which can later be used to add documents and start searching:
bin/solr create -c <name>
The above command automatically creates a properties file inside the core folder (i.e. server/solr/<corename>/core.properties). This file can contain the following properties:
Property | Description |
name | The name of the SolrCore |
config | The configuration file name for a given core. If not defined, the default value for this field is "solrconfig.xml" |
schema | The schema file name for a given core. If not defined, the default value for this field is "schema.xml" |
dataDir | The data directory of the core where indexes are stored. The path of dataDir is relative to the path of core's instanceDir. If not defined, the default value is "data" |
configSet | The name of a defined Configuration Set, if desired, to use to configure the core |
properties | The name of the properties file for this core. The value can be an absolute pathname or a path relative to the value of instanceDir |
transient | If true, the core can be unloaded if Solr reaches the transientCacheSize. The default value, if not defined, is "false". Cores are unloaded in order from the least recently used to the most recently used |
loadOnStartup | If true (the default when not specified), the core is loaded when Solr starts |
coreNodeName | Used only in SolrCloud, this is a unique identifier for the node hosting this replica. By default, a coreNodeName is generated automatically, but setting this attribute explicitly allows the user to manually assign a new core to replace an existing replica. For example: when replacing a machine that has had a hardware failure by restoring from backups on a new machine that has a new hostname or port |
ulogDir | The absolute or relative directory for the update log for this core (SolrCloud) |
shard | The shard assigned to this core (SolrCloud) |
collection | The name of the collection of which this core is a part (SolrCloud) |
roles | Future parameter for SolrCloud, or a way for users to mark nodes for their own use |
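To make the table concrete, the sketch below writes an example core.properties file using a few of the properties listed above and reads one value back. The core name "mycore", the file location, and the get_prop helper are hypothetical; a file generated by "bin/solr create" will typically contain fewer entries, since unspecified properties fall back to their defaults.

```shell
#!/bin/sh
# Write an illustrative core.properties; "mycore" is a placeholder name.
props_file="${TMPDIR:-/tmp}/core.properties"
cat > "$props_file" <<'EOF'
name=mycore
config=solrconfig.xml
schema=schema.xml
dataDir=data
transient=false
loadOnStartup=true
EOF

# Hypothetical helper: print the value of one key from a properties file.
get_prop() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

get_prop "$props_file" dataDir   # prints: data
```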
Adding "Documents"
Use the Solr "bin/post" command below to add documents to the core:
bin/post -c <corename> <path>
Adding "Documents" from Admin Console
Documents can also be posted through the core-specific Documents tab of the Solr Admin UI, which provides options to post content according to its content type, along with other properties.
Updating Data
The schema.xml file specifies a uniqueKey field called “id”. Whenever a document is added with an id value that already exists in the index, Solr automatically replaces the existing document with the new one. This can be seen by comparing the values of numDocs and maxDoc in the core-specific Overview section of the Solr Admin UI: numDocs counts the searchable documents, while maxDoc may be larger because it also counts replaced documents that have not yet been purged from the index.
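The same two counters can also be read from the command line through the core status request of the CoreAdmin API. The sketch below is a rough illustration only: the helper names core_status_url and json_int are hypothetical, the sample response fragment is made up, and the sed-based extraction assumes the field appears once in the input (a real JSON parser is a safer choice).

```shell
#!/bin/sh
# Hypothetical helper: CoreAdmin STATUS URL for one core on a default local install.
core_status_url() {
  echo "http://localhost:8983/solr/admin/cores?action=STATUS&core=$1&wt=json"
}

# Hypothetical helper: pull an integer field (e.g. numDocs) out of a JSON string.
json_int() {
  printf '%s' "$1" | tr -d ' \n' | sed -n "s/.*\"$2\":\([0-9][0-9]*\).*/\1/p"
}

# Made-up response fragment: one document was replaced but not yet purged.
sample='{"index": {"numDocs": 32, "maxDoc": 33}}'
json_int "$sample" numDocs   # prints 32
json_int "$sample" maxDoc    # prints 33

# With a running Solr, the live response could be fetched with:
#   curl -s "$(core_status_url gettingstarted)"
```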
Deleting Data
Delete data by posting a “delete” command to the update URL and specifying the value of the document's unique key field, or a query that matches multiple documents. The command given below allows for a specific document to be deleted:
bin/post -c <corename> -d "<delete><id>id1</id></delete>"
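The delete-by-query form mentioned above uses a query element in place of the id element. The sketch below builds both message shapes; the helper names are hypothetical, and the escaping covers only the three characters that are special in XML text content.

```shell
#!/bin/sh
# Escape &, < and > so a value can be placed inside an XML element.
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Hypothetical helpers that print the two kinds of delete message.
delete_by_id() {
  printf '<delete><id>%s</id></delete>\n' "$(xml_escape "$1")"
}
delete_by_query() {
  printf '<delete><query>%s</query></delete>\n' "$(xml_escape "$1")"
}

delete_by_id id1                # prints <delete><id>id1</id></delete>
delete_by_query 'name:james'    # prints <delete><query>name:james</query></delete>
```

With a running Solr, the output can be passed straight to the command shown above, for example: bin/post -c <corename> -d "$(delete_by_query 'name:james')". A query of *:* would match, and therefore delete, every document in the core.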
Searching
Queries can be sent to Solr with any HTTP client, such as a REST client, cURL, wget, a browser like Chrome, or Postman; native client libraries are also available for many programming languages.
The Solr Admin UI includes a query builder interface, located in the Query tab of the “gettingstarted” core. If the Execute Query button is clicked without anything in the form being changed, the first 10 documents are returned in JSON format (*:* in the q param matches all documents).
To use cURL, give the same URL in quotes on the curl command line:
curl "http://localhost:8983/solr/gettingstarted/select?q=*%3A*&wt=json&indent=true"
To search for a term, give it as the ‘q’ param value in the core-specific Solr Admin UI Query section, replacing *:* with the term to find (e.g. “name:james”).
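When the same term search is issued with cURL, the query has to be URL-encoded, as the %3A in the earlier command shows. The sketch below builds such a URL in plain POSIX shell; the helper names urlencode and select_url are hypothetical, the host and port assume a default local install, and the encoder handles ASCII input only.

```shell
#!/bin/sh
# Percent-encode every character except unreserved ones (ASCII input assumed).
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
)

# Hypothetical helper: select URL for a core and a query string.
select_url() {
  echo "http://localhost:8983/solr/$1/select?q=$(urlencode "$2")&wt=json&indent=true"
}

select_url gettingstarted 'name:james'
# prints http://localhost:8983/solr/gettingstarted/select?q=name%3Ajames&wt=json&indent=true
```

The result can be passed to curl in quotes, for example: curl "$(select_url gettingstarted 'name:james')". Alternatively, curl can do the encoding itself when called with -G and one --data-urlencode option per parameter.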
Shutdown
The following command can be used to stop the Solr server:
bin/solr stop -all
Solr-5 Resources
For more information on Solr, use the following resources:
- Solr Reference Guide (ensure the version of the reference guide matches with the version of Solr in use)
- Additional Resources
- Sample basic application with one core added