I spent a couple of years using Apache Lucene and Solr at a search and data company.
These were the aspects of the indexing process that mattered to us:
- Mode of submitting documents
-- Copying files to a location
-- Via URL (SOAP/REST POST)
-- Via APIs; the more the better, so that multiple applications can submit seamlessly
- Scale, frequency and scalability of ingesting the submitted documents
-- Some sources are light, while others can put a heavy load on the indexing mechanism.
-- A horizontally scaling architecture is imperative so you can scale out as your load grows.
- Commit overhead, frequency and configuration
A commit makes newly indexed documents available to searches, so it is very important to tune it to your needs. How quickly do you want results to reflect newly indexed documents?
The versions and products we were using had a large overhead during commit, so we committed once a day, and sometimes on demand as needed.
Of all the above, the most challenging, and hence the most important for me, was the commit overhead.
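To make the commit trade-off concrete, here is a minimal sketch of a client-side batching policy: documents are buffered and committed only when either a document count or an age limit is reached. This is not what we ran in production; the class name `CommitBatcher`, the `commit_fn` callback and both thresholds are hypothetical and chosen only for illustration. Solr exposes comparable server-side settings (`autoCommit` with `maxDocs`/`maxTime`, and `commitWithin` on update requests), which would usually be the first thing to try.

```python
import time
from typing import Callable, Dict, List, Optional


class CommitBatcher:
    """Buffers documents and commits when a size or age threshold is hit.

    Hypothetical helper for illustration; `commit_fn` stands in for whatever
    actually sends the batch to the index and issues the commit.
    """

    def __init__(
        self,
        commit_fn: Callable[[List[Dict]], None],
        max_docs: int = 1000,
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._commit_fn = commit_fn
        self._max_docs = max_docs
        self._max_age = max_age_seconds
        self._clock = clock
        self._pending: List[Dict] = []
        self._oldest: Optional[float] = None  # arrival time of oldest pending doc

    def add(self, doc: Dict) -> bool:
        """Buffer one document; return True if this call triggered a commit."""
        if not self._pending:
            self._oldest = self._clock()
        self._pending.append(doc)
        too_many = len(self._pending) >= self._max_docs
        too_old = self._clock() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Commit whatever is pending (the 'on demand' case); no-op if empty."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._oldest = None
            self._commit_fn(batch)
```

Lower thresholds make new documents searchable sooner at the cost of more frequent commits; higher thresholds do the opposite. Note that in this sketch the age limit is only checked when a document is added, so a quiet source would still need a periodic call to `flush()`.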
Marketing and Communications Manager for Lookeen.com at Axonic Informationssysteme GmbH
Vendor
2016-02-10T12:04:43Z
At Lookeen the most common questions or concerns are:
- How fast does it index, and is that customizable?
- How often can it index, and is that customizable?
- What types of files/data can it index?
For enterprise customers:
- Is support offered?
- How easy is it to roll out (with references or case studies)?
- Can the index be shared across the enterprise (scalability, as Aditya already said)?
- Are the search results accurate, and can you drill down into them?
- Can you also search email?
- Will the user interface be easy enough for new users?
- How light is the indexing on resources?
- Can it be used on virtualized machines?
I'm sure there are a lot more factors that I'm forgetting, but that has been my experience.
Indexing and search tools require several key features in order to be considered a quality option for purchase. The PeerSpot community discussed some of the aspects that they considered most important: the speed and frequency of indexing rated high on the list, as did customization of the tool's features (specifically, tuning of commit overhead). Additionally, the breadth and depth of indexing that a tool is capable of says a lot about it. Other features mentioned include...
I work for a stock company. For us, certainly automation and ROI. :)