by garysieling on 12/3/19, 6:26 PM with 55 comments
by crawdog on 12/3/19, 7:42 PM
The killer feature I haven't seen with many of these solutions is easy, out-of-the-box integration with internal systems (Atlassian Confluence, JIRA, Remedy, SharePoint, FileSystem, Intranet). When you have a SaaS search engine it's difficult to export that data... Even worse to secure it. Ironically, Plumtree Software (bought by BEA -> Oracle) had all of this in their product in 2001. What's old is new again... Those features are prime for a comeback.
I think this is a space where Elastic can do well with an on-prem or managed cloud offering that is "behind the firewall", integrated with the customer's environment. Add in term vector search support, ML for document/query understanding, and integration with the customer's security model (Active Directory) and it would be compelling.
by whitezebra on 12/3/19, 9:07 PM
We'd love to talk to you if you're interested in using Kendra. We're also wondering if there's more value on the Question Answering side of things, or the document retrieval side of things? Would love your thoughts!
by MediumD on 12/3/19, 9:27 PM
Building a similar enterprise search product at http://landria.io/ that has a lot of additional features & enhancements over a unified keyword index + ML.
We also have a terraform config if you would like to boot it up within your own private cloud!
Any feedback would be greatly appreciated.
by cj on 12/3/19, 7:09 PM
> Kendra’s preview will not include incremental learning, query auto-completion, custom synonyms, or analytics. The preview will only offer connectors for SharePoint online, JDBC, and Amazon S3. It will be limited to a maximum of 40k queries per day, 100k documents indexed, and one index per account.
by msoad on 12/3/19, 9:32 PM
Hopefully Amazon moves faster and offers more out-of-the-box data sources. They are missing G Suite content that a lot of orgs are relying on these days. It would be interesting to see what their strategy is there.
by tchalla on 12/3/19, 7:26 PM
kendra (IndE)
noun C
a centre for some activity (research, study, business, art, etc.)
by citilife on 12/3/19, 7:16 PM
The main issue is giving access to documents, which most enterprise customers do not want to do... Further, most info is in employees' heads, not in documentation.
by aerovistae on 12/3/19, 7:09 PM
by CodeSheikh on 12/3/19, 7:34 PM
by davchana on 12/4/19, 2:06 AM
by joeAtBiome on 12/4/19, 4:25 AM
If you are interested in a search solution like Biome, please feel free to reach out so we can talk more and learn the best way we can empower your team to be more productive.
by collsni on 12/4/19, 5:00 AM
by stepstep1 on 12/4/19, 6:18 PM
by hooloovoo_zoo on 12/3/19, 7:20 PM
by stepstep1 on 12/4/19, 6:19 PM
by lovelearning on 12/4/19, 6:52 AM
What's good:
==========
- Focused search for question and answer databases (such as customer FAQs)
- ML-based semantic search without requiring any explicit configuration
- Connectors for S3, AWS-hosted MySQL/PG, SharePoint. Searching data already in the AWS ecosystem (S3, Aurora) is now easier, and likely faster and cheaper too in some respects, such as saving incoming/outgoing bandwidth
- Document-level access control at all pricing plans
- Managed search (similar to Algolia)
What's similar to existing search systems (Solr / ES / Algolia):
==========
- Indexing: All data has to be processed into "field:value" structure prior to indexing
- Indexing file formats: Plain text, HTML, PDF, MS DOCX, MS PPT
- Searching: Usual boolean filters and faceting but only at field level.
- Searching: Field and value boosts for relevance, but only at index-time
- Results: Highlighting support
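To make the "field:value" and field-level filter/facet points concrete, here is a minimal sketch in plain Python. It is not Kendra's API (or Solr's, ES's, or Algolia's); the documents, field names, and the helpers `filter_docs` and `facet_counts` are all hypothetical and only illustrate the general idea.

```python
from collections import Counter

# Hypothetical documents, already flattened into "field: value" form.
DOCS = [
    {"id": 1, "type": "faq", "team": "billing", "body": "How do I update my card?"},
    {"id": 2, "type": "faq", "team": "support", "body": "How do I reset my password?"},
    {"id": 3, "type": "wiki", "team": "billing", "body": "Refund policy overview"},
]


def filter_docs(docs, must=None, must_not=None):
    """Boolean filter on exact field values.

    Every (field, value) pair in `must` has to match, and no pair in
    `must_not` may match. Only whole-field comparisons are possible here,
    which is the "only at field level" limitation in miniature.
    """
    must = must or {}
    must_not = must_not or {}
    return [
        d for d in docs
        if all(d.get(f) == v for f, v in must.items())
        and not any(d.get(f) == v for f, v in must_not.items())
    ]


def facet_counts(docs, field):
    """Count how many documents carry each distinct value of `field`."""
    return Counter(d[field] for d in docs if field in d)


hits = filter_docs(DOCS, must={"team": "billing"}, must_not={"type": "wiki"})
print([d["id"] for d in hits])
print(facet_counts(DOCS, "type"))
```

A real engine would do this against an inverted index rather than a list scan, but the shape of the query (field equals value, AND / NOT, then count per value) is the same.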
What's missing:
===========
- No multi-lingual support. Only English. Given that it's AWS, I'm very surprised by this actually (or I've missed something in their docs)
- Can't configure text analysis for English. I feel this'll return relevant results for formal-style content, but probably not for informal-style content like emails.
- No connectors for common internal systems: Outlook, JIRA, Confluence
- No built-in support for CSV, XLS, JSON (that one's odd!). They'll all require preprocessing which means additional infra costs.
- Doesn't seem to support range- / query- facets. I feel lack of range facets is a big problem, especially for numerical data.
- No query-time relevance tuning
- No field-level access control
- Scores are not returned in results
- Common post-searching functionality is missing: rescoring, grouping, clustering
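As a rough illustration of two of the gaps above (CSV needing preprocessing, and range facets), here is a small standard-library Python sketch. It assumes nothing about Kendra's actual ingestion format; `csv_to_docs`, `range_facet`, the sample data, and the bucket edges are all made up for the example.

```python
import csv
import io
from bisect import bisect_right


def csv_to_docs(text):
    """Turn each CSV row into a flat field:value dict, one document per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def range_facet(docs, field, edges):
    """Count documents per numeric bucket.

    `edges` must be sorted. Edges [10, 100] produce three buckets:
    value < 10, 10 <= value < 100, and value >= 100.
    """
    counts = [0] * (len(edges) + 1)
    for d in docs:
        counts[bisect_right(edges, float(d[field]))] += 1
    return counts


# Hypothetical sample data standing in for a real CSV export.
SAMPLE = "sku,price\na1,5\nb2,25\nc3,250\nd4,99.5\n"

docs = csv_to_docs(SAMPLE)
print(docs[0])
print(range_facet(docs, "price", [10, 100]))
```

Without server-side range facets, bucketing like this has to happen either at index time (store a precomputed "price_bucket" field) or in the client over fetched results, and the second option only sees whatever page of results came back.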
What's unknown:
============
- I don't see any information about phrase or proximity searches. Of course, they are usually relevance hacks in keyword-based systems, but sometimes users really need exact phrase matches. Does their ML backend handle this somehow?
- All search systems fall short while handling proper nouns - names, places, things, scientific names. It's possible to alleviate it to some extent using part-of-speech aware indexing. Not sure if Kendra does it in its ML backend.
by xfalcox on 12/3/19, 7:22 PM
by mlboss on 12/3/19, 7:35 PM
by vkaku on 12/4/19, 6:28 AM
by genS3 on 12/3/19, 7:43 PM