AWS Machine Learning Blog
Improve search accuracy with Spell Checker in Amazon Kendra
Amazon Kendra is an intelligent search service powered by machine learning. With the Amazon Kendra Spell Checker, you can receive spelling suggestions for misspelled terms in your queries. Spell Checker helps reduce the frequency of queries returning irrelevant results by providing spelling suggestions for unrecognized terms.
In this post, we explore how to use Amazon Kendra Spell Checker on the AWS Management Console, as well as how to enable Spell Checker in an Amazon Kendra-powered search application through the AWS Command Line Interface (AWS CLI) and AWS SDK.
Use Amazon Kendra Spell Checker on the console
You can automatically receive spelling suggestions for your misspelled Amazon Kendra queries when querying through the console.
On the Amazon Kendra console, choose your desired index, then choose Search indexed content in the navigation pane. Make sure that the selected index has ingested documents; in this post, we use the sample AWS documentation found in the Data sources section of the navigation pane.
On the Amazon Kendra search console, simply submit a query as you usually would. Misspelled terms in the query are substituted with suggested terms in the “Did you mean” section of the search console.
Choosing the suggested query submits a new query with the corrected spelling.
As you can see, the query results provided through the suggested query are significantly more relevant, thanks to Spell Checker!
Use Amazon Kendra Spell Checker in search applications
Search applications powered by Amazon Kendra can quickly and easily enable Spell Checker through the AWS CLI or AWS SDK, which we walk through in this section. Additionally, we go over an example of how to process the Spell Checker response.
AWS CLI
Let’s look at how AWS CLI users can opt in to Amazon Kendra Spell Checker to receive spelling suggestions for misspelled query terms. We use the AWS CLI to query Amazon Kendra as usual, with only one small change: we include the --spell-correction-configuration IncludeQuerySpellCheckSuggestions=true argument:
In addition to the normal query results, the response from Amazon Kendra now contains a SpellCorrectedQueries object if there are any spelling suggestions for the query. For more information, see SpellCorrectedQuery.
AWS SDK
Next, let’s walk through how Amazon Kendra provides spell check functionality for AWS SDK users. For this example, we use Python 3. We submit a query with a few spelling errors, and print out the SpellCorrectedQueries object in the response:
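As a minimal sketch of such a call, assuming the boto3 Kendra client: the helper name get_spell_suggestions, the placeholder index ID, and the misspelled query text are illustrative and not taken from the original post. The helper accepts any client object with a query method, so it can be tried without AWS access.

```python
from typing import Any, Dict, List


def get_spell_suggestions(
    kendra_client: Any, index_id: str, query_text: str
) -> List[Dict[str, Any]]:
    """Query Amazon Kendra with Spell Checker enabled and return any suggestions."""
    response = kendra_client.query(
        IndexId=index_id,
        QueryText=query_text,
        # Opt in to spelling suggestions for this query.
        SpellCorrectionConfiguration={"IncludeQuerySpellCheckSuggestions": True},
    )
    # SpellCorrectedQueries is only present when there are suggestions,
    # so fall back to an empty list.
    return response.get("SpellCorrectedQueries", [])


def main() -> None:
    # boto3 is imported here so the helper above can be used without it installed.
    import boto3

    kendra = boto3.client("kendra")
    # Placeholder values (assumptions): replace with your own index ID and query.
    suggestions = get_spell_suggestions(
        kendra, "<your-index-id>", "how do i instal linnux"
    )
    print(suggestions)
```

Calling main() requires AWS credentials and an index that has ingested documents.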
The response from Amazon Kendra now contains the expected spelling suggestions:
Process the Amazon Kendra Spell Check response
Now that we’ve gone over how to programmatically get spelling suggestions through either the AWS CLI or AWS SDK, we can examine how we turn the response into a human-readable suggested query. For this example, we use the sample output from the previous section:
Each SpellCorrectedQuery has two keys: SuggestedQueryText and Corrections.
SuggestedQueryText maps to a string containing the updated query with the suggested spelling corrections.
Corrections maps to a list of Correction objects, each of which contains the beginning and ending offsets of the correction, as well as the original term from the query and the spelling suggestion for that term.
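The shape described above can be sketched with Python type hints. The field names follow the Amazon Kendra API (BeginOffset, EndOffset, Term, CorrectedTerm); the helper term_pairs and the sample values are hypothetical, made up for illustration rather than taken from a real response.

```python
from typing import List, Tuple, TypedDict


class Correction(TypedDict):
    BeginOffset: int    # where the corrected term starts in SuggestedQueryText
    EndOffset: int      # where the corrected term ends in SuggestedQueryText
    Term: str           # the original term from the query
    CorrectedTerm: str  # the spelling suggestion for that term


class SpellCorrectedQuery(TypedDict):
    SuggestedQueryText: str
    Corrections: List[Correction]


def term_pairs(query: SpellCorrectedQuery) -> List[Tuple[str, str]]:
    """Return (original term, suggested term) pairs for one suggested query."""
    return [(c["Term"], c["CorrectedTerm"]) for c in query["Corrections"]]


# Hypothetical sample data, not actual Amazon Kendra output.
sample: SpellCorrectedQuery = {
    "SuggestedQueryText": "how do i install linux",
    "Corrections": [
        {"BeginOffset": 9, "EndOffset": 16, "Term": "instal", "CorrectedTerm": "install"},
        {"BeginOffset": 17, "EndOffset": 22, "Term": "linnux", "CorrectedTerm": "linux"},
    ],
}

print(term_pairs(sample))
```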
For our example, we want to show the suggested query text with the newly suggested terms italicized, similar to what is done on the Amazon Kendra console. To achieve this, we can add HTML italics opening tags <i> at the BeginOffset of each Correction and HTML italics closing tags </i> at the EndOffset of each Correction in the Corrections list. Note that BeginOffset and EndOffset are based on the length of the corrected terms, not the original terms.
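One way to sketch this tagging step in Python is shown below. The function name italicize_corrections and the sample data are hypothetical, and the sketch assumes that EndOffset points just past the last character of the corrected term; verify that assumption against a real response before relying on it.

```python
from typing import Any, Dict


def italicize_corrections(spell_corrected_query: Dict[str, Any]) -> str:
    """Wrap each corrected term in SuggestedQueryText with <i>...</i> tags."""
    text = spell_corrected_query["SuggestedQueryText"]
    # Work from the last correction to the first so that inserting tags
    # does not shift the offsets of corrections still to be processed.
    corrections = sorted(
        spell_corrected_query["Corrections"],
        key=lambda c: c["BeginOffset"],
        reverse=True,
    )
    for c in corrections:
        begin, end = c["BeginOffset"], c["EndOffset"]
        # Assumption: end is exclusive, so text[begin:end] is the corrected term.
        text = text[:begin] + "<i>" + text[begin:end] + "</i>" + text[end:]
    return text


# Hypothetical sample data, not actual Amazon Kendra output.
sample_query = {
    "SuggestedQueryText": "how do i install linux",
    "Corrections": [
        {"BeginOffset": 9, "EndOffset": 16, "Term": "instal", "CorrectedTerm": "install"},
        {"BeginOffset": 17, "EndOffset": 22, "Term": "linnux", "CorrectedTerm": "linux"},
    ],
}

print(italicize_corrections(sample_query))
# how do i <i>install</i> <i>linux</i>
```

Processing the corrections in reverse order is a design choice that keeps the offsets from the response usable as-is; going front to back would require adding the length of every tag already inserted.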
Adding the italics tags to SuggestedQueryText gives us the following suggested query text:
As you can see, Amazon Kendra Spell Checker makes it simple to add spell check functionality to your search application.
Conclusion
Spell Checker is a new feature offered by Amazon Kendra. It is a simple, effective way to reduce the number of queries that return unhelpful results, by providing end-users with spelling suggestions for misspelled terms.
Spell Checker is available in all AWS Regions where Amazon Kendra is available, and supports all languages currently supported by Amazon Kendra.
To learn more about Amazon Kendra, visit the Amazon Kendra product page.
About the Author
Matthew Peretick is a Software Development Engineer at Amazon Web Services based in New York City. Matthew is a member of the Amazon Kendra team focused on enhancing the Amazon Kendra query experience.