AWS Machine Learning Blog

Configuring your Amazon Kendra Confluence Server connector

Many builders and teams on AWS use Confluence to collaborate and share information within their teams and across their organizations. These workspaces are rich with data and can serve as a source of truth for answering organizational questions.

Unfortunately, it isn’t always easy to tap into these data sources to extract the information you need. For example, the data source might not be connected to an enterprise search service within the organization, or the service might be outdated and lack natural language search capabilities, leading to a poor search experience.

Amazon Kendra is an intelligent search service powered by machine learning (ML). Amazon Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization.

Amazon Kendra lets you easily add data sources using a wide range of connector types, so you can use its intelligent search capabilities to search your content repositories. Amazon Kendra maintains document access rights and automatically syncs with your index to make sure you’re always searching the most up-to-date content.

In this post, we walk through the process of setting up your Amazon Kendra connector for Confluence Server.

Prerequisites

This post assumes that you have Confluence set up and an index created in Amazon Kendra. For instructions on setting up your index, see Creating an index.

Creating the Confluence connector

To set up your Confluence connector, complete the following steps:

  1. On the Amazon Kendra console, navigate to your index and choose Add data sources.
  2. From the list of available connectors, choose Confluence Server.
  3. Choose Add connector.

Next, we need to specify the data source details.

  1. For Data source name, enter a name.
  2. For Description, enter an optional description.

The next step is data access and security.

  1. For Confluence URL, enter the URL to your Confluence site.

If your site is running in a private VPC, you must configure Amazon Kendra to access your VPC resources.

  1. In the Set authentication section, for Type of authentication, you can choose to create new authentication credentials or use existing ones. (For this post, we choose New.)
  2. For Secret name, enter a name.
  3. For User name, enter your Confluence account user name.
  4. For Password, enter the password for that account.

This information is stored in AWS Secrets Manager.
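If you script this step instead of using the console, the secret value is a small JSON document. The following is a minimal sketch, not the console's exact behavior: the key names `username` and `password` are an assumption about what the connector expects, and the account name, password, and secret name are hypothetical placeholders.

```python
import json


def build_confluence_secret(user_name: str, password: str) -> str:
    """Return the JSON string to store as the secret value.

    Assumption: the connector reads the keys "username" and "password".
    """
    if not user_name or not password:
        raise ValueError("user name and password are both required")
    return json.dumps({"username": user_name, "password": password})


# Hypothetical placeholder credentials, for illustration only.
secret_string = build_confluence_secret("kendra-crawler", "example-password")

# Storing it requires boto3 and AWS credentials, for example:
#   import boto3
#   boto3.client("secretsmanager").create_secret(
#       Name="my-confluence-secret",  # hypothetical secret name
#       SecretString=secret_string,
#   )
```

The ARN returned by `create_secret` is what a scripted data source configuration would reference.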

  1. In the Set IAM role section, choose the AWS Identity and Access Management (IAM) role that Amazon Kendra uses to crawl your Confluence data and update the index.

At minimum, the role should have permission to create and update indexes in Amazon Kendra and read your Confluence credentials from Secrets Manager.
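As a rough illustration of such a role, the sketch below builds a permissions policy document in Python. The specific actions listed (`secretsmanager:GetSecretValue`, `kendra:BatchPutDocument`, `kendra:BatchDeleteDocument`) are an assumption about the minimum a data source role needs; check the Amazon Kendra IAM documentation for the authoritative list. The Region, account ID, index ID, and secret ARN are hypothetical placeholders.

```python
import json


def build_kendra_role_policy(region: str, account_id: str,
                             index_id: str, secret_arn: str) -> dict:
    """Build a least-privilege style policy for a Kendra data source role.

    Assumption: these three actions cover reading the Confluence
    credentials and writing crawled documents into the index.
    """
    index_arn = f"arn:aws:kendra:{region}:{account_id}:index/{index_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the Confluence credentials.
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            },
            {
                # Add, update, and remove documents in the index.
                "Effect": "Allow",
                "Action": [
                    "kendra:BatchPutDocument",
                    "kendra:BatchDeleteDocument",
                ],
                "Resource": [index_arn],
            },
        ],
    }


policy = build_kendra_role_policy(
    region="us-east-1",
    account_id="111122223333",
    index_id="example-index-id",
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:example",
)
policy_json = json.dumps(policy, indent=2)
```

The role also needs a trust policy that lets the `kendra.amazonaws.com` service principal assume it. If the secret is encrypted with a customer managed key, the role likely needs `kms:Decrypt` on that key as well.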

In the Configure sync settings section, you set up your index sync options.

  1. For Set sync scope, choose to include or exclude specific Confluence spaces.
  2. For Set sync run schedule, choose the schedule you want for your sync jobs. Each data source can have its own update schedule.

Custom attributes let you add metadata to the documents in your index. For example, you can create a custom attribute called Department with the values HR, Sales, and Manufacturing. After you apply the attribute to your documents, you can limit a search response to documents from a single department, such as HR.
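To make the Department example concrete, here is a sketch of the filter you might pass when querying the index through the API. The nested shape follows the `AttributeFilter` structure of the Amazon Kendra Query API as we understand it; the helper name `department_filter` is hypothetical, and the sketch assumes Department is defined as a string attribute in the index.

```python
def department_filter(department: str) -> dict:
    """Build an attribute filter that keeps only one department's documents.

    Assumption: "Department" is a string-valued custom attribute
    that already exists in the index.
    """
    return {
        "EqualsTo": {
            "Key": "Department",
            "Value": {"StringValue": department},
        }
    }


hr_only = department_filter("HR")
```

A filter like `hr_only` would be supplied as the `AttributeFilter` argument of a query request.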

  1. In the field mapping section, choose how Confluence fields map to Amazon Kendra fields in the index. You can update required fields, recommended fields, and additional suggested field mappings.
  2. Review the settings summary, then choose Add data source.
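The console steps above can also be approximated with the AWS SDK. The sketch below only assembles the request parameters for the boto3 `create_data_source` call, so it runs without AWS credentials. The helper name and every ID, ARN, URL, and space key are hypothetical, and the cron-style schedule string is an assumed format to verify against the Amazon Kendra documentation.

```python
def build_confluence_data_source_request(
    index_id: str,
    name: str,
    server_url: str,
    secret_arn: str,
    role_arn: str,
    include_spaces=None,
    exclude_spaces=None,
    schedule: str = "",
    description: str = "",
) -> dict:
    """Assemble keyword arguments for kendra.create_data_source."""
    confluence = {
        "ServerUrl": server_url,
        "SecretArn": secret_arn,
        "Version": "SERVER",  # this post covers Confluence Server
    }
    # Sync scope: include or exclude spaces by their space keys.
    space_config = {}
    if include_spaces:
        space_config["IncludeSpaces"] = list(include_spaces)
    if exclude_spaces:
        space_config["ExcludeSpaces"] = list(exclude_spaces)
    if space_config:
        confluence["SpaceConfiguration"] = space_config

    request = {
        "IndexId": index_id,
        "Name": name,
        "Type": "CONFLUENCE",
        "RoleArn": role_arn,
        "Configuration": {"ConfluenceConfiguration": confluence},
    }
    if description:
        request["Description"] = description
    if schedule:
        request["Schedule"] = schedule  # omit to sync on demand only
    return request


request = build_confluence_data_source_request(
    index_id="example-index-id",
    name="confluence-server",
    server_url="https://confluence.example.com",
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:example",
    role_arn="arn:aws:iam::111122223333:role/example-kendra-role",
    include_spaces=["ENG", "HR"],
    schedule="cron(0 2 * * ? *)",  # assumed format: daily at 02:00 UTC
)

# With boto3 and credentials available:
#   import boto3
#   boto3.client("kendra").create_data_source(**request)
```

Field mappings and VPC settings are left out to keep the sketch short; they would be added inside the `ConfluenceConfiguration` block.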

Starting the Confluence connector manually

After you create your data source, you can start the sync process manually by choosing Sync now.

When the sync job is complete, the status shows as Succeeded.
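If you prefer to trigger and monitor the sync from code, the following sketch starts a sync job and polls its status until it reaches a terminal state. It takes a boto3 Amazon Kendra client as an argument; the function name, polling interval, and the set of statuses treated as terminal are assumptions for illustration.

```python
import time

# Assumption: these statuses mean the job will not progress further.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "INCOMPLETE"}


def sync_and_wait(kendra, index_id: str, data_source_id: str,
                  poll_seconds: float = 30.0, max_polls: int = 120) -> str:
    """Start a sync job and return its final status."""
    started = kendra.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )
    execution_id = started["ExecutionId"]

    for _ in range(max_polls):
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        # Find the job we started among the listed sync jobs.
        status = next(
            (job["Status"] for job in history
             if job["ExecutionId"] == execution_id),
            None,
        )
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)

    raise TimeoutError(f"sync job {execution_id} did not finish in time")


# With boto3 and credentials available:
#   import boto3
#   status = sync_and_wait(boto3.client("kendra"),
#                          "example-index-id", "example-data-source-id")
```

A return value of `SUCCEEDED` corresponds to the Succeeded status shown on the console.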

Testing the results

After the sync job is complete, you can search the index in several ways. For this post, we use the Amazon Kendra console to test the results. For more information, see Querying an index (console).

In the navigation pane, choose Search console.

Now you can search the index.
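The same search can be run programmatically. This is a minimal sketch that calls the `query` operation on a boto3 Amazon Kendra client and flattens each result into a small dictionary; the function name, the index ID, and the sample question are hypothetical.

```python
def search(kendra, index_id: str, query_text: str, max_results: int = 5) -> list:
    """Run a query and return simplified result records."""
    response = kendra.query(
        IndexId=index_id,
        QueryText=query_text,
        PageSize=max_results,
    )
    results = []
    for item in response.get("ResultItems", []):
        results.append({
            "type": item.get("Type"),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
        })
    return results


# With boto3 and credentials available:
#   import boto3
#   for hit in search(boto3.client("kendra"), "example-index-id",
#                     "How do I request a new laptop?"):
#       print(hit["title"], hit["uri"])
```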

Conclusion

In this post, we walked through the process of creating and running the Confluence Server data source connector. This connector lets you connect to a Confluence data source, specify which areas to crawl, and define how field metadata is mapped into the index.

By doing this, you can use the intelligent search capabilities of Amazon Kendra, powered by ML, on your Confluence Server content. To see a full list of data sources currently supported by Amazon Kendra, see Data sources.

About the Authors

Ben Snively is an AWS Public Sector Specialist Solutions Architect. He works with government, non-profit, and education customers on big data/analytical and AI/ML projects, helping them build solutions using AWS.

Sam Palani is an AI/ML Specialist Solutions Architect at AWS. He works with public sector customers to help them architect and implement machine learning solutions at scale. When not helping customers, he enjoys long hikes, unwinding with a good book, listening to his classical vinyl collection and hacking projects with Raspberry Pi.