The SharePoint import agent for Elimity Insights imports users, groups and sites from SharePoint and uploads the results to your Elimity Insights server. You only have to configure the SharePoint credentials and the connection with your Elimity Insights server. The import agent will then connect to SharePoint, perform the necessary requests, transform the results to a format compatible with Elimity Insights and send the transformed data to your Elimity Insights server.
Data model
The SharePoint import agent employs the following entity types:
- Users: user accounts as listed by the SharePoint API.
- Groups: groups on the level of a site as listed by the SharePoint API.
- Role: roles on the level of a file or folder as listed by the SharePoint API.
- Files: a file in the library of a Site as listed by the SharePoint API.
- Site: a SharePoint site as listed by the SharePoint API.
Notes about the users:
- The users listed by the API represent user accounts of several types:
  - User accounts in your Azure AD instance, typically of people in your organization who access your SharePoint by logging in with their Azure AD user account.
  - Groups in your Azure AD instance. SharePoint refers to Azure AD groups in two ways:
    - A SharePoint user with a name like "<groupname> Members" (or translated into your language) grants access rights to the members of the respective Azure AD group.
    - A SharePoint user with a name like "<groupname> Owners" (or translated into your language) grants access rights to the owners of the respective Azure AD group.
  - Guest Contributor users. These user accounts are created by SharePoint when a file or folder has been shared anonymously (i.e., through a link that does not require authentication) and the sharee has actually made contributions. In other words, when a shared link is created with the option "Anyone with the link can edit", a Guest Contributor user account is created when the recipient of the link does not have a Microsoft account and has made edits.
  - LinkClaim users: claims-based users that are verified by using a set of information. They are typically linked to Guest Contributor users, and the ID of a LinkClaim user is also present in the ID of the Guest Contributor user. A claim is not just a unique identifier for a resource, application, or user; it is a set of values that describes the resource, application, or user and that is also used to authorize access.
- SharePoint itself does not deactivate user accounts. Moreover, SharePoint does not delete user accounts that have been deleted from Azure AD. As a result, the list of users in SharePoint can be much longer than the list you currently see in Azure AD, because it contains every user account created over a long time.
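When reviewing imported users, it can help to spot the accounts that actually stand for Azure AD group members or owners. The following is a minimal sketch of a name-based heuristic, not something the import agent itself provides. The function name `azure_group_reference` and the English suffixes are assumptions; tenants with translated names would need their own suffix list.

```python
from typing import Optional, Tuple

# Assumption: English name suffixes. Translated tenants need other suffixes.
SUFFIXES = {" Members": "members", " Owners": "owners"}


def azure_group_reference(name: str) -> Optional[Tuple[str, str]]:
    """Return (group name, role) if the user name looks like a group reference."""
    cleaned = name.strip()
    for suffix, role in SUFFIXES.items():
        # Require a non-empty group name in front of the suffix.
        if cleaned.endswith(suffix) and len(cleaned) > len(suffix):
            return cleaned[: -len(suffix)], role
    return None  # an ordinary user, guest contributor, LinkClaim user, ...


print(azure_group_reference("Marketing Members"))
print(azure_group_reference("Jane Doe"))
```

Because the check relies on names only, a regular user whose name happens to end in "Members" or "Owners" would be misclassified, so treat the result as a hint rather than a fact.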
Notes about groups:
- The groups listed in Elimity Insights represent groups of several types:
  - Standard SharePoint groups, which represent bundles of users that are granted certain access rights.
  - Limited Access System Group groups, which are created by SharePoint to represent that specific permissions have been granted to a resource. These groups cannot be assigned directly to a user or group in SharePoint itself. Instead, when you assign edit or open permissions to a single item, SharePoint automatically assigns Limited Access to other required locations, such as the site or library in which the single item is located.
  - SharingLinks groups, which are created by SharePoint to represent that a file or folder has been shared with a link. If this sharing link is removed again in SharePoint, SharePoint also removes the respective SharingLinks group.
  - The "All Users" group, which is created by Elimity Insights itself in order to bundle all the users of a specific site.
  - The "Site Administrators" group, which is created by Elimity Insights itself in order to make it easier to query the site administrators of a site.
Import agent
Installation
The import agent is available as a Docker image for Linux and Windows. Refer to the dedicated knowledge base article for all details about working with agents and gateways. Visit https://console.cloud.google.com/artifacts/docker/elimity-general/europe-west1/docker/sharepoint-import-agent to get a list of available image tags for this specific agent.
Step-by-step deployment guide
The following sections explain the steps you need to take to deploy the SharePoint import agent for Elimity Insights.
1. Setting up the main app registration in Entra ID
The SharePoint connector for Elimity Insights authenticates as an Entra ID enterprise application. Create a new app registration in Entra ID by following these steps:
- Register a new application ('App registrations' > 'New registration')
  - Name: e.g. 'elimity-insights-sharepoint'
  - Leave other configurations untouched, simply click 'Register'
  - Note down the client and tenant identifiers
- Assign Graph API permissions to the newly created app registration
  - 'API permissions' > 'Add a permission'
  - 'Microsoft Graph' > 'Application permissions' > 'Sites.Read.All'
  - Grant admin consent for these permission assignments
- Generate a client secret for the app registration ('Certificates & secrets' > 'Client secrets' > 'New client secret') and securely note down the secret value
2. Setting up the worker app registrations in Entra ID
Detailed scanning of SharePoint sites takes quite a bit of time. Customers must provide at least one 'worker' app registration, but we recommend multiple workers for large SharePoint tenants.
To set up 'worker' app registrations, follow a procedure similar to the one described in step 1 to create one or more new app registrations, but instead of the Graph permissions assign the following SharePoint permissions:
- If you don't want to import file permissions, then grant 'SharePoint' > 'Application permissions' > 'Sites.Read.All'.
- If you want to import file permissions, then grant 'SharePoint' > 'Application permissions' > 'Sites.FullControl.All' instead.
Additionally, instead of generating a client secret, generate and upload a certificate for each app registration:
- How you generate a certificate pair is typically customer-specific; the example command below uses OpenSSL:
openssl req -days 999 -keyout key.pem -newkey rsa -nodes -out cert.pem -subj '/CN=elimity-insights' -x509
- Securely note down the private key.
- Upload the certificate to the worker app registration in Entra ID and note down the certificate thumbprint that you see in Entra ID.
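To double-check that the thumbprint you noted down belongs to the certificate you generated, you can compute it locally. The sketch below assumes that the thumbprint shown in Entra ID is the uppercase hexadecimal SHA-1 digest of the DER-encoded certificate; the value shown in Entra ID remains authoritative. The function name `certificate_thumbprint` is hypothetical.

```python
import hashlib
import ssl


def certificate_thumbprint(pem_text: str) -> str:
    """Return the uppercase hex SHA-1 digest of a PEM certificate's DER bytes."""
    # Strip the PEM armor and base64-decode the body to get the DER encoding.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha1(der_bytes).hexdigest().upper()


# Usage, assuming cert.pem was produced by the OpenSSL command above:
#   with open("cert.pem") as handle:
#       print(certificate_thumbprint(handle.read()))
```

Only the certificate file is needed for this check; the private key never has to leave the place where you stored it.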
3. Creating a SharePoint source in Elimity Insights
You can now create a built-in SharePoint source in your Elimity Insights instance. Uncheck the 'Enable automatic imports' option and generate API credentials from the 'SETTINGS' tab instead. Securely note down the new source's identifier and token.
4. Configuring the agent
To configure your import agent, mount an HJSON configuration file at `/app/config/config.hjson` with the properties listed below. You can find an example in the attachments at the bottom of this page.
- `clientId`: unique identifier of the main app registration you set up in step 1
- `clientSecret`: client secret value for the main app registration you set up in step 1
- `cronPattern`: optional CRON pattern describing when the import agent should run (refer to https://crontab.guru for example patterns); omit if you just want to run the agent once
- `includeRegularFiles`: boolean indicating whether you also want to scan regular files (not only folders)
- `insightsSourceId`: unique identifier of the SharePoint source you created in step 3
- `insightsSourceToken`: secret token for the SharePoint source you created in step 3
- `insightsUrl`: HTTP(S) API URL of your Elimity Insights instance, e.g. `https://example.elimity.com/api`
- `listItemPageSize`: maximum number of SharePoint list items to retrieve in a single page; we recommend a value of 5000
- `roleAssignmentChunkSize`: maximum number of SharePoint role assignments to retrieve in a single batch; we recommend a value of 100
- `skipFiles`: boolean indicating whether you want to skip scanning regular files and folders for each site
- `skipPersonalSites`: boolean indicating whether you want to skip scanning personal sites
- `skipRoleAssignments`: boolean indicating whether you want to skip scanning role assignments for each file
  - if you set this property to `false`, make sure you granted `Sites.FullControl.All` in step 2
  - if you set this property to `true`, granting `Sites.Read.All` in step 2 should suffice
- `targets`: array of objects describing which sites you want to import; use an empty array to target all sites in your tenant; each object has the following properties:
  - `hostname`: hostname of the SharePoint site you want to import, e.g. `example-tenant.sharepoint.com`
  - `path`: absolute path of the SharePoint site you want to import, e.g. `/sites/ExampleSite`
- `tenantId`: unique identifier of your Entra ID tenant, which you noted down in step 1
- `workers`: object mapping client identifiers for worker app registrations to credential objects with the following properties:
  - `privateKey`: private key for the worker’s app registration you set up in step 2
  - `thumbprint`: thumbprint for the worker’s app registration you set up in step 2
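To make the shape of the configuration concrete, here is a minimal Python sketch that assembles the properties listed above and prints them as JSON. Since HJSON is a superset of JSON, the output can be saved as `config.hjson`. All uppercase values are placeholders, the cron schedule is a hypothetical example, and treating every property except `cronPattern` as required is an assumption rather than documented agent behaviour.

```python
import json

# Assumption: every property except cronPattern is required.
REQUIRED_KEYS = {
    "clientId", "clientSecret", "includeRegularFiles", "insightsSourceId",
    "insightsSourceToken", "insightsUrl", "listItemPageSize",
    "roleAssignmentChunkSize", "skipFiles", "skipPersonalSites",
    "skipRoleAssignments", "targets", "tenantId", "workers",
}


def build_config() -> dict:
    """Return an example configuration; uppercase strings are placeholders."""
    return {
        "clientId": "MAIN_APP_CLIENT_ID",
        "clientSecret": "MAIN_APP_CLIENT_SECRET",
        "cronPattern": "0 2 * * *",  # hypothetical schedule: daily at 02:00
        "includeRegularFiles": False,  # scan folders only
        "insightsSourceId": "SOURCE_ID",
        "insightsSourceToken": "SOURCE_TOKEN",
        "insightsUrl": "https://example.elimity.com/api",
        "listItemPageSize": 5000,  # recommended value
        "roleAssignmentChunkSize": 100,  # recommended value
        "skipFiles": False,
        "skipPersonalSites": True,
        "skipRoleAssignments": True,  # Sites.Read.All should suffice for workers
        "targets": [
            {"hostname": "example-tenant.sharepoint.com", "path": "/sites/ExampleSite"}
        ],
        "tenantId": "TENANT_ID",
        "workers": {
            "WORKER_APP_CLIENT_ID": {
                "privateKey": "PRIVATE_KEY_FROM_STEP_2",
                "thumbprint": "CERTIFICATE_THUMBPRINT_FROM_STEP_2",
            }
        },
    }


def problems(config: dict) -> list:
    """List basic issues: missing properties and a missing worker."""
    issues = [f"missing property: {key}" for key in sorted(REQUIRED_KEYS - set(config))]
    if not config.get("workers"):
        issues.append("at least one worker app registration is required")
    return issues


config = build_config()
print(problems(config))
print(json.dumps(config, indent=2))
```

Replace the placeholders with the values you noted down in steps 1 to 3 before mounting the file at `/app/config/config.hjson`.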
5. Deploying the agent
Having configured the agent and having created a source in Elimity Insights, you can now deploy the agent to regularly import data from your SharePoint tenant and upload it to Elimity Insights. Since we distribute the agent as a Docker image, our recommendation for deployment is to use your cloud provider’s dedicated job execution platform (e.g. Google Cloud Run, Azure Container Apps, …). If that's not an option, you can also manually deploy the image on e.g. Windows Server. Refer to the dedicated knowledge base article about installing import agents for additional details.
6. Following up on the import
The import agent outputs logs to indicate its progress. For a manual Windows Server deployment, you can check these with `docker-compose logs sharepoint-import-agent`.
Guidelines for Importing SharePoint Sites
Importing SharePoint data can be complex due to the volume of sites, folders, and files involved. To streamline the process, it’s important to make clear decisions about which sites to include and how deeply to scan their contents.
The following steps help you set up efficient SharePoint imports:
1. Define Your Goals
- Determine the purpose of the import. For example:
  - Cleaning up access across all SharePoint sites.
  - Monitoring access to critical folders or files within specific sites.
- Assess whether your focus is on personal sites, non-personal sites, or both.
2. Decide on Import Scope
- All Sites: Import permissions and data for all SharePoint sites if your goal is to obtain comprehensive visibility.
- Skip Non-Personal Sites: Exclude non-personal sites (e.g., team sites or departmental repositories) to focus only on user-specific content.
- Specific Sites: Identify key SharePoint sites that align with your goals.
  - Example: Critical business processes or sensitive data repositories.
For the selected sites:
- Choose whether to include the files and folders, and whether to limit to folders only:
  - Folders only: This provides a high-level view of permissions without the overhead of scanning individual files.
  - Files and folders: Opt for this if you need detailed insights into file-level access.
3. Evaluate Site Contents
- If you want insight into the files and folders within a site, analyze the size and structure of the selected sites to estimate the number of folders and files.
- Consider the complexity of nested folders and the volume of files in each.
4. Model the Data in Elimity Insights
- Decide how to organize the imported data:
  - One Large Source: Ideal for a unified view of permissions across all SharePoint sites but may result in heavy imports.
  - Smaller Sources: Recommended if certain sites require separate handling due to unique access needs or high data volume.
- Take these factors into account:
  - Site size: Limit each source to a manageable size, such as 200,000 folders or files, to ensure import efficiency.
  - Frequency: Choose how often the data should be updated based on the criticality of the information.
  - Access control: Is every user of Elimity Insights allowed to view all data, or do you want to limit certain users to specific sites? If so, you can split up the data in different sources and then use the access profiles of Elimity Insights to grant users access to the sources they need.
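If you decide to split the data over several sources, the size guideline above can be turned into a simple planning exercise. The sketch below groups sites into sources with a first-fit-decreasing heuristic so that each source stays under a limit of 200,000 folders or files where possible. The function name `plan_sources` and the example site sizes are hypothetical, and the estimates themselves have to come from your own analysis in step 3.

```python
from typing import Dict, List

MAX_ITEMS_PER_SOURCE = 200_000  # guideline figure mentioned above


def plan_sources(site_sizes: Dict[str, int],
                 limit: int = MAX_ITEMS_PER_SOURCE) -> List[List[str]]:
    """Group sites into sources using first-fit decreasing on item counts."""
    sources: List[List[str]] = []
    totals: List[int] = []
    # Place the largest sites first; they are the hardest to fit.
    ordered = sorted(site_sizes.items(), key=lambda item: item[1], reverse=True)
    for site, size in ordered:
        for index, total in enumerate(totals):
            if total + size <= limit:
                sources[index].append(site)
                totals[index] += size
                break
        else:
            # No existing source has room (or the site alone exceeds the limit).
            sources.append([site])
            totals.append(size)
    return sources


estimates = {"/sites/A": 150_000, "/sites/B": 120_000,
             "/sites/C": 60_000, "/sites/D": 40_000}
print(plan_sources(estimates))
```

This only balances volume. Update frequency and access control may still be reasons to keep certain sites in a source of their own.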
5. Configure the Data Import Agent
- Use Elimity’s agent to:
- List all SharePoint sites to be imported.
- Specify which sites to include or exclude.
- Configure the depth of scans for each site (folders only or folders + files).
- Schedule imports during low-usage periods to minimize server impact.
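Once you have a list of site URLs to import, each URL has to be split into the `hostname` and `path` properties that the `targets` array expects. A minimal sketch, assuming plain site URLs such as `https://example-tenant.sharepoint.com/sites/ExampleSite`; the function name `to_target` is hypothetical, and URLs without a site path are rejected because this article does not describe how a root site should be expressed.

```python
from urllib.parse import urlsplit


def to_target(site_url: str) -> dict:
    """Split a SharePoint site URL into a `targets` entry."""
    parts = urlsplit(site_url)
    path = parts.path.rstrip("/")  # drop a trailing slash, if any
    if not parts.hostname or not path:
        raise ValueError(f"not a site URL with a path: {site_url!r}")
    return {"hostname": parts.hostname, "path": path}


site_urls = [
    "https://example-tenant.sharepoint.com/sites/ExampleSite",
    "https://example-tenant.sharepoint.com/sites/Finance/",
]
targets = [to_target(url) for url in site_urls]
print(targets)
```

The resulting list can be used as the value of `targets` in the agent configuration; an empty list targets all sites in the tenant.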