Data collection endpoints (DCEs) in Azure Monitor are entry points through which data is ingested for processing and analysis. They can receive data that originates from various sources, such as IoT devices, web applications, or even social media.
Other Azure services, including Azure Functions, Azure Logic Apps, and Azure Event Grid, can also be used to collect, process, and route data to various destinations, but they are separate services rather than data collection endpoints in the Azure Monitor sense.
One of the key benefits of using Azure data collection endpoints is the ability to ingest large volumes of data in near real time. This is particularly useful for IoT devices that generate a high volume of data.
To get started with Azure data collection endpoints, you'll need to create a new endpoint and configure it to collect data from your desired source.
Configuration
To collect data in Azure, you need to configure several components. The application registration is used to authenticate the API call, and it must be granted permission to the DCR.
The table in the Log Analytics workspace must exist before you can send data to it. You can use one of the supported Azure tables or create a custom table using any of the available methods.
A Data Collection Rule (DCR) is used to understand the structure of the incoming data and what to do with it. If the structure of the table and the incoming data don't match, the DCR can include a transformation to convert the source data to match the target table.
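Here's a summary of the components you need to configure:
- Application registration: authenticates the API call and must be granted permission to the DCR.
- Table in the Log Analytics workspace: must exist before you send data to it; use one of the supported Azure tables or create a custom table.
- Data collection rule (DCR): describes the structure of the incoming data and what to do with it, optionally with a transformation that converts the source data to match the target table.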
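To make the role of the DCR more concrete, here is a minimal sketch of what a DCR request body with a transformation might look like, built as a plain Python dictionary. The stream name, column names, table name, and workspace ID are hypothetical placeholders, and the structure shown is a simplified subset of the DCR schema rather than a complete definition.

```python
import json


def build_dcr_body(location, workspace_resource_id, stream_name, output_table, transform_kql):
    """Return a simplified DCR body as a dict (a sketch, not the full schema)."""
    return {
        "location": location,
        "properties": {
            # Declares the shape of the incoming data (hypothetical columns).
            "streamDeclarations": {
                stream_name: {
                    "columns": [
                        {"name": "Time", "type": "datetime"},
                        {"name": "Computer", "type": "string"},
                        {"name": "Message", "type": "string"},
                    ]
                }
            },
            # Where the data ends up.
            "destinations": {
                "logAnalytics": [
                    {"workspaceResourceId": workspace_resource_id, "name": "workspace"}
                ]
            },
            # Connects the incoming stream to the destination table and
            # applies a KQL transformation so the columns match the table.
            "dataFlows": [
                {
                    "streams": [stream_name],
                    "destinations": ["workspace"],
                    "transformKql": transform_kql,
                    "outputStream": output_table,
                }
            ],
        },
    }


dcr = build_dcr_body(
    location="eastus",
    workspace_resource_id="<log-analytics-workspace-resource-id>",  # placeholder
    stream_name="Custom-MyAppLogs",        # hypothetical stream name
    output_table="Custom-MyAppLogs_CL",    # hypothetical custom table
    transform_kql="source | extend TimeGenerated = Time | project TimeGenerated, Computer, Message",
)
print(json.dumps(dcr, indent=2))
```

The transformation here renames the incoming Time column to TimeGenerated, which is one typical reason the incoming data and the target table don't line up.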
Creating and Managing Resources
Creating a data collection endpoint in Azure Monitor is straightforward. In the Azure portal, open the Azure Monitor menu, select Data Collection Endpoints under the Settings section, and then select Create.
You'll need to provide a name for the endpoint and specify a Subscription, Resource Group, and Region, which determine where the DCE will be created.
Select Review + create to review the details of the DCE, and then select Create to create it.
Among the properties of a DataCollectionEndpointResource, the resource provisioning state is an important one to note, as it indicates the current state of the resource.
Client Libraries
If you're working with the Logs ingestion API, you can use client libraries to make things easier. These libraries require the same configuration components as making a REST API call.
The client libraries simplify the process of sending data to the Logs ingestion API. For examples of how to use each of them, see the sample code provided in the documentation.
Client libraries are available for the following languages:
- .NET
- Go
- Java
- JavaScript
- Python
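As an illustration, the following is a minimal sketch of sending records with the Python client library. It assumes the azure-monitor-ingestion and azure-identity packages are installed and that credentials are available to DefaultAzureCredential. The record fields and the make_record helper are hypothetical and would need to match the stream declared in your DCR.

```python
from datetime import datetime, timezone


def make_record(computer, message):
    """Build one log record (hypothetical fields; they must match your DCR stream)."""
    return {
        "Time": datetime.now(timezone.utc).isoformat(),
        "Computer": computer,
        "Message": message,
    }


def upload_logs(endpoint, rule_id, stream_name, logs):
    """Upload a list of records through the Logs ingestion API."""
    # Imported lazily so the module loads even where the Azure packages are absent.
    from azure.identity import DefaultAzureCredential
    from azure.monitor.ingestion import LogsIngestionClient

    client = LogsIngestionClient(endpoint=endpoint, credential=DefaultAzureCredential())
    client.upload(rule_id=rule_id, stream_name=stream_name, logs=logs)


records = [make_record("web01", "service started")]
# upload_logs("<ingestion-endpoint>", "<dcr-immutable-id>", "Custom-MyAppLogs", records)
```

The upload call is left commented out because it needs a real endpoint, a DCR immutable ID, and a stream name from your own environment.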
Create a Resource
To create a resource, start by creating a data collection endpoint. In the Azure portal, open the Azure Monitor menu, select Data Collection Endpoints under the Settings section, and then select Create.
You'll need to provide a name for the endpoint and specify a Subscription, Resource Group, and Region, which determine where the DCE will be created.
For a more programmatic approach, you can also create DCEs by using the DCE REST APIs.
Here are the basic steps to create a new endpoint:
- Select Create to create a new endpoint.
- Provide a name for the endpoint and specify a Subscription, Resource Group, and Region.
- Select Review + create to review the details of the DCE.
- Select Create to create the DCE.
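The same creation step can be done through the DCE REST APIs. The sketch below only builds the request rather than sending it, and it uses nothing beyond the Python standard library. The subscription ID, resource group, endpoint name, and token are placeholders, and the API version is an assumption that should be checked against the REST reference.

```python
import json
import urllib.request

ARM_BASE = "https://management.azure.com"
API_VERSION = "2022-06-01"  # assumed API version; verify against the REST reference


def build_create_dce_request(subscription_id, resource_group, name, location, token):
    """Build (but do not send) the PUT request that creates or updates a DCE."""
    url = (
        f"{ARM_BASE}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Insights/dataCollectionEndpoints/{name}"
        f"?api-version={API_VERSION}"
    )
    body = {
        "location": location,
        "properties": {"networkAcls": {"publicNetworkAccess": "Enabled"}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_create_dce_request(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-dce", "eastus", "<token>"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Separating request construction from sending makes the URL and body easy to inspect before anything is changed in the subscription.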
Once you've created your DCE, you can associate it with your target machines or resources by using the data collection rule association (DCRA) REST APIs.
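A hedged sketch of such an association request follows. It assumes that a DCE association is created as a data collection rule association named configurationAccessEndpoint on the target resource, with the DCE's resource ID in the body. The resource IDs and token are placeholders, and the API version is an assumption.

```python
import json
import urllib.request


def build_dce_association_request(target_resource_id, dce_resource_id, token):
    """Build (but do not send) a PUT request that associates a DCE with a resource."""
    url = (
        "https://management.azure.com/"
        + target_resource_id.strip("/")
        + "/providers/Microsoft.Insights/dataCollectionRuleAssociations/"
        + "configurationAccessEndpoint?api-version=2022-06-01"  # assumed name and version
    )
    body = {"properties": {"dataCollectionEndpointId": dce_resource_id}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


vm_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg"
    "/providers/Microsoft.Compute/virtualMachines/my-vm"
)
dce_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg"
    "/providers/Microsoft.Insights/dataCollectionEndpoints/my-dce"
)
association = build_dce_association_request(vm_id, dce_id, "<token>")
print(association.full_url)
```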
Az List
You can use the Azure CLI to list all data collection endpoints in a subscription by running az monitor data-collection endpoint list.
To list only the data collection endpoints in a particular resource group, add the --resource-group argument to the same command.
The total number of items to return in the command's output can be specified using the --max-items argument. If the total number of items available is more than the value specified, a token is provided in the command's output.
To resume pagination, provide that token value, taken from the previously truncated response, in the --next-token argument of a subsequent command.
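If you want to drive this command from a script, the following sketch assembles the argument list in Python and, optionally, runs it. It assumes the Azure CLI is installed and logged in, and it uses --max-items and --next-token as the pagination flags, which is how I understand the CLI reference. The helper names are hypothetical.

```python
import json
import subprocess


def build_list_command(resource_group=None, max_items=None, next_token=None):
    """Assemble the az CLI arguments for listing data collection endpoints."""
    command = ["az", "monitor", "data-collection", "endpoint", "list", "--output", "json"]
    if resource_group:
        command += ["--resource-group", resource_group]
    if max_items is not None:
        command += ["--max-items", str(max_items)]
    if next_token:
        command += ["--next-token", next_token]
    return command


def list_endpoints(**kwargs):
    """Run the command and parse its JSON output (needs the Azure CLI and a login)."""
    result = subprocess.run(
        build_list_command(**kwargs), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


print(" ".join(build_list_command(resource_group="my-rg", max_items=5)))
```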
Azure Agent
Azure Monitor Agent (AMA) is a key component in monitoring and managing your Azure resources.
By default, AMA uses a public endpoint to retrieve its configuration from Azure Monitor, which requires no extra setup. However, if you're using private link, you'll need to use a data collection endpoint (DCE).
This also applies if you're connected to a network that shares DNS with Azure Monitor Private Link Scope (AMPLS) resources.
You can view the agents associated with a DCE from its Resources page, where you can also add or remove agents.
A DCE is also required when you collect certain data sources:
- IIS Logs
- Windows Firewall Logs
- Text Logs
- JSON Logs
- Prometheus Metrics (Container Insights)
If you're using one of these data sources and the data is being sent to a destination configured for private link, you'll also need to add the DCE to the AMPLS.
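The rule of thumb in this section can be written down as a small check. This is only a sketch of the logic as described here: the function name and the string labels for data sources are hypothetical, and it is not an official API.

```python
# Data sources that this article lists as requiring a DCE.
DCE_REQUIRED_SOURCES = frozenset(
    {
        "IIS Logs",
        "Windows Firewall Logs",
        "Text Logs",
        "JSON Logs",
        "Prometheus Metrics (Container Insights)",
    }
)


def requires_dce(data_sources, uses_private_link=False):
    """Return True if the setup described needs a data collection endpoint."""
    if uses_private_link:
        return True
    return any(source in DCE_REQUIRED_SOURCES for source in data_sources)


print(requires_dce(["IIS Logs"]))
print(requires_dce(["Windows Event Logs"]))
print(requires_dce(["Windows Event Logs"], uses_private_link=True))
```

"Windows Event Logs" is used here only as an example of a source that is not in the list above.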
Resource
You can create a data collection endpoint by using the Azure portal or the DCE REST APIs. To create one using the portal, go to the Azure Monitor menu, select Data Collection Endpoints under the Settings section, and then select Create.
A data collection endpoint resource has several properties, including etag, id, identity, kind, location, name, and type. These properties can be found in the DataCollectionEndpointResource definition.
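As a sketch of how these properties might be handled in code, the dataclass below reads the top-level fields, plus the provisioning state, from a resource returned as JSON. The sample values are hypothetical, the identity property is omitted for brevity, and the class is an illustration rather than a type from any Azure SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCollectionEndpoint:
    id: str
    name: str
    type: str
    location: str
    kind: Optional[str] = None
    etag: Optional[str] = None
    provisioning_state: Optional[str] = None

    @classmethod
    def from_dict(cls, data):
        """Build the object from a resource dict, tolerating missing optional fields."""
        return cls(
            id=data["id"],
            name=data["name"],
            type=data["type"],
            location=data["location"],
            kind=data.get("kind"),
            etag=data.get("etag"),
            provisioning_state=(data.get("properties") or {}).get("provisioningState"),
        )

    @property
    def is_ready(self):
        """True once provisioning has finished successfully."""
        return self.provisioning_state == "Succeeded"


sample = {  # hypothetical response fragment
    "id": "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.Insights/dataCollectionEndpoints/my-dce",
    "name": "my-dce",
    "type": "Microsoft.Insights/dataCollectionEndpoints",
    "location": "eastus",
    "properties": {"provisioningState": "Succeeded"},
}
dce = DataCollectionEndpoint.from_dict(sample)
print(dce.name, dce.location, dce.is_ready)
```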
To delete a data collection endpoint, you can use the az monitor data-collection endpoint delete command. You can pass one or more resource IDs; each should be a complete resource ID containing all the information of the 'Resource Id' arguments.
You can also list data collection endpoints, either for the whole subscription or for a single resource group, using the az monitor data-collection endpoint list command.
Data Collection Endpoint Components
A data collection endpoint is made up of three key components: Logs ingestion endpoint, Metrics ingestion endpoint, and Configuration access endpoint. These components work together to ingest data into Azure Monitor and send configuration files to Azure Monitor Agent.
The Logs ingestion endpoint is responsible for ingesting logs into the data ingestion pipeline, where Azure Monitor transforms the data and sends it to the defined destination Log Analytics workspace and table.
DCE Components
Each component of a data collection endpoint, or DCE, has a distinct role.
The Logs ingestion endpoint is the first component, responsible for ingesting logs into the data ingestion pipeline. Azure Monitor then transforms the data and sends it to the defined destination Log Analytics workspace and table based on a DCR ID sent with the collected data.
You can think of the Metrics ingestion endpoint as the counterpart to the Logs ingestion endpoint. It ingests metrics into the data ingestion pipeline and sends them to the defined destination Azure Monitor workspace and table.
The Configuration access endpoint allows Azure Monitor Agent to retrieve data collection rules (DCRs) from the endpoint. This is important because it enables Azure Monitor Agent to get the rules it needs to collect data.
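To see where these three components show up in practice, the sketch below pulls their URLs out of a DCE resource represented as a dictionary. It assumes the resource exposes them as logsIngestion, metricsIngestion, and configurationAccess under properties, each with an endpoint field. The host names in the sample are made up for illustration.

```python
# Maps a readable label to the assumed property name on the DCE resource.
COMPONENT_KEYS = {
    "logs_ingestion": "logsIngestion",
    "metrics_ingestion": "metricsIngestion",
    "configuration_access": "configurationAccess",
}


def component_endpoints(resource):
    """Return the URL of each DCE component, or None where it is missing."""
    properties = resource.get("properties") or {}
    return {
        label: (properties.get(key) or {}).get("endpoint")
        for label, key in COMPONENT_KEYS.items()
    }


sample_resource = {  # hypothetical values
    "name": "my-dce",
    "properties": {
        "logsIngestion": {"endpoint": "https://my-dce-abcd.eastus-1.ingest.monitor.azure.com"},
        "configurationAccess": {
            "endpoint": "https://my-dce-abcd.eastus-1.handler.control.monitor.azure.com"
        },
    },
}
for label, url in component_endpoints(sample_resource).items():
    print(f"{label}: {url}")
```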
If you're sending data to a Log Analytics workspace configured for private link, you must use a DCE.
Logs Ingestion API
The Logs Ingestion API is a key component of data collection, allowing you to send logs directly to Azure Monitor without the need for a data collection endpoint (DCE).
You can create a data collection rule (DCR) specifically for the Logs Ingestion API, which will provide you with a logsIngestion property: the endpoint you'll use to send data to Azure Monitor through the API.
Using the Logs Ingestion API eliminates the need for a DCE, but you can still choose to use one if you prefer. However, if you're sending data to a Log Analytics workspace configured for private link, you must use a DCE.
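For readers calling the API directly rather than through a client library, here is a minimal sketch of how such a request could be assembled with the Python standard library. The endpoint, DCR immutable ID, stream name, and token are placeholders. The URL pattern and API version reflect my understanding of the Logs ingestion API and should be verified against the reference before use.

```python
import json
import urllib.request


def build_ingestion_request(endpoint, dcr_immutable_id, stream_name, records, token):
    """Build (but do not send) a POST request for the Logs ingestion API."""
    url = (
        f"{endpoint.rstrip('/')}/dataCollectionRules/{dcr_immutable_id}"
        f"/streams/{stream_name}?api-version=2023-01-01"  # assumed API version
    )
    return urllib.request.Request(
        url,
        data=json.dumps(records).encode("utf-8"),  # the body is a JSON array of records
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


ingest = build_ingestion_request(
    endpoint="https://my-endpoint.eastus-1.ingest.monitor.azure.com/",  # hypothetical
    dcr_immutable_id="dcr-00000000000000000000000000000000",           # placeholder
    stream_name="Custom-MyAppLogs",                                     # hypothetical
    records=[{"Time": "2024-01-01T00:00:00Z", "Computer": "web01", "Message": "hello"}],
    token="<token>",
)
print(ingest.get_method(), ingest.full_url)
```

The endpoint argument can be either the logsIngestion endpoint from the DCR or the logs ingestion endpoint of a DCE, depending on which setup you use.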
Definitions
The "Definitions" section of the REST API reference describes the types that make up a data collection endpoint.
ConfigurationAccess is the endpoint used by clients to access their configuration.
The createdByType indicates the type of identity that created the resource.
This is an important piece of information for tracking the origin of a resource and who is responsible for it.
DataCollectionEndpointResource defines ARM tracked top-level resources.
ARM stands for Azure Resource Manager, the deployment and management service for Azure resources.
The ErrorResponseCommonV2 is an error response that provides common error details.
When an error occurs, this response helps to identify the problem.
The Identity of a resource is its managed service identity.
This identity is used to authenticate and authorize the resource.
The KnownDataCollectionEndpointProvisioningState indicates the provisioning state of the resource.
This state shows whether the resource is being created, updated, or deleted, or whether the operation has succeeded, failed, or been canceled.
The LocationSpec is a specification for the location of the resource.
This specification is used to determine where the resource is located.
The ManagedServiceIdentityType is the type of managed service identity.
This type can be None, SystemAssigned, UserAssigned, or both SystemAssigned and UserAssigned.
The Metadata is metadata for the resource.
This metadata provides additional information about the resource.
The NetworkAcls are network access control rules for the endpoints.
These rules indicate whether public network access to the endpoints is allowed.
The PrivateLinkScopedResource is a private link scoped resource.
This resource is used to provide private access to the endpoints.
The SystemData contains metadata pertaining to creation and last modification of the resource.
This metadata provides information about when the resource was created and last modified.
The UserAssignedIdentity is a user-assigned identity.
This is a managed identity that a user creates and assigns to the resource, and it is used for authentication and authorization.
Frequently Asked Questions
What is endpoint data collection?
Endpoint data collection gathers information on endpoint activities to detect suspicious behavior and identify potential threats. This data is then analyzed to uncover malicious patterns and activities.
Sources
- https://learn.microsoft.com/en-us/azure/azure-monitor/logs/logs-ingestion-api-overview
- https://codewithme.cloud/posts/2024/05/custom-log-ingestion-azure-law/
- https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/data-collection-endpoint-overview
- https://learn.microsoft.com/en-us/cli/azure/monitor/data-collection/endpoint
- https://learn.microsoft.com/en-us/rest/api/monitor/data-collection-endpoints/create