Learn Azure Data Factory REST API Invocation and Configuration

To learn Azure Data Factory REST API invocation and configuration, start by understanding the key components involved.

The Azure Data Factory REST API is a powerful tool for automating data integration tasks.

To invoke the API, you'll need to use the Azure Data Factory Management API, which is a REST-based API that allows you to create, update, and delete data factories, pipelines, datasets, and other related entities.

The API uses standard HTTP verbs like GET, POST, PUT, and DELETE to perform operations.
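For example, you can create or update a pipeline by sending a PUT request to an endpoint of the form https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelines/{pipelineName}?api-version=2018-06-01, with the pipeline definition as the JSON body. Here is a minimal sketch of such a body; the Wait activity and its name are illustrative, not part of any required template:

```json
{
    "properties": {
        "activities": [
            {
                "name": "WaitAMoment",
                "type": "Wait",
                "typeProperties": {
                    "waitTimeInSeconds": 30
                }
            }
        ]
    }
}
```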

Linked Service Setup

To set up a linked service in Azure Data Factory for API calls, you'll need to create a new linked service of the REST type. For a Microsoft Graph scenario, set the Authentication Type to System Assigned Managed Identity and the AAD resource to https://graph.microsoft.com/.

There are two ways to create a REST linked service, either through the Azure portal UI or by providing specific properties. To create a REST linked service using the UI, browse to the Manage tab in your Azure Data Factory or Synapse workspace and select Linked Services, then select New.
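As a sketch, a REST linked service configured for Microsoft Graph along those lines might look like the following JSON; the linked service name is illustrative:

```json
{
    "name": "GraphRESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://graph.microsoft.com/",
            "enableServerCertificateValidation": true,
            "authenticationType": "ManagedServiceIdentity",
            "aadResourceId": "https://graph.microsoft.com/"
        }
    }
}
```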

The following properties are supported for the REST linked service: type, url, enableServerCertificateValidation, authenticationType, authHeaders, and connectVia. The type property must be set to RestService and url is the base URL of the REST service.

Here's a summary of the required and optional properties for the REST linked service:

| Property | Description | Required |
| --- | --- | --- |
| type | Must be set to RestService. | Yes |
| url | The base URL of the REST service. | Yes |
| enableServerCertificateValidation | Whether to validate the server-side TLS/SSL certificate when connecting to the endpoint. | No (default is true) |
| authenticationType | The authentication type used to connect to the REST service. | Yes |
| authHeaders | Additional HTTP request headers used for authentication, such as an API key. | No |
| connectVia | The integration runtime used to connect to the data store. | No |

Authentication types include Anonymous, Basic, AadServicePrincipal, OAuth2ClientCredential, and ManagedServiceIdentity.

Authentication

Authentication is a crucial aspect of Azure Data Factory's REST API. You can use various authentication methods to connect to your data sources, including Managed Identity, Basic Authentication, Service Principal, OAuth2 Client Credential, and Anonymous Authentication.

Azure Data Factory automatically creates a managed identity in Azure Active Directory when you create a data factory, making it easy to grant the factory access to the Graph API or other services. This is known as Managed Identity.

To use Managed Identity, you need to set the authenticationType property to ManagedServiceIdentity and specify the Microsoft Entra resource you are requesting for authorization, such as https://management.core.windows.net.

You can also use a user-assigned managed identity, which requires referencing the user-assigned identity through a credential object in addition to setting the aadResourceId.
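A minimal sketch, assuming a user-assigned managed identity has already been registered in the factory as a credential named myUserAssignedIdentity (both the name and the endpoint URL are placeholders):

```json
{
    "name": "RESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/",
            "authenticationType": "ManagedServiceIdentity",
            "aadResourceId": "<AAD resource URL>",
            "credential": {
                "referenceName": "myUserAssignedIdentity",
                "type": "CredentialReference"
            }
        }
    }
}
```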

Basic Authentication involves setting the authenticationType property to Basic and specifying the user name and password to access the REST endpoint. You can store the password securely in Data Factory as a SecureString.

Service Principal authentication requires setting the authenticationType property to AadServicePrincipal and specifying the Microsoft Entra application's client ID and tenant information. You can also use a service principal key or certificate for authentication.
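A sketch of a service principal linked service; every angle-bracketed value is a placeholder you supply:

```json
{
    "name": "RESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/",
            "authenticationType": "AadServicePrincipal",
            "servicePrincipalId": "<service principal id>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<service principal key>"
            },
            "tenant": "<tenant, e.g. contoso.onmicrosoft.com>",
            "aadResourceId": "<AAD resource URL>"
        }
    }
}
```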

OAuth2 Client Credential authentication involves setting the authenticationType property to OAuth2ClientCredential and specifying the token endpoint, client ID, client secret, and scope of the access required.
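Similarly, a sketch for OAuth 2.0 client credential authentication, again with placeholders for your own values:

```json
{
    "name": "RESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/",
            "authenticationType": "OAuth2ClientCredential",
            "tokenEndpoint": "<token endpoint>",
            "clientId": "<client id>",
            "clientSecret": {
                "type": "SecureString",
                "value": "<client secret>"
            },
            "scope": "<scope of access required>"
        }
    }
}
```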

Here's a summary of the authentication types and their required properties:

| Authentication type | Required properties |
| --- | --- |
| Anonymous | None (optionally authHeaders, for example for API keys) |
| Basic | userName, password |
| AadServicePrincipal | servicePrincipalId, tenant, servicePrincipalKey or a certificate, aadResourceId |
| OAuth2ClientCredential | tokenEndpoint, clientId, clientSecret, scope |
| ManagedServiceIdentity | aadResourceId (plus a credential reference for a user-assigned identity) |

By using the correct authentication method and specifying the required properties, you can securely connect to your data sources and perform API calls using Azure Data Factory's REST API.

API Invocation

API Invocation is a key step in working with Azure Data Factory's REST API. To invoke an API through the REST connector, you first point it at the service by setting the url property, a required property that holds the base URL of the REST endpoint.

Additional HTTP request headers can be specified through the authHeaders property. This is useful for schemes such as API key authentication, where you select Anonymous as the authentication type and pass the API key in a header, as sketched below.
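A minimal sketch combining Anonymous authentication with a static auth header; the header name x-api-key is a placeholder for whatever your service expects:

```json
{
    "name": "RESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/",
            "authenticationType": "Anonymous",
            "authHeaders": {
                "x-api-key": {
                    "type": "SecureString",
                    "value": "<API key>"
                }
            }
        }
    }
}
```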

Invoking REST API

You can use the REST connector as a source or a sink in an Azure Data Factory copy activity to invoke REST APIs.

The REST connector as a source requires the type property to be set to RestSource. This is a must-have, as it tells the connector that you're using a REST API as the source. The requestMethod property can be set to either GET or POST, but GET is the default.

Additional headers can be added using the additionalHeaders property, but the REST connector ignores any "Accept" header specified in these headers. Instead, it will auto-generate a header of Accept: application/json, as it only supports responses in JSON.

When pagination is used, a response body that is a plain array of objects is not supported. You can use the paginationRules property to compose next page requests.

Here's a table summarizing the properties for the REST connector as a source:

| Property | Description | Required |
| --- | --- | --- |
| type | Must be set to RestSource. | Yes |
| requestMethod | The HTTP method: GET (default) or POST. | No |
| additionalHeaders | Additional HTTP request headers; any "Accept" header is ignored. | No |
| requestBody | The body to send when requestMethod is POST. | No |
| paginationRules | Pagination rules used to compose next-page requests. | No |
| httpRequestTimeout | Timeout for the HTTP request to get a response; default 00:01:40. | No |
| requestInterval | The time to wait before sending the request for the next page. | No |
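As a sketch, the source section of a copy activity using these properties might look like this; the custom header and the pagination path are placeholders rather than values from any particular API:

```json
{
    "source": {
        "type": "RestSource",
        "requestMethod": "GET",
        "additionalHeaders": {
            "x-user-defined": "helloworld"
        },
        "paginationRules": {
            "AbsoluteUrl": "$.paging.next"
        }
    }
}
```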

For the REST connector as a sink, the type property must be set to RestSink. This is a must-have, as it tells the connector that you're using a REST API as the sink. The requestMethod property can be set to POST, PUT, or PATCH, with POST being the default.

The httpCompressionType property can be set to either none or gzip, but none is the default. The writeBatchSize property determines the number of records to write to the REST sink per batch, with a default value of 10000.
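A sketch of the sink section with those defaults spelled out explicitly:

```json
{
    "sink": {
        "type": "RestSink",
        "requestMethod": "POST",
        "writeBatchSize": 10000,
        "httpCompressionType": "none"
    }
}
```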

When using the REST connector as a sink, the data will be sent in JSON with a specific pattern. You can use the copy activity schema mapping to reshape the source data to conform to the expected payload by the REST API.

Copy Activity Properties

In API invocation, understanding the properties of a copy activity is essential for successful data transfer. The copy activity source section supports the RestSource properties described above, which control how data is retrieved from the REST endpoint, and the sink section supports the RestSink properties, which control how data is written.

By setting these properties correctly, you tie the linked service and dataset definitions together into a working copy pipeline.

Configuration

To configure the Azure Data Factory REST connector, you define the connector configuration details across a few entities: the linked service, the dataset, and the copy activity or data flow that uses them.

The subsections below walk through the most common settings, starting with basic authentication, followed by dataset properties, mapping data flow properties, and the source transformation.

Use Basic

To use basic authentication, you need to set the authenticationType property to Basic. This is a straightforward process that requires a few key properties to be specified.

The userName property must be set to the user name you want to use to access the REST endpoint. This is a required property, so don't forget to include it in your configuration.

The password property is also required and must be set to the password for the user. To store the password securely, mark this field as a SecureString type in Data Factory. You can also reference a secret stored in Azure Key Vault.

Here's a summary of the properties you need to specify for basic authentication:

| Property | Description | Required |
| --- | --- | --- |
| authenticationType | Must be set to Basic. | Yes |
| userName | The user name used to access the REST endpoint. | Yes |
| password | The password for the user; mark it as a SecureString or reference a secret stored in Azure Key Vault. | Yes |

By following these simple steps, you can configure basic authentication in your Data Factory and start accessing your REST endpoint securely.
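Putting it together, a minimal sketch of a basic-authentication linked service (the URL and credentials are placeholders):

```json
{
    "name": "RESTLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/",
            "authenticationType": "Basic",
            "userName": "<user name>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            }
        }
    }
}
```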

Dataset Properties

To configure your dataset, you'll need to set the type property to RestResource. This is a requirement for REST datasets.

If you want to copy data from a specific resource, specify a relative URL in the dataset. The relativeUrl property is optional: when it is set, the service combines it with the base URL from the linked service definition to build the request URL, and when it is omitted, only the URL specified in the linked service definition is used.

Here is a summary of the properties supported for REST datasets:

| Property | Description | Required |
| --- | --- | --- |
| type | Must be set to RestResource. | Yes |
| relativeUrl | A relative URL to the resource that contains the data. | No |
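A minimal sketch of such a dataset definition; the names and relative URL are placeholders:

```json
{
    "name": "RESTDataset",
    "properties": {
        "type": "RestResource",
        "linkedServiceName": {
            "referenceName": "<REST linked service name>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "relativeUrl": "<relative url>"
        }
    }
}
```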

Mapping Properties

Mapping properties correctly is a key part of configuration. The copy activity source and sink sections support the RestSource and RestSink properties described earlier, while mapping data flows expose their own source and sink settings.

In data flows, REST is supported for both integration datasets and inline datasets, which means you can use it as a source or a sink when defining data flow properties.

To learn how to copy data from a REST endpoint to a tabular sink, refer to the section on schema mapping.

Source Transformation

In the source transformation, you specify the HTTP method, which must be either GET or POST; this is a required property, and it determines how data is retrieved from the source.

You can also specify an optional relative URL to the resource that contains the data. The service combines it with the URL specified in the linked service definition to build the full request URL.

If you need to add additional HTTP request headers, for example to authenticate or authorize the request, you can do so in the source transformation as well.

You can also specify a timeout for the HTTP request to get a response. The default value is 00:01:40, and it covers only the time to get a response, not the time to read the response data.

To avoid overwhelming the source, you can specify an interval between consecutive requests, in milliseconds. The interval value must be between 10 and 60000, and it controls the rate at which requests are sent to the source.

Here is a summary of the properties you can use in source transformation:

| Property | Description | Required |
| --- | --- | --- |
| httpMethod | The HTTP method: GET or POST. | Yes |
| relativeUrl | A relative URL to the resource, combined with the base URL from the linked service. | No |
| additionalHeaders | Additional HTTP request headers, for example for authentication or authorization. | No |
| timeout | How long to wait for a response before the request times out; default 00:01:40. | No |
| requestInterval | The wait time between requests, in milliseconds (10 to 60000). | No |

Pagination and Response Handling

Pagination and Response Handling is a crucial aspect of working with Azure Data Factory's REST API. The generic REST connector supports various pagination patterns, including using the next request's absolute or relative URL, query parameter, or header based on values in the current response body or headers.

To configure pagination, you define a dictionary of pagination rules in the copy activity source, using the paginationRules property. This dictionary contains case-sensitive key-value pairs that the connector uses to generate requests from the second page onward.

The connector will stop iterating when it receives an HTTP status code 204 (No Content) or when a JSONPath expression in the pagination rules returns null. Supported keys in pagination rules include:

- AbsoluteUrl: the absolute URL of the next request.
- QueryParameters.{query_parameter_name}: a query parameter to set in the next request's URL.
- Headers.{header_name}: a header to set in the next request.
- EndCondition:{expression}: a condition that ends the pagination loop.
- MaxRequestNumber: the maximum number of page requests to issue.
- SupportRFC5988: set to true when the service advertises the next page through an RFC 5988 Link header.

Supported values in pagination rules include:

- Headers.{header_name}: the value of a header in the current response.
- A JSONPath expression (starting with "$") that selects a value from the current response body.
- Range:{start}[:{end}:{step}]: a numeric range used to generate page indexes or offsets.

In mapping data flows, pagination rules are defined differently than in copy activity. Range is not supported, and instead of using "[]", you should use "{}" to escape special characters. The end condition is supported, but the condition syntax is different, using "body" to indicate the response body and "header" to indicate the response header.

When handling the response from a list endpoint such as List By Factory, you can iterate over the results and process them as needed for your workflow. To avoid endless requests when no range rule is defined, set an end condition, such as stopping once the "paging.next" URL in the response is empty, as sketched below.
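For example, a sketch of pagination rules that follow a next-page URL, assuming the response exposes it at $.paging.next (both the JSONPath and the Empty end condition are illustrative and must match your API's actual response shape):

```json
{
    "paginationRules": {
        "AbsoluteUrl": "$.paging.next",
        "EndCondition:$.paging.next": "Empty"
    }
}
```

Once pagination is configured this way, the connector walks the pages for you, and downstream activities see a single, continuous result set.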
