Submodules

airbyte_cdk.sources.declarative.retrievers.retriever module

class airbyte_cdk.sources.declarative.retrievers.retriever.Retriever

Bases: object

Responsible for fetching a stream’s records from an HTTP API source.

abstract read_records(records_schema: Mapping[str, Any], stream_slice: Optional[Mapping[str, Any]] = None) Iterable[Union[Mapping[str, Any], airbyte_protocol.models.airbyte_protocol.AirbyteMessage]]

Fetch a stream’s records from an HTTP API source

Parameters
  • records_schema – json schema to describe record

  • stream_slice – The stream slice to read data for

Returns

The records read from the API source

abstract property state: Mapping[str, Any]

State getter, should return state in form that can serialized to a string and send to the output as a STATE AirbyteMessage.

A good example of a state is a cursor_value:
{

self.cursor_field: “cursor_value”

}

State should try to be as small as possible but at the same time descriptive enough to restore syncing process from the point where it stopped.

abstract stream_slices() Iterable[Optional[Mapping[str, Any]]]

Returns the stream slices

airbyte_cdk.sources.declarative.retrievers.simple_retriever module

class airbyte_cdk.sources.declarative.retrievers.simple_retriever.SimpleRetriever(requester: airbyte_cdk.sources.declarative.requesters.requester.Requester, record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector, config: typing.Mapping[str, typing.Any], parameters: dataclasses.InitVar[typing.Mapping[str, typing.Any]], name: str = <property object>, primary_key: typing.Optional[typing.Union[str, typing.List[str], typing.List[typing.List[str]]]] = <property object>, paginator: typing.Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None, stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter(), cursor: typing.Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None)

Bases: airbyte_cdk.sources.declarative.retrievers.retriever.Retriever

Retrieves records by synchronously sending requests to fetch records.

The retriever acts as an orchestrator between the requester, the record selector, the paginator, and the stream slicer.

For each stream slice, submit requests until there are no more pages of records to fetch.

This retriever currently inherits from HttpStream to reuse the request submission and pagination machinery. As a result, some of the parameters passed to some methods are unused. The two will be decoupled in a future release.

stream_name

The stream’s name

Type

str

stream_primary_key

The stream’s primary key

Type

Optional[Union[str, List[str], List[List[str]]]]

requester

The HTTP requester

Type

Requester

record_selector

The record selector

Type

HttpSelector

paginator

The paginator

Type

Optional[Paginator]

stream_slicer

The stream slicer

Type

Optional[StreamSlicer]

cursor

The cursor

Type

Optional[cursor]

parameters

Additional runtime parameters to be used for string interpolation

Type

Mapping[str, Any]

config: Mapping[str, Any]
cursor: Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None
must_deduplicate_query_params() bool
property name: str

Stream name

Type

return

paginator: Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None
parameters: dataclasses.InitVar[Mapping[str, Any]]
property primary_key: Optional[Union[str, List[str], List[List[str]]]]

The stream’s primary key

read_records(records_schema: Mapping[str, Any], stream_slice: Optional[Mapping[str, Any]] = None) Iterable[Union[Mapping[str, Any], airbyte_protocol.models.airbyte_protocol.AirbyteMessage]]

Fetch a stream’s records from an HTTP API source

Parameters
  • records_schema – json schema to describe record

  • stream_slice – The stream slice to read data for

Returns

The records read from the API source

record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector
requester: airbyte_cdk.sources.declarative.requesters.requester.Requester
property state: Mapping[str, Any]

State getter, should return state in form that can serialized to a string and send to the output as a STATE AirbyteMessage.

A good example of a state is a cursor_value:
{

self.cursor_field: “cursor_value”

}

State should try to be as small as possible but at the same time descriptive enough to restore syncing process from the point where it stopped.

stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter()
stream_slices() Iterable[Optional[Mapping[str, Any]]]

Specifies the slices for this stream. See the stream slicing section of the docs for more information.

Parameters
  • sync_mode

  • cursor_field

  • stream_state

Returns

class airbyte_cdk.sources.declarative.retrievers.simple_retriever.SimpleRetrieverTestReadDecorator(requester: airbyte_cdk.sources.declarative.requesters.requester.Requester, record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector, config: typing.Mapping[str, typing.Any], parameters: dataclasses.InitVar[typing.Mapping[str, typing.Any]], name: str = <property object>, primary_key: typing.Optional[typing.Union[str, typing.List[str], typing.List[typing.List[str]]]] = <property object>, paginator: typing.Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None, stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter(), cursor: typing.Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None, maximum_number_of_slices: int = 5)

Bases: airbyte_cdk.sources.declarative.retrievers.simple_retriever.SimpleRetriever

In some cases, we want to limit the number of requests that are made to the backend source. This class allows for limiting the number of slices that are queried throughout a read command.

maximum_number_of_slices: int = 5
stream_slices() Iterable[Optional[Mapping[str, Any]]]

Specifies the slices for this stream. See the stream slicing section of the docs for more information.

Parameters
  • sync_mode

  • cursor_field

  • stream_state

Returns

Module contents

class airbyte_cdk.sources.declarative.retrievers.Retriever

Bases: object

Responsible for fetching a stream’s records from an HTTP API source.

abstract read_records(records_schema: Mapping[str, Any], stream_slice: Optional[Mapping[str, Any]] = None) Iterable[Union[Mapping[str, Any], airbyte_protocol.models.airbyte_protocol.AirbyteMessage]]

Fetch a stream’s records from an HTTP API source

Parameters
  • records_schema – json schema to describe record

  • stream_slice – The stream slice to read data for

Returns

The records read from the API source

abstract property state: Mapping[str, Any]

State getter, should return state in form that can serialized to a string and send to the output as a STATE AirbyteMessage.

A good example of a state is a cursor_value:
{

self.cursor_field: “cursor_value”

}

State should try to be as small as possible but at the same time descriptive enough to restore syncing process from the point where it stopped.

abstract stream_slices() Iterable[Optional[Mapping[str, Any]]]

Returns the stream slices

class airbyte_cdk.sources.declarative.retrievers.SimpleRetriever(requester: airbyte_cdk.sources.declarative.requesters.requester.Requester, record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector, config: typing.Mapping[str, typing.Any], parameters: dataclasses.InitVar[typing.Mapping[str, typing.Any]], name: str = <property object>, primary_key: typing.Optional[typing.Union[str, typing.List[str], typing.List[typing.List[str]]]] = <property object>, paginator: typing.Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None, stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter(), cursor: typing.Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None)

Bases: airbyte_cdk.sources.declarative.retrievers.retriever.Retriever

Retrieves records by synchronously sending requests to fetch records.

The retriever acts as an orchestrator between the requester, the record selector, the paginator, and the stream slicer.

For each stream slice, submit requests until there are no more pages of records to fetch.

This retriever currently inherits from HttpStream to reuse the request submission and pagination machinery. As a result, some of the parameters passed to some methods are unused. The two will be decoupled in a future release.

stream_name

The stream’s name

Type

str

stream_primary_key

The stream’s primary key

Type

Optional[Union[str, List[str], List[List[str]]]]

requester

The HTTP requester

Type

Requester

record_selector

The record selector

Type

HttpSelector

paginator

The paginator

Type

Optional[Paginator]

stream_slicer

The stream slicer

Type

Optional[StreamSlicer]

cursor

The cursor

Type

Optional[cursor]

parameters

Additional runtime parameters to be used for string interpolation

Type

Mapping[str, Any]

config: Mapping[str, Any]
cursor: Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None
must_deduplicate_query_params() bool
property name: str

Stream name

Type

return

paginator: Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None
parameters: dataclasses.InitVar[Mapping[str, Any]]
property primary_key: Optional[Union[str, List[str], List[List[str]]]]

The stream’s primary key

read_records(records_schema: Mapping[str, Any], stream_slice: Optional[Mapping[str, Any]] = None) Iterable[Union[Mapping[str, Any], airbyte_protocol.models.airbyte_protocol.AirbyteMessage]]

Fetch a stream’s records from an HTTP API source

Parameters
  • records_schema – json schema to describe record

  • stream_slice – The stream slice to read data for

Returns

The records read from the API source

record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector
requester: airbyte_cdk.sources.declarative.requesters.requester.Requester
property state: Mapping[str, Any]

State getter, should return state in form that can serialized to a string and send to the output as a STATE AirbyteMessage.

A good example of a state is a cursor_value:
{

self.cursor_field: “cursor_value”

}

State should try to be as small as possible but at the same time descriptive enough to restore syncing process from the point where it stopped.

stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter()
stream_slices() Iterable[Optional[Mapping[str, Any]]]

Specifies the slices for this stream. See the stream slicing section of the docs for more information.

Parameters
  • sync_mode

  • cursor_field

  • stream_state

Returns

class airbyte_cdk.sources.declarative.retrievers.SimpleRetrieverTestReadDecorator(requester: airbyte_cdk.sources.declarative.requesters.requester.Requester, record_selector: airbyte_cdk.sources.declarative.extractors.http_selector.HttpSelector, config: typing.Mapping[str, typing.Any], parameters: dataclasses.InitVar[typing.Mapping[str, typing.Any]], name: str = <property object>, primary_key: typing.Optional[typing.Union[str, typing.List[str], typing.List[typing.List[str]]]] = <property object>, paginator: typing.Optional[airbyte_cdk.sources.declarative.requesters.paginators.paginator.Paginator] = None, stream_slicer: airbyte_cdk.sources.declarative.stream_slicers.stream_slicer.StreamSlicer = SinglePartitionRouter(), cursor: typing.Optional[airbyte_cdk.sources.declarative.incremental.cursor.Cursor] = None, maximum_number_of_slices: int = 5)

Bases: airbyte_cdk.sources.declarative.retrievers.simple_retriever.SimpleRetriever

In some cases, we want to limit the number of requests that are made to the backend source. This class allows for limiting the number of slices that are queried throughout a read command.

config: Config
maximum_number_of_slices: int = 5
parameters: InitVar[Mapping[str, Any]]
record_selector: HttpSelector
requester: Requester
stream_slices() Iterable[Optional[Mapping[str, Any]]]

Specifies the slices for this stream. See the stream slicing section of the docs for more information.

Parameters
  • sync_mode

  • cursor_field

  • stream_state

Returns