Configuration Reference

Required Settings

id

type:

str

A string uniquely identifying the app, shared across all instances such that two app instances with the same id are considered to be in the same “group”.

This parameter is required.

The id and Kafka

When using Kafka, the id is used to generate app-local topics, and names for consumer groups.

Commonly Used Settings

autodiscover

type:

Any

default:

False

Automatic discovery of agents, tasks, timers, views and commands.

Faust has an API to add different asyncio services and other user extensions, such as “Agents”, HTTP web views, command-line commands, and timers to your Faust workers. These can be defined in any module, so to discover them at startup, the worker needs to traverse packages looking for them.

Warning

The autodiscovery functionality uses the Venusian library to scan wanted packages for @app.agent, @app.page, @app.command, @app.task and @app.timer decorators, but to do so, it’s required to traverse the package path and import every module in it.

Importing arbitrary modules like this can be dangerous, so make sure you follow Python programming best practices. Do not start threads, perform network I/O, or monkey-patch for tests or mocks as a side effect of importing a module. If you encounter such a case, find a way to perform the action lazily.

Warning

If the above warning is something you cannot fix, or if it’s out of your control, then please set autodiscover=False and make sure the worker imports all modules where your decorators are defined.

The value for this argument can be:

bool

If App(autodiscover=True) is set, the autodiscovery will scan the package name described in the origin attribute.

The origin attribute is automatically set when you start a worker using the faust command line program, for example:

faust -A example.simple worker

The -A option specifies the app, but you can also create a shortcut entry point by calling app.main():

if __name__ == '__main__':
    app.main()

Then you can start the faust program by executing for example python myscript.py worker --loglevel=INFO, and it will use the correct application.

Sequence[str]

The argument can also be a list of packages to scan:

app = App(..., autodiscover=['proj_orders', 'proj_accounts'])

Callable[[], Sequence[str]]

The argument can also be a function returning a list of packages to scan:

def get_all_packages_to_scan():
    return ['proj_orders', 'proj_accounts']

app = App(..., autodiscover=get_all_packages_to_scan)

False

If everything you need is in a self-contained module, or you import the stuff you need manually, just set autodiscover to False and don’t worry about it :-)
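The accepted value types can be summarized with a small sketch. The helper resolve_packages below is hypothetical and not part of Faust; it only illustrates how a bool, a sequence, or a callable could each be reduced to a list of package names to scan, assuming that True falls back to the origin package.

```python
from typing import Callable, List, Optional, Sequence, Union

AutodiscoverArg = Union[bool, Sequence[str], Callable[[], Sequence[str]]]


def resolve_packages(autodiscover: AutodiscoverArg,
                     origin: Optional[str] = None) -> List[str]:
    """Hypothetical helper: normalize an autodiscover value to package names."""
    if autodiscover is False or autodiscover is None:
        # Nothing is scanned; the worker must import modules itself.
        return []
    if autodiscover is True:
        # Assumption: True means "scan the package named by origin".
        return [origin] if origin else []
    if isinstance(autodiscover, str):
        # Guard against a single name being split into characters.
        return [autodiscover]
    if callable(autodiscover):
        return list(autodiscover())
    return list(autodiscover)
```

Calling resolve_packages(True, origin='example.simple') yields a single-element list, while a list or a callable is passed through unchanged.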

Django

When you use Django and the DJANGO_SETTINGS_MODULE environment variable is set, the Faust app will scan all packages listed in the INSTALLED_APPS setting for agents, pages, and commands.

Faust will automatically detect that you’re using Django and do the right thing if you do:

app = App(..., autodiscover=True)

It will find agents and other decorators in all of the reusable Django applications. If you want to manually control what packages are traversed, then provide a list:

app = App(..., autodiscover=['package1', 'package2'])

or, if you want no packages to be traversed, pass False:

app = App(..., autodiscover=False)

which is the default, so you can simply omit the argument.

Tip

For manual control over autodiscovery, you can also call the app.discover() method yourself.

datadir

type:

str / Path

default:

'{conf.name}-data'

environment:

APP_DATADIR

related-command-options:

faust --datadir

Application data directory.

The directory in which this instance stores the data used by local tables, etc.

See also

  • The data directory can also be set using the faust --datadir option, from the command-line, so there is usually no reason to provide a default value when creating the app.

tabledir

type:

str / Path

default:

'tables'

environment:

APP_TABLEDIR

Application table data directory.

The directory in which this instance stores local table data. Usually you will want to configure the datadir setting, but if you want to store tables separately you can configure this one.

If the path provided is relative (it has no leading slash), then the path will be considered to be relative to the datadir setting.
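The relative-path rule can be illustrated with pathlib. This is a minimal sketch; resolve_tabledir is a hypothetical helper, not a Faust function, and it simply treats a relative tabledir as living inside datadir.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def resolve_tabledir(datadir: PathLike, tabledir: PathLike) -> Path:
    """Hypothetical helper: apply the 'relative to datadir' rule."""
    tabledir = Path(tabledir)
    if tabledir.is_absolute():
        # An absolute path is used as given.
        return tabledir
    # A relative path is interpreted relative to the data directory.
    return Path(datadir) / tabledir
```

With the defaults described above, a datadir of 'myapp-data' and a tabledir of 'tables' resolve to the tables directory inside myapp-data.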

debug

type:

bool

default:

False

environment:

APP_DEBUG

related-command-options:

faust --debug

Use in development to expose sensor information endpoint.

Tip

If you want to enable the sensor statistics endpoint in production, without enabling the debug setting, you can do so by adding the following code:

app.web.blueprints.add(
    '/stats/', 'faust.web.apps.stats:blueprint')

env_prefix

New in version 1.11.

type:

str

default:

None

environment:

APP_ENV_PREFIX

Environment variable prefix.

When configuring Faust by environment variables, this adds a common prefix to all Faust environment variable names.
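As a rough illustration of what a common prefix means, the sketch below looks up a prefixed variable name in a mapping. The read_setting helper and the 'FAUST_' prefix are assumptions made for this example; the sketch also assumes the prefix is prepended as-is, without an added separator.

```python
import os
from typing import Mapping, Optional


def read_setting(env_name: str,
                 env_prefix: str = '',
                 environ: Optional[Mapping[str, str]] = None,
                 default: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: look up env_prefix + env_name."""
    environ = os.environ if environ is None else environ
    # Assumption: the prefix is prepended directly to the documented name.
    return environ.get(env_prefix + env_name, default)


# A fake environment keeps the example independent of the real one.
fake_environ = {'FAUST_APP_DATADIR': '/var/lib/myapp'}
datadir = read_setting('APP_DATADIR', env_prefix='FAUST_', environ=fake_environ)
```

Here the unprefixed name APP_DATADIR would not be found, while the prefixed lookup returns the configured directory.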

id_format

type:

str

default:

'{id}-v{self.version}'

environment:

APP_ID_FORMAT

Application ID format template.

The format string used to generate the final id value by combining it with the version parameter.
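The template uses ordinary str.format fields, so its effect can be previewed in plain Python. This sketch only shows the substitution itself; format_app_id is a hypothetical helper, and a SimpleNamespace stands in for the settings object that the {self.version} field refers to. It does not show when Faust applies the template.

```python
from types import SimpleNamespace


def format_app_id(id_format: str, app_id: str, version: int) -> str:
    """Hypothetical helper: expand an id_format template."""
    # Stand-in for the settings object referenced as "self" in the template.
    settings = SimpleNamespace(version=version)
    return id_format.format(id=app_id, self=settings)


final_id = format_app_id('{id}-v{self.version}', 'myapp', 2)
```

With the default template, an id of 'myapp' and version 2 expand to 'myapp-v2'.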

origin

type:

str

default:

None

The reverse path used to find the app.

For example if the app is located in:

from myproj.app import app

Then the origin should be "myproj.app".

The faust worker program will try to automatically set the origin, but if you are having problems with auto generated names then you can set origin manually.

timezone

New in version 1.4.

type:

tzinfo

default:

datetime.timezone.utc

environment:

TIMEZONE

Project timezone.

The timezone used for date-related functionality such as cronjobs.

version

type:

int

default:

1

environment:

APP_VERSION

App version.

Version of the app, that when changed will create a new isolated instance of the application. The first version is 1, the second version is 2, and so on.

Source topics will not be affected by a version change.

Faust applications will use two kinds of topics: source topics, and internally managed topics. The source topics are declared by the producer, and we do not have the opportunity to modify any configuration settings, like number of partitions for a source topic; we may only consume from them. To mark a topic as internal, use: app.topic(..., internal=True).

blocking_timeout

type:

float / timedelta

default:

None

environment:

BLOCKING_TIMEOUT

related-command-options:

faust --blocking-timeout

Blocking timeout (in seconds).

When specified the worker will start a periodic signal based timer that only triggers when the loop has been blocked for a time exceeding this timeout.

This is the safest way to detect blocking, but could have adverse effects on libraries that do not automatically retry interrupted system calls.

Python itself does retry all interrupted system calls since version 3.5 (see PEP 475), but this might not be the case with C extensions added to the worker by the user.

The blocking detector is a background thread that periodically wakes up to either arm a timer, or cancel an already armed timer. In pseudocode:

import asyncio
import logging
import signal

logger = logging.getLogger(__name__)

def on_alarm(signum, frame):
    logger.warning('Blocking detected: ...')

async def blocking_detector(blocking_timeout):
    signal.signal(signal.SIGALRM, on_alarm)
    while True:
        # arming a new alarm replaces the previous one
        signal.setitimer(signal.ITIMER_REAL, blocking_timeout)
        # sleep to wake up just before the timeout
        await asyncio.sleep(blocking_timeout * 0.96)

If the sleep does not wake up in time the alarm signal will be sent to the process and a traceback will be logged.

broker

type:

str / URL / [ str ]

default:

None

environment:

BROKER_URL

Broker URL, or a list of alternative broker URLs.

Faust needs the URL of a “transport” to send and receive messages.

Currently, the only supported production transport is kafka://. This uses the aiokafka client under the hood, for consuming and producing messages.

You can specify multiple hosts at the same time by separating them with a semicolon:

kafka://kafka1.example.com:9092;kafka2.example.com:9092

Which in actual code looks like this:

BROKERS = 'kafka://kafka1.example.com:9092;kafka2.example.com:9092'
app = faust.App(
    'id',
    broker=BROKERS,
)

You can also pass a list of URLs:

app = faust.App(
    'id',
    broker=['kafka://kafka1.example.com:9092',
            'kafka://kafka2.example.com:9092'],
)
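The two forms above describe the same set of hosts. The sketch below converts the semicolon-separated form into a list of URLs; split_broker_urls is a hypothetical helper written for this example (it is not Faust's own parser) and assumes that entries without a scheme inherit the scheme of the preceding entry.

```python
from typing import List
from urllib.parse import urlsplit


def split_broker_urls(broker: str, default_scheme: str = 'kafka') -> List[str]:
    """Hypothetical helper: expand 'scheme://host1;host2' into a URL list."""
    urls: List[str] = []
    scheme = default_scheme
    for part in broker.split(';'):
        part = part.strip()
        if not part:
            continue
        if '://' in part:
            # Remember the scheme so that later bare hosts can reuse it.
            scheme = urlsplit(part).scheme
            urls.append(part)
        else:
            urls.append(f'{scheme}://{part}')
    return urls


BROKERS = 'kafka://kafka1.example.com:9092;kafka2.example.com:9092'
broker_list = split_broker_urls(BROKERS)
```

The resulting broker_list matches the list passed to faust.App in the second example.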

See also

You can configure the transport used for consuming and producing separately, by setting the broker_consumer and broker_producer settings.

This setting is used as the default.

Available Transports

  • kafka://

    Alias to aiokafka://

  • aiokafka://

    The recommended transport using the aiokafka client.

    Limitations: None

broker_credentials

New in version 1.5.

type:

CredentialsT

default:

None

environment:

BROKER_CREDENTIALS

Broker authentication mechanism.

Specify the authentication mechanism to use when connecting to the broker.

The default is to not use any authentication.

SASL Authentication

You can enable SASL authentication via plain text:

app = faust.App(
    broker_credentials=faust.SASLCredentials(
        username='x',
        password='y',
    ))

Warning

Do not use literal strings when specifying passwords in production, as they can remain visible in stack traces.

Instead the best practice is to get the password from a configuration file, or from the environment:

import os

BROKER_USERNAME = os.environ.get('BROKER_USERNAME')
BROKER_PASSWORD = os.environ.get('BROKER_PASSWORD')

app = faust.App(
    broker_credentials=faust.SASLCredentials(
        username=BROKER_USERNAME,
        password=BROKER_PASSWORD,
    ))

OAuth2 Authentication

You can enable SASL authentication via OAuth2 Bearer tokens:

import faust
from asyncio import get_running_loop
from aiokafka.helpers import create_ssl_context
from aiokafka.conn import AbstractTokenProvider

class TokenProvider(AbstractTokenProvider):
    async def token(self):
        # run the blocking lookup in the default executor
        return await get_running_loop().run_in_executor(
            None, self.get_token)

    def get_token(self):
        return 'token'

app = faust.App(
    broker_credentials=faust.OAuthCredentials(
        oauth_cb=TokenProvider(),
        ssl_context=create_ssl_context(),
    )
)

Note

The implementation should ensure token reuse so that multiple calls at connect time do not create multiple tokens. The implementation should also periodically refresh the token in order to guarantee that each call returns an unexpired token.

Token providers MUST implement the token() method.

GSSAPI Authentication

GSSAPI authentication over plain text:

app = faust.App(
    broker_credentials=faust.GSSAPICredentials(
        kerberos_service_name='faust',
        kerberos_domain_name='example.com',
    ),
)

GSSAPI authentication over SSL:

import ssl
ssl_context = ssl.create_default_context(
    purpose=ssl.Purpose.SERVER_AUTH, cafile='ca.pem')
ssl_context.load_cert_chain(
    'client.cert', keyfile='client.key')

app = faust.App(
    broker_credentials=faust.GSSAPICredentials(
        kerberos_service_name='faust',
        kerberos_domain_name='example.com',
        ssl_context=ssl_context,
    ),
)

SSL Authentication

Provide an SSL context for the Kafka broker connections.

This allows Faust to use a secure SSL/TLS connection for the Kafka connections and enables certificate-based authentication.

import ssl

ssl_context = ssl.create_default_context(
    purpose=ssl.Purpose.SERVER_AUTH, cafile='ca.pem')
ssl_context.load_cert_chain(
    'client.cert', keyfile='client.key')
app = faust.App(..., broker_credentials=ssl_context)

ssl_context

type:

SSLContext

default:

None

SSL configuration.

See broker_credentials.

logging_config

New in version 1.5.

type:

dict

default:

None

Logging dictionary configuration.

Optional dictionary for logging configuration, as supported by logging.config.dictConfig().
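A minimal dictionary in the schema accepted by logging.config.dictConfig() might look like the sketch below. The formatter and handler names are arbitrary choices for this example, and the faust.App call in the final comment is the assumed way of passing the dictionary in.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    'version': 1,  # required by the dictConfig schema
    'disable_existing_loggers': False,
    'formatters': {
        'simple': {
            'format': '%(asctime)s %(levelname)s %(name)s %(message)s',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'simple',
            'level': 'INFO',
        },
    },
    'root': {'handlers': ['console'], 'level': 'INFO'},
}

# Applying the dictionary with the standard library is a quick way to
# check that it is well-formed before handing it to the app.
logging.config.dictConfig(LOGGING_CONFIG)

# Assumed usage: app = faust.App('myapp', logging_config=LOGGING_CONFIG)
```

If the dictionary is malformed, dictConfig() raises an exception, which makes configuration mistakes visible before the worker starts.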

loghandlers

type:

[ Handler ]

default:

None

List of custom logging handlers.

Specify a list of custom log handlers to use in worker instances.

processing_guarantee

New in version 1.5.

type:

str

default:

<ProcessingGuarantee.AT_LEAST_ONCE: 'at_least_once'>

environment:

PROCESSING_GUARANTEE

The processing guarantee that should be used.

Possible values are “at_least_once” (default) and “exactly_once”.

Note that if exactly-once processing is enabled consumers are configured with isolation.level="read_committed" and producers are configured with retries=Integer.MAX_VALUE and enable.idempotence=true per default.

Note that by default exactly-once processing requires a cluster of at least three brokers, which is the recommended setting for production. For development you can change this by adjusting the broker setting transaction.state.log.replication.factor to the number of brokers you want to use.

store

type:

str / URL

default:

'memory://'

environment:

APP_STORE

Table storage backend URL.

The backend used for table storage.

Tables are stored in-memory by default, but you should not use the memory:// store in production.

In production, a persistent table store, such as rocksdb:// is preferred.

cache

New in version 1.2.

type:

str / URL

default:

'memory://'

environment:

CACHE_URL

Cache backend URL.

Optional backend used for Memcached-style caching. URL can be:

  • redis://host

  • rediscluster://host, or

  • memory://.

Advanced Agent Settings

agent_supervisor

type:

str / Type

default:

'mode.OneForOneSupervisor'

environment:

AGENT_SUPERVISOR

Default agent supervisor type.

An agent may start multiple instances (actors) when the concurrency setting is higher than one (e.g. @app.agent(concurrency=2)).

Multiple instances of the same agent are considered to be in the same supervisor group.

The default supervisor is the mode.OneForOneSupervisor: if an instance in the group crashes, we restart that instance only.

These are the supervisors supported:

Advanced Broker Settings

broker_consumer

New in version 1.7.

type:

str / URL / [ str ]

default (alias to setting):

broker

environment:

BROKER_CONSUMER_URL

Consumer broker URL.

You can use this setting to configure the transport used for producing and consuming separately.

If not set the value found in broker will be used.

broker_producer

New in version 1.7.

type:

str / URL / [ str ]

default (alias to setting):

broker

environment:

BROKER_PRODUCER_URL

Producer broker URL.

You can use this setting to configure the transport used for producing and consuming separately.

If not set the value found in broker will be used.

broker_api_version

New in version 1.10.

type:

str

default:

'auto'

environment:

BROKER_API_VERSION

Broker API version.

This setting is also the default for consumer_api_version, and producer_api_version.

Negotiate producer protocol version.

The default value, “auto”, means use the latest version supported by both client and server.

Any other version set means you are requesting a specific version of the protocol.

Example Kafka uses:

Disable sending headers for all messages produced

Kafka headers support was added in Kafka 0.11, so you can specify broker_api_version="0.10" to remove the headers from messages.

broker_check_crcs

type:

bool

default:

True

environment:

BROKER_CHECK_CRCS

Broker CRC check.

Automatically check the CRC32 of the records consumed.

broker_client_id

type:

str

default:

'faust-0.8.9'

environment:

BROKER_CLIENT_ID

Broker client ID.

There is rarely any reason to configure this setting.

The client id is used to identify the software used, and is not usually configured by the user.

broker_commit_every

type:

int

default:

10000

environment:

BROKER_COMMIT_EVERY

Broker commit message frequency.

Commit offset every n messages.

See also broker_commit_interval, which is how frequently we commit on a timer when there are few messages being received.

broker_commit_interval

type:

float / timedelta

default:

2.8

environment:

BROKER_COMMIT_INTERVAL

Broker commit time frequency.

How often we commit messages that have been fully processed (acked).
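How broker_commit_every and broker_commit_interval complement each other can be sketched as a count-or-time trigger. The CommitPolicy class below is a simplified, hypothetical illustration and not Faust's implementation: it reports that a commit is due once enough messages have been acked, or once the interval has elapsed with at least one acked message pending.

```python
import time
from typing import Callable


class CommitPolicy:
    """Hypothetical sketch of a count-or-time commit trigger."""

    def __init__(self,
                 commit_every: int = 10_000,
                 commit_interval: float = 2.8,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.commit_every = commit_every
        self.commit_interval = commit_interval
        self._clock = clock
        self._acked_since_commit = 0
        self._last_commit = clock()

    def on_ack(self) -> bool:
        """Record one acked message; return True if a commit is due."""
        self._acked_since_commit += 1
        return self.should_commit()

    def should_commit(self) -> bool:
        if self._acked_since_commit == 0:
            return False  # nothing new to commit
        if self._acked_since_commit >= self.commit_every:
            return True   # busy stream: the message count triggers first
        # quiet stream: the timer triggers instead
        return self._clock() - self._last_commit >= self.commit_interval

    def committed(self) -> None:
        """Reset the counters after a commit has been performed."""
        self._acked_since_commit = 0
        self._last_commit = self._clock()
```

The clock parameter exists only so that the behavior can be exercised with a fake clock instead of real waiting.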

broker_commit_livelock_soft_timeout

type:

float / timedelta

default:

300.0

environment:

BROKER_COMMIT_LIVELOCK_SOFT_TIMEOUT

Commit livelock timeout.

How long to wait before warning that the Kafka commit offset has not advanced (only when processing messages).

broker_heartbeat_interval

New in version 1.0.11.

type:

float / timedelta

default:

3.0

environment:

BROKER_HEARTBEAT_INTERVAL

Broker heartbeat interval.

How often we send heartbeats to the broker, and also how often we expect to receive heartbeats from the broker.

If any of these time out, you should increase this setting.

broker_max_poll_interval

New in version 1.7.

type:

float / timedelta

default:

1000.0

environment:

BROKER_MAX_POLL_INTERVAL

Broker max poll interval.

The maximum allowed time (in seconds) between calls to consume messages. If this interval is exceeded, the consumer is considered failed and the group will rebalance in order to reassign the partitions to another consumer group member. If API methods block waiting for messages, that time does not count against this timeout.

See KIP-62 for technical details.

broker_max_poll_records

New in version 1.4.

type:

int

default:

None

environment:

BROKER_MAX_POLL_RECORDS

Broker max poll records.

The maximum number of records returned in a single call to poll(). If you find that your application needs more time to process messages you may want to adjust broker_max_poll_records to tune the number of records that must be handled on every loop iteration.

broker_rebalance_timeout

New in version 1.10.

type:

float / timedelta

default:

60.0

environment:

BROKER_REBALANCE_TIMEOUT

Broker rebalance timeout.

How long to wait for a node to finish rebalancing before the broker will consider it dysfunctional and remove it from the cluster.

Increase this if you experience the cluster being in a state of constantly rebalancing, but make sure you also increase the broker_heartbeat_interval at the same time.

Note

The session timeout must not be greater than the broker_request_timeout.

broker_request_timeout

New in version 1.4.

type:

float / timedelta

default:

90.0

environment:

BROKER_REQUEST_TIMEOUT

Kafka client request timeout.

Note

The request timeout must not be less than the broker_session_timeout.

broker_session_timeout

New in version 1.0.11.

type:

float / timedelta

default:

60.0

environment:

BROKER_SESSION_TIMEOUT

Broker session timeout.

How long to wait for a node to finish rebalancing before the broker will consider it dysfunctional and remove it from the cluster.

Increase this if you experience the cluster being in a state of constantly rebalancing, but make sure you also increase the broker_heartbeat_interval at the same time.

Note

The session timeout must not be greater than the broker_request_timeout.
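The relationship between the two timeouts can be checked before the app is created. The function below is a hypothetical sanity check written for this example, not a Faust API; it encodes only the rule stated in the notes above, with the documented defaults as parameter defaults.

```python
from typing import List


def check_broker_timeouts(session_timeout: float = 60.0,
                          request_timeout: float = 90.0) -> List[str]:
    """Hypothetical helper: return a list of detected problems."""
    problems: List[str] = []
    if session_timeout > request_timeout:
        problems.append(
            f'broker_session_timeout ({session_timeout}) must not be greater '
            f'than broker_request_timeout ({request_timeout})'
        )
    return problems
```

An empty list means the pair of values satisfies the documented constraint; raising broker_session_timeout therefore usually means raising broker_request_timeout as well.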

Advanced Consumer Settings

consumer_api_version

New in version 1.10.

type:

str

default (alias to setting):

broker_api_version

environment:

CONSUMER_API_VERSION

Consumer API version.

Configures the broker API version to use for consumers. See broker_api_version for more information.

consumer_max_fetch_size

New in version 1.4.

type:

int

default:

1048576

environment:

CONSUMER_MAX_FETCH_SIZE

Consumer max fetch size.

The maximum amount of data per-partition the server will return. This size must be at least as large as the maximum message size.

Note: This is PER PARTITION, so a limit of 1 MB when your workers consume from 10 topics having 100 partitions each means a fetch request can be up to a gigabyte (10 * 100 * 1 MB). Setting this limit too generously may cause rebalancing issues if the time required to flush pending data stuck in socket buffers exceeds the rebalancing timeout.

You must keep this limit low enough to account for many partitions being assigned to a single node.
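The arithmetic in the note can be reproduced for your own topology. The helper name below is made up for this example; it multiplies the per-partition limit by the number of partitions a single worker could be assigned in the worst case.

```python
def worst_case_fetch_bytes(topics: int,
                           partitions_per_topic: int,
                           max_fetch_size: int = 1_048_576) -> int:
    """Upper bound on one fetch if every partition returns a full chunk."""
    return topics * partitions_per_topic * max_fetch_size


total = worst_case_fetch_bytes(topics=10, partitions_per_topic=100)
total_mib = total / (1024 * 1024)
```

For 10 topics with 100 partitions each and the default limit, the bound is 1000 MiB, which is the "up to a gigabyte" figure mentioned above.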

consumer_auto_offset_reset

New in version 1.5.

type:

str

default:

'earliest'

environment:

CONSUMER_AUTO_OFFSET_RESET

Consumer auto offset reset.

Where the consumer should start reading messages from when there is no initial offset, or the stored offset no longer exists, e.g. when starting a new consumer for the first time.

Options include ‘earliest’, ‘latest’, ‘none’.

consumer_group_instance_id

New in version 2.1.

type:

str

default:

None

environment:

CONSUMER_GROUP_INSTANCE_ID

Consumer group instance id.

The group_instance_id for static partition assignment.

If not set, default assignment strategy is used. Otherwise, each consumer instance has to have a unique id.

consumer_metadata_max_age_ms

New in version 0.8.5.

type:

int

default:

300000

environment:

CONSUMER_METADATA_MAX_AGE_MS

Consumer metadata max age milliseconds.

The period of time in milliseconds after which we force a refresh of metadata even if we haven’t seen any partition leadership changes to proactively discover any new brokers or partitions.

Default: 300000

consumer_connections_max_idle_ms

New in version 0.8.5.

type:

int

default:

540000

environment:

CONSUMER_CONNECTIONS_MAX_IDLE_MS

Consumer connections max idle milliseconds.

Close idle connections after the number of milliseconds specified by this config.

Default: 540000 (9 minutes).

ConsumerScheduler

New in version 1.5.

type:

str / Type

default:

'faust.transport.utils:DefaultSchedulingStrategy'

Consumer scheduler class.

A strategy which dictates the priority of topics and partitions for incoming records. The default strategy does first round-robin over topics and then round-robin over partitions.

Example using a class:

class MySchedulingStrategy(DefaultSchedulingStrategy):
    ...

app = App(..., ConsumerScheduler=MySchedulingStrategy)

Example using the string path to a class:

app = App(..., ConsumerScheduler='myproj.MySchedulingStrategy')
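To make the default ordering concrete, the sketch below interleaves records round-robin across partitions within each topic, and then round-robin across topics. It is a standalone illustration of the idea and does not use or reproduce Faust's DefaultSchedulingStrategy; the schedule function and the record layout (a dict keyed by topic and partition) are assumptions made for this example.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, Iterable, Iterator, List, Tuple

_MISSING = object()


def round_robin(iterables: Iterable[Iterable]) -> Iterator:
    """Yield one item from each iterable in turn until all are exhausted."""
    for group in zip_longest(*iterables, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item


def schedule(records: Dict[Tuple[str, int], List[str]]) -> Iterator[str]:
    """Hypothetical sketch: order records fairly across topics and partitions."""
    by_topic: Dict[str, List[List[str]]] = defaultdict(list)
    for (topic, _partition), messages in records.items():
        by_topic[topic].append(messages)
    # Inner round-robin: partitions of the same topic take turns.
    per_topic = [round_robin(partitions) for partitions in by_topic.values()]
    # Outer round-robin: topics take turns.
    return round_robin(per_topic)


records = {
    ('a', 0): ['a0-1', 'a0-2'],
    ('a', 1): ['a1-1'],
    ('b', 0): ['b0-1', 'b0-2'],
}
ordered = list(schedule(records))
```

In this example no single topic or partition can starve the others: the output alternates between topics a and b, and within topic a it alternates between partitions 0 and 1.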

Serialization Settings

key_serializer

type:

str / CodecT

default:

'raw'

environment:

APP_KEY_SERIALIZER

Default key serializer.

Serializer used for keys by default when no serializer is specified, or a model is not being used.

This can be the name of a serializer/codec, or an actual faust.serializers.codecs.Codec instance.

See also

  • The Codecs section in the model guide – for more information about codecs.

value_serializer

type:

str / CodecT

default:

'json'

environment:

APP_VALUE_SERIALIZER

Default value serializer.

Serializer used for values by default when no serializer is specified, or a model is not being used.

This can be the name of a serializer/codec, or an actual faust.serializers.codecs.Codec instance.

See also

  • The Codecs section in the model guide – for more information about codecs.

Advanced Producer Settings

producer_acks

type:

int

default:

-1

environment:

PRODUCER_ACKS

Producer Acks.

The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are common:

  • 0: Producer will not wait for any acknowledgment from the server at all. The message will immediately be considered sent (not recommended).

  • 1: The broker leader will write the record to its local log but will respond without awaiting full acknowledgment from all followers. In this case, should the leader fail immediately after acknowledging the record but before the followers have replicated it, the record will be lost.

  • -1: The broker leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee.

producer_api_version

New in version 1.5.3.

type:

str

default (alias to setting):

broker_api_version

environment:

PRODUCER_API_VERSION

Producer API version.

Configures the broker API version to use for producers. See broker_api_version for more information.

producer_compression_type

type:

str

default:

None

environment:

PRODUCER_COMPRESSION_TYPE

Producer compression type.

The compression type for all data generated by the producer. Valid values are gzip, snappy, lz4, or None.

producer_linger

type:

float / timedelta

default:

None

environment:

PRODUCER_LINGER

Producer batch linger configuration.

Minimum time to batch before sending out messages from the producer.

Should rarely have to change this.

producer_max_batch_size

type:

int

default:

16384

environment:

PRODUCER_MAX_BATCH_SIZE

Producer max batch size.

Max size of each producer batch, in bytes.

producer_max_request_size

type:

int

default:

1000000

environment:

PRODUCER_MAX_REQUEST_SIZE

Producer maximum request size.

Maximum size of a request in bytes in the producer.

Should rarely have to change this.

producer_partitioner

New in version 1.2.

type:

str / Type

default:

None

Producer partitioning strategy.

The Kafka producer can be configured with a custom partitioner to change how keys are partitioned when producing to topics.

The default partitioner for Kafka is implemented as follows, and can be used as a template for your own partitioner:

import random
from typing import List
from kafka.partitioner.hashed import murmur2

def partition(key: bytes,
              all_partitions: List[int],
              available: List[int]) -> int:
    '''Default partitioner.

    Hashes key to partition using murmur2 hashing
    (from the Java client). If key is None, selects a partition
    randomly from available, or from all partitions if none
    are currently available.

    Arguments:
        key: partitioning key
        all_partitions: list of all partitions sorted by
                        partition ID.
        available: list of available partitions
                   in no particular order
    Returns:
        int: one of the values from ``all_partitions``
             or ``available``.
    '''
    if key is None:
        source = available if available else all_partitions
        return random.choice(source)
    index: int = murmur2(key)
    index &= 0x7fffffff
    index %= len(all_partitions)
    return all_partitions[index]

producer_request_timeout

New in version 1.4.

type:

float / timedelta

default:

1200.0

environment:

PRODUCER_REQUEST_TIMEOUT

Producer request timeout.

Timeout for producer operations. This is set high by default, as this is also the time when producer batches expire and will no longer be retried.

producer_threaded

New in version 0.4.5.

type:

bool

default:

False

environment:

PRODUCER_THREADED

Use a separate producer thread for send_soon.

If True, spin up a separate producer in a different thread, to be used for messages buffered up for producing via the send_soon function.

producer_metadata_max_age_ms

New in version 0.8.5.

type:

int

default:

300000

environment:

PRODUCER_METADATA_MAX_AGE_MS

Producer metadata max age milliseconds.

The period of time in milliseconds after which we force a refresh of metadata even if we haven’t seen any partition leadership changes to proactively discover any new brokers or partitions.

Default: 300000

producer_connections_max_idle_ms

New in version 0.8.5.

type:

int

default:

540000

environment:

PRODUCER_CONNECTIONS_MAX_IDLE_MS

Producer connections max idle milliseconds.

Close idle connections after the number of milliseconds specified by this config.

Default: 540000 (9 minutes).

Advanced Stream Settings

recovery_consistency_check

New in version 0.4.7.

type:

bool

default:

True

environment:

RECOVERY_CONSISTENCY_CHECK

Check Kafka and local offsets for consistency.

If True, assert that Kafka highwater offsets >= local offset in the rocksdb state store.

store_check_exists

New in version 0.6.0.

type:

bool

default:

True

environment:

STORE_CHECK_EXISTS

Execute exists on the underlying store.

If True, executes exists on the underlying store. If False, the client has to catch KeyError.

crash_app_on_aerospike_exception

New in version 0.6.3.

type:

bool

default:

True

environment:

CRASH_APP_ON_AEROSPIKE_EXCEPTION

Crash the app on Aerospike exceptions.

If True, crashes the app and prevents the commit offset from progressing. If False, the client has to catch the error and implement a dead letter queue.

aerospike_retries_on_exception

New in version 0.6.10.

type:

int

default:

60

environment:

AEROSPIKE_RETRIES_ON_EXCEPTION

Number of retries to aerospike on a runtime error from the aerospike client.

Set this to the number of retries using the aerospike client on a runtime Exception thrown by the client

aerospike_sleep_seconds_between_retries_on_exception

New in version 0.6.10.

type:

int

default:

1

environment:

AEROSPIKE_SLEEP_SECONDS_BETWEEN_RETRIES_ON_EXCEPTION

Seconds to sleep between retries to aerospike on a runtime error from the aerospike client.

Set this to the sleep in seconds between retries using the aerospike client on a runtime Exception thrown by the client
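Taken together, the two Aerospike retry settings describe a bounded retry loop with a fixed pause. The sketch below shows that general pattern in plain Python; call_with_retries is a hypothetical helper, RuntimeError stands in for whatever exception the Aerospike client raises, and none of this is taken from Faust's Aerospike store.

```python
import time
from typing import Callable, TypeVar

T = TypeVar('T')


def call_with_retries(operation: Callable[[], T],
                      retries: int = 60,
                      sleep_seconds: float = 1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Hypothetical helper: retry operation up to `retries` extra times."""
    attempt = 0
    while True:
        try:
            return operation()
        except RuntimeError:
            attempt += 1
            if attempt > retries:
                # Out of retries: let the error propagate to the caller.
                raise
            # Fixed pause between attempts.
            sleep(sleep_seconds)
```

With the documented defaults (60 retries, 1 second apart), an operation that keeps failing is given roughly a minute before the error propagates.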

stream_buffer_maxsize

type:

int

default:

4096

environment:

STREAM_BUFFER_MAXSIZE

Stream buffer maximum size.

This setting controls back pressure to streams and agents reading from streams.

If set to 4096 (default) this means that an agent can only keep at most 4096 unprocessed items in the stream buffer.

Essentially this will limit the number of messages a stream can “prefetch”.
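The effect of a bounded buffer can be observed with a plain asyncio.Queue, used here as a stand-in for the stream buffer (an assumption of this sketch, not a description of Faust internals). Once maxsize items are waiting, the producer side has to wait until a consumer takes something out.

```python
import asyncio


async def fill_without_consumer(maxsize: int = 4) -> int:
    """Count how many items a bounded buffer accepts with no consumer."""
    buffer: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    accepted = 0
    for item in range(maxsize + 2):
        try:
            # put() waits while the buffer is full; the short timeout only
            # makes that waiting observable in this example.
            await asyncio.wait_for(buffer.put(item), timeout=0.05)
            accepted += 1
        except asyncio.TimeoutError:
            break
    return accepted


if __name__ == '__main__':
    print(asyncio.run(fill_without_consumer()))
```

Only maxsize items are accepted; everything beyond that is held back until the buffer drains, which is the back pressure this setting bounds.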

Higher numbers give better throughput, but note the cost if your agent sends messages or updates tables (which sends changelog messages): when the buffer size is large, the broker_commit_interval or broker_commit_every settings must be set to commit frequently, to keep back pressure from building up.

A buffer size of 131_072 may let you process over 30,000 events a second as a baseline, but be careful with a buffer size that large when you also send messages or update tables.

stream_processing_timeout

New in version 1.10.

type:

float / timedelta

default:

300.0

environment:

STREAM_PROCESSING_TIMEOUT

Stream processing timeout.

Timeout (in seconds) for processing events in the stream. If processing of a single event exceeds this time we log an error, but do not stop processing.

If you are seeing a warning like this you should either

  1. increase this timeout to allow agents to spend more time on a single event, or

  2. add a timeout to the operation in the agent, so stream processing always completes before the timeout.

The latter is preferred for network operations such as web requests. If a network service you depend on is temporarily offline you should consider doing retries (send to separate topic):

import asyncio
from typing import Optional

main_topic = app.topic('main')
deadletter_topic = app.topic('main_deadletter')

async def send_request(value, timeout: Optional[float] = None) -> None:
    await app.http_client.get('http://foo.com', timeout=timeout)

@app.agent(main_topic)
async def main(stream):
    async for value in stream:
        try:
            await send_request(value, timeout=5)
        except asyncio.TimeoutError:
            # hand the event over to the retry agent
            await deadletter_topic.send(value=value)

@app.agent(deadletter_topic)
async def main_deadletter(stream):
    async for value in stream:
        # wait for 30 seconds before retrying.
        await stream.sleep(30)
        await send_request(value)

stream_publish_on_commit

type:

bool

default:

False

Stream delay producing until commit time.

If enabled we buffer up sending messages until the source topic offset related to that processing is committed. This means when we do commit, we may have buffered up a LOT of messages so commit needs to happen frequently (make sure to decrease broker_commit_every).

stream_recovery_delay

New in version 1.3.

Changed in version 1.5.3: Disabled by default.

type:

float / timedelta

default:

0.0

environment:

STREAM_RECOVERY_DELAY

Stream recovery delay.

Number of seconds to sleep before continuing after a rebalance. We wait for a bit to allow more nodes to join/leave before starting to recover tables and then processing streams. This is to minimize the chance of errors and rebalancing loops.

stream_wait_empty

type:

bool

default:

True

environment:

STREAM_WAIT_EMPTY

Stream wait empty.

This setting controls whether the worker should wait for the currently processing task in an agent to complete before rebalancing or shutting down.

On rebalance/shut down we clear the stream buffers. Those events will be reprocessed after the rebalance anyway, but we may have already started processing one event in every agent, and if we rebalance we will process that event again.

By default we will wait for the currently active tasks, but if your streams are idempotent you can disable it using this setting.

Agent RPC Settings

reply_create_topic

type:

bool

default:

False

environment:

APP_REPLY_CREATE_TOPIC

Automatically create reply topics.

Set this to True if you plan on using the RPC with agents.

This will create the internal topic used for RPC replies on that instance at startup.

reply_expires

type:

float / timedelta

default:

86400.0

environment:

APP_REPLY_EXPIRES

RPC reply expiry time in seconds.

The expiry time (a float in seconds, or a timedelta) for how long replies will stay in the instance's local reply topic before being removed.
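
Several settings in this reference are typed float / timedelta. As a rough illustration of what that means, the sketch below normalizes either form to seconds. The to_seconds helper is hypothetical and is not part of the Faust API:

```python
from datetime import timedelta
from typing import Union

Seconds = Union[float, timedelta]


def to_seconds(value: Seconds) -> float:
    # Hypothetical helper: accept either a number of seconds or a
    # timedelta and always return a float number of seconds.
    if isinstance(value, timedelta):
        return value.total_seconds()
    return float(value)


one_day = to_seconds(timedelta(days=1))
same_day = to_seconds(86400.0)
```

Under this reading, passing timedelta(days=1) and passing 86400.0 describe the same expiry time.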

reply_to

type:

str

default:

None

Reply to address.

The name of the reply topic used by this instance. If not set one will be automatically generated when the app is created.

reply_to_prefix

type:

str

default:

'f-reply-'

environment:

APP_REPLY_TO_PREFIX

Reply address topic name prefix.

The prefix used when generating reply topic names.

Advanced Table Settings

table_cleanup_interval

type:

float / timedelta

default:

30.0

environment:

TABLE_CLEANUP_INTERVAL

Table cleanup interval.

How often we clean up tables to remove expired entries.

table_key_index_size

New in version 1.7.

type:

int

default:

1000

environment:

TABLE_KEY_INDEX_SIZE

Table key index size.

Tables keep a cache of key to partition number to speed up table lookups.

This setting configures the maximum size of that cache.
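
To make the idea of a bounded key-to-partition cache concrete, here is a small standard-library sketch. It assumes least-recently-used eviction; the KeyIndex class is hypothetical and only illustrates the concept, not Faust's internal implementation:

```python
from collections import OrderedDict
from typing import Optional


class KeyIndex:
    """Hypothetical bounded key -> partition cache (LRU eviction assumed)."""

    def __init__(self, max_size: int = 1000) -> None:
        self.max_size = max_size
        self._data: "OrderedDict[bytes, int]" = OrderedDict()

    def get(self, key: bytes) -> Optional[int]:
        if key not in self._data:
            return None
        # Mark the key as recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def set(self, key: bytes, partition: int) -> None:
        self._data[key] = partition
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            # Drop the least recently used entry.
            self._data.popitem(last=False)


index = KeyIndex(max_size=2)
index.set(b'a', 0)
index.set(b'b', 1)
index.get(b'a')       # marks b'a' as recently used
index.set(b'c', 2)    # cache is full: evicts b'b'
```

A larger index trades memory for fewer partition lookups on frequently accessed keys.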

table_standby_replicas

type:

int

default:

1

environment:

TABLE_STANDBY_REPLICAS

Table standby replicas.

The number of standby replicas for each table.

Topic Settings

topic_allow_declare

New in version 1.5.

type:

bool

default:

True

environment:

TOPIC_ALLOW_DECLARE

Allow creating new topics.

Setting this to False disables the creation of internal topics.

Faust will only create topics that it considers to be fully owned and managed, such as intermediate repartition topics, table changelog topics etc.

Some Kafka managers do not allow services to create topics; in that case you should set this to False.

topic_disable_leader

New in version 1.7.

type:

bool

default:

False

environment:

TOPIC_DISABLE_LEADER

Disable leader election topic.

This setting disables the creation of the leader election topic.

If you’re not using the on_leader=True argument to the task/timer/etc. decorators, then use this setting to disable creation of the topic.

topic_partitions

type:

int

default:

8

environment:

TOPIC_PARTITIONS

Topic partitions.

Default number of partitions for new topics.

Note

This defines the maximum number of workers we can distribute the workload of the application across (also sometimes referred to as the sharding factor of the application).
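
The arithmetic behind this note can be sketched with a simplified round-robin assignment. The assign_round_robin function is hypothetical, and the real partition assignor is more involved, but the limit it shows is the same: with 8 partitions, a ninth and tenth worker receive nothing.

```python
from typing import Dict, List


def assign_round_robin(partitions: int, workers: int) -> Dict[int, List[int]]:
    # Hypothetical, simplified assignment: each partition goes to
    # exactly one worker, in round-robin order.
    assignment: Dict[int, List[int]] = {w: [] for w in range(workers)}
    for partition in range(partitions):
        assignment[partition % workers].append(partition)
    return assignment


assignment = assign_round_robin(partitions=8, workers=10)
idle_workers = [w for w, parts in assignment.items() if not parts]
```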

topic_replication_factor

type:

int

default:

1

environment:

TOPIC_REPLICATION_FACTOR

Topic replication factor.

The default replication factor for topics created by the application.

Note

Generally this should be the same as the configured replication factor for your Kafka cluster.

Advanced Web Server Settings

web

New in version 1.2.

type:

str / URL

default:

'aiohttp://'

Web server driver to use.

web_bind

New in version 1.2.

type:

str

default:

'0.0.0.0'

environment:

WEB_BIND

related-command-options:

faust worker --web-bind

Web network interface binding mask.

The IP network address mask that decides what interfaces the web server will bind to.

By default this will bind to all interfaces.

This option is usually set by faust worker --web-bind, not by passing it as a keyword argument to app.

web_cors_options

New in version 1.5.

type:

dict

default:

None

Cross Origin Resource Sharing options.

Enable Cross-Origin Resource Sharing options for all web views in the internal web server.

This should be specified as a dictionary of URLs to ResourceOptions:

app = App(..., web_cors_options={
    'http://foo.example.com': ResourceOptions(
        allow_credentials=True,
        allow_methods='*',
    )
})

Individual views may override the CORS options by passing them as arguments to @app.page and blueprint.route.

web_enabled

New in version 1.2.

type:

bool

default:

True

environment:

APP_WEB_ENABLED

related-command-options:

faust worker --with-web

Enable/disable internal web server.

Enable web server and other web components.

This option can also be set using faust worker --without-web.

web_host

New in version 1.2.

type:

str

default (template):

'{conf.NODE_HOSTNAME}'

environment:

WEB_HOST

related-command-options:

faust worker --web-host

Web server host name.

Hostname used to access this web server, used for generating the canonical_url setting.

This option is usually set by faust worker --web-host, not by passing it as a keyword argument to app.

web_in_thread

New in version 1.5.

type:

bool

default:

False

Run the web server in a separate thread.

Use this if you have a large value for stream_buffer_maxsize and want the web server to be responsive when the worker is otherwise busy processing streams.

Note

Running the web server in a separate thread means web views and agents will not share the same event loop.

web_port

New in version 1.2.

type:

int

default:

6066

environment:

WEB_PORT

related-command-options:

faust worker --web-port

Web server port.

A port number between 1024 and 65535 to use for the web server.

This option is usually set by faust worker --web-port, not by passing it as a keyword argument to app.

web_ssl_context

New in version 0.5.0.

type:

SSLContext

default:

None

Web server SSL configuration.

See credentials.

web_transport

New in version 1.2.

type:

str / URL

default:

URL('tcp:')

related-command-options:

faust worker --web-transport

Network transport used for the web server.

Default is to use TCP, but this setting also enables you to use Unix domain sockets. To use domain sockets specify a URL including the path to the file you want to create like this:

unix:///tmp/server.sock

This will create a new domain socket available in /tmp/server.sock.
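
To see how the two URL forms differ, the following sketch splits them with the standard library. The describe_transport helper is hypothetical and only illustrates how the scheme and path can be read from the setting value:

```python
from urllib.parse import urlsplit


def describe_transport(url: str) -> str:
    # Hypothetical helper: the scheme selects the transport, and for
    # Unix domain sockets the path names the socket file.
    parts = urlsplit(url)
    if parts.scheme == 'unix':
        return f'unix socket at {parts.path}'
    return f'{parts.scheme} transport'


unix_desc = describe_transport('unix:///tmp/server.sock')
tcp_desc = describe_transport('tcp:')
```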

canonical_url

type:

str / URL

default (template):

'http://{conf.web_host}:{conf.web_port}'

environment:

NODE_CANONICAL_URL

related-command-options:

faust worker --web-host, faust worker --web-port

related-settings:

web_host, web_port

Node specific canonical URL.

You shouldn’t have to set this manually.

The canonical URL defines how to reach the web server on a running worker node, and is usually set by combining the web_host and web_port settings.
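
As an illustration of how the default template combines the two settings, the sketch below expands it with str.format. The conf object here is a stand-in built with SimpleNamespace and the host name is made up; it is assumed, not confirmed, that Faust expands templates in the same way:

```python
from types import SimpleNamespace

# Default template as shown above.
CANONICAL_URL_TEMPLATE = 'http://{conf.web_host}:{conf.web_port}'

# Stand-in for the app configuration (hypothetical values).
conf = SimpleNamespace(web_host='worker1.example.com', web_port=6066)

canonical_url = CANONICAL_URL_TEMPLATE.format(conf=conf)
```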

Advanced Worker Settings

worker_redirect_stdouts

type:

bool

default:

True

environment:

WORKER_REDIRECT_STDOUTS

Redirecting standard outputs.

Enable to have the worker redirect anything written to sys.stdout and sys.stderr to the Python logging system.

Enabled by default.

worker_redirect_stdouts_level

type:

str / int

default:

'WARN'

environment:

WORKER_REDIRECT_STDOUTS_LEVEL

Level used when redirecting standard outputs.

The logging level to use when redirecting STDOUT/STDERR to logging.
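
The general technique of redirecting standard output to logging can be sketched with the standard library alone. LoggingWriter and ListHandler are hypothetical classes written for this illustration; they are not how the worker implements the redirect:

```python
import contextlib
import io
import logging


class LoggingWriter(io.TextIOBase):
    """Hypothetical file-like object forwarding writes to a logger."""

    def __init__(self, logger: logging.Logger, level: int) -> None:
        super().__init__()
        self.logger = logger
        self.level = level

    def write(self, text: str) -> int:
        # Log every non-empty line at the configured level.
        for line in text.splitlines():
            if line.strip():
                self.logger.log(self.level, line)
        return len(text)


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger('redirect_sketch')
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# 'WARN' is the default level of this setting.
with contextlib.redirect_stdout(LoggingWriter(logger, logging.WARN)):
    print('hello from print')
```

After the with block, the printed line is available as a log record at the WARN level rather than appearing on the terminal.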

Extension Settings

Agent

type:

str / Type

default:

'faust:Agent'

Agent class type.

The Agent class to use for agents, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyAgent(faust.Agent):
    ...

app = App(..., Agent=MyAgent)

Example using the string path to a class:

app = App(..., Agent='myproj.agents.Agent')
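
All settings in this section accept either a class or a string path to one. The sketch below approximates how such a path could be resolved using importlib. The resolve_symbol function is hypothetical and is not the symbol_by_name() implementation; it only shows the two path styles used in this reference ('module:Name' and 'module.Name'):

```python
import collections
import importlib
from typing import Any


def resolve_symbol(path: str) -> Any:
    # Hypothetical approximation: split the path into a module name
    # and an attribute name, import the module, then look up the name.
    if ':' in path:
        module_name, _, attr = path.partition(':')
    else:
        module_name, _, attr = path.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr)


by_colon = resolve_symbol('collections:OrderedDict')
by_dots = resolve_symbol('collections.OrderedDict')
```

Because the module is only imported when the path is resolved, a string path lets you refer to a class without importing it in your configuration module.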

Event

type:

str / Type

default:

'faust:Event'

Event class type.

The Event class to use for creating new event objects, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseEvent(faust.Event):
    ...

app = App(..., Event=MyBaseEvent)

Example using the string path to a class:

app = App(..., Event='myproj.events.Event')

Schema

type:

str / Type

default:

'faust:Schema'

Schema class type.

The Schema class to use as the default schema type when no schema is specified, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseSchema(faust.Schema):
    ...

app = App(..., Schema=MyBaseSchema)

Example using the string path to a class:

app = App(..., Schema='myproj.schemas.Schema')

Stream

type:

str / Type

default:

'faust:Stream'

Stream class type.

The Stream class to use for streams, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseStream(faust.Stream):
    ...

app = App(..., Stream=MyBaseStream)

Example using the string path to a class:

app = App(..., Stream='myproj.streams.Stream')

Table

type:

str / Type

default:

'faust:Table'

Table class type.

The Table class to use for tables, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseTable(faust.Table):
    ...

app = App(..., Table=MyBaseTable)

Example using the string path to a class:

app = App(..., Table='myproj.tables.Table')

SetTable

type:

str / Type

default:

'faust:SetTable'

SetTable extension table.

The SetTable class to use for table-of-set tables, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MySetTable(faust.SetTable):
    ...

app = App(..., SetTable=MySetTable)

Example using the string path to a class:

app = App(..., SetTable='myproj.tables.MySetTable')

GlobalTable

type:

str / Type

default:

'faust:GlobalTable'

GlobalTable class type.

The GlobalTable class to use for tables, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseGlobalTable(faust.GlobalTable):
    ...

app = App(..., GlobalTable=MyBaseGlobalTable)

Example using the string path to a class:

app = App(..., GlobalTable='myproj.tables.GlobalTable')

SetGlobalTable

type:

str / Type

default:

'faust:SetGlobalTable'

SetGlobalTable class type.

The SetGlobalTable class to use for tables, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

class MyBaseSetGlobalTable(faust.SetGlobalTable):
    ...

app = App(..., SetGlobalTable=MyBaseSetGlobalTable)

Example using the string path to a class:

app = App(..., SetGlobalTable='myproj.tables.SetGlobalTable')

TableManager

type:

str / Type

default:

'faust.tables:TableManager'

Table manager class type.

The TableManager used for managing tables, or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

from faust.tables import TableManager

class MyTableManager(TableManager):
    ...

app = App(..., TableManager=MyTableManager)

Example using the string path to a class:

app = App(..., TableManager='myproj.tables.TableManager')

Serializers

type:

str / Type

default:

'faust.serializers:Registry'

Serializer registry class type.

The Registry class used for serializing/deserializing messages; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

from faust.serializers import Registry

class MyRegistry(Registry):
    ...

app = App(..., Serializers=MyRegistry)

Example using the string path to a class:

app = App(..., Serializers='myproj.serializers.Registry')

Worker

type:

str / Type

default:

'faust.worker:Worker'

Worker class type.

The Worker class used for starting a worker for this app; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

import faust

class MyWorker(faust.Worker):
    ...

app = faust.App(..., Worker=MyWorker)

Example using the string path to a class:

app = faust.App(..., Worker='myproj.workers.Worker')

PartitionAssignor

type:

str / Type

default:

'faust.assignor:PartitionAssignor'

Partition assignor class type.

The PartitionAssignor class used for assigning topic partitions to worker instances; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

from faust.assignor import PartitionAssignor

class MyPartitionAssignor(PartitionAssignor):
    ...

app = App(..., PartitionAssignor=MyPartitionAssignor)

Example using the string path to a class:

app = App(..., PartitionAssignor='myproj.assignor.PartitionAssignor')

LeaderAssignor

type:

str / Type

default:

'faust.assignor:LeaderAssignor'

Leader assignor class type.

The LeaderAssignor class used for assigning a master Faust instance for the app; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

from faust.assignor import LeaderAssignor

class MyLeaderAssignor(LeaderAssignor):
    ...

app = App(..., LeaderAssignor=MyLeaderAssignor)

Example using the string path to a class:

app = App(..., LeaderAssignor='myproj.assignor.LeaderAssignor')

Router

type:

str / Type

default:

'faust.app.router:Router'

Router class type.

The Router class used for routing requests to a worker instance having the partition for a specific key (e.g. table key); or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

from faust.app.router import Router

class MyRouter(Router):
    ...

app = App(..., Router=MyRouter)

Example using the string path to a class:

app = App(..., Router='myproj.routers.Router')

Topic

type:

str / Type

default:

'faust:Topic'

Topic class type.

The Topic class used for defining new topics; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

import faust

class MyTopic(faust.Topic):
    ...

app = faust.App(..., Topic=MyTopic)

Example using the string path to a class:

app = faust.App(..., Topic='myproj.topics.Topic')

HttpClient

type:

str / Type

default:

'aiohttp.client:ClientSession'

HTTP client class type.

The aiohttp.client.ClientSession class used as an HTTP client; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

import faust
from aiohttp.client import ClientSession

class HttpClient(ClientSession):
    ...

app = faust.App(..., HttpClient=HttpClient)

Example using the string path to a class:

app = faust.App(..., HttpClient='myproj.http.HttpClient')

Monitor

type:

str / Type

default:

'faust.sensors:Monitor'

Monitor sensor class type.

The Monitor class used as the main sensor gathering statistics for the application; or the fully-qualified path to one (supported by symbol_by_name()).

Example using a class:

import faust
from faust.sensors import Monitor

class MyMonitor(Monitor):
    ...

app = faust.App(..., Monitor=MyMonitor)

Example using the string path to a class:

app = faust.App(..., Monitor='myproj.monitors.Monitor')