Report on the assignment “Deploying a Machine Learning Model”

An application for sentiment analysis of input text was developed using FastAPI, Celery, Redis and RabbitMQ.

The application provides an asynchronous API, designed for high load, for interacting with a machine learning model. The example uses a pretrained Transformer model to predict the sentiment of the input text, but any ML model can be used, since the application skeleton does not depend on a particular model.
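The idea of a model-independent skeleton can be illustrated with a minimal sketch. The names SentimentModel, KeywordModel and run_prediction are hypothetical and do not come from the repository; KeywordModel is a toy stand-in, not the Transformer used in the project.

```python
from typing import Protocol


class SentimentModel(Protocol):
    """Hypothetical contract: any object with predict(text) -> dict."""

    def predict(self, text: str) -> dict: ...


class KeywordModel:
    """Toy stand-in for a real model (assumption, for illustration only)."""

    NEGATIVE = {"bad", "hungry", "sad"}

    def predict(self, text: str) -> dict:
        # Normalize words: strip basic punctuation and lowercase them.
        words = {w.strip(".,!?").lower() for w in text.split()}
        label = "negative" if words & self.NEGATIVE else "neutral"
        return {"label": label, "score": 1.0}


def run_prediction(model: SentimentModel, text: str) -> dict:
    # The service code depends only on the predict() contract,
    # so any model object that satisfies it can be plugged in.
    return model.predict(text)


if __name__ == "__main__":
    print(run_prediction(KeywordModel(), "Damn! I'm hungry!"))
```

Swapping the model then means passing a different object to run_prediction; the calling code stays unchanged.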

Repository

The application is a self-contained “project within a project”. The primary maintained version lives in a separate repository, “ML Service with FastAPI, Celery, Redis and RabbitMQ”.

The /mlservice directory of the current repository holds a copy of one working version. If you need to run the application from this repository (not recommended), first run cd mlservice.

Project structure

cd mlservice

├── docker-compose.yml          # Docker container management
├── pyproject.toml              # Dependencies
└── src
    ├── app.py                  # Main app with FastAPI initialization
    ├── constansts.py           # Global app constants
    ├── Dockerfile              # Docker container for app and worker
    ├── celery                  # Package with Celery and its worker
    │   ├── start.py            # Celery initialization
    │   └── worker.py           # Celery worker
    ├── schemas                 # Package with data models
    │   ├── healthcheck.py      # Schema for service health state responses
    │   └── prediction.py       # Schema for input requests to the API
    └── services                # Package with ML model and services
        ├── lifespan.py         # Logging at startup and at shutdown
        └── model.py            # ML model with sentiment prediction

Operational logic

  • The client sends an HTTP request with the text data as JSON to an asynchronous FastAPI endpoint.
  • FastAPI receives the request, validates it, and creates a new Celery task.
  • Celery asynchronously places the task in the queue of the RabbitMQ broker, which works “under the hood”.
  • The Celery worker asynchronously takes the task from the RabbitMQ queue, passes the input to the ML model, receives the prediction, and returns the result.
  • Redis stores task results (the worker banner in the log below shows it configured as the Celery result backend) and is used to cache intermediate results and speed up repeated requests with the same data.
  • FastAPI receives the result and sends it back to the client as an HTTP response.
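The steps above can be sketched with the standard library alone. This is a simplified simulation, not the project's code: a queue.Queue stands in for the RabbitMQ queue, plain dictionaries stand in for Redis, a thread stands in for the Celery worker, and fake_model is a hypothetical placeholder for the real model. All function names are assumptions made for illustration.

```python
import hashlib
import queue
import threading
import uuid

task_queue = queue.Queue()   # stand-in for the RabbitMQ task queue
results = {}                 # stand-in for the result store (task_id -> result)
cache = {}                   # stand-in for a Redis cache (text hash -> result)
done = {}                    # task_id -> Event that is set when the result is ready
stats = {"model_calls": 0}   # counts how often the model actually ran


def fake_model(text):
    """Hypothetical placeholder for the real model call."""
    stats["model_calls"] += 1
    return {"label": "neutral", "score": 1.0}


def submit(text):
    """API side: create a task, put it on the queue, return the task id."""
    task_id = str(uuid.uuid4())
    done[task_id] = threading.Event()
    task_queue.put((task_id, text))
    return task_id


def worker():
    """Worker side: process tasks until a None sentinel arrives."""
    while True:
        item = task_queue.get()
        if item is None:
            break
        task_id, text = item
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:  # run the model only for unseen input
            cache[key] = fake_model(text)
        results[task_id] = cache[key]
        done[task_id].set()


def start_worker():
    """Start the worker in a background thread and return the thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def get_result(task_id, timeout=5.0):
    """API side: wait for the worker to finish, then read the result."""
    if not done[task_id].wait(timeout):
        raise TimeoutError(f"task {task_id} did not finish in time")
    return results[task_id]


if __name__ == "__main__":
    thread = start_worker()
    print(get_result(submit("Hey, what's up? What's new?")))
    task_queue.put(None)  # ask the worker to stop
    thread.join()
```

Submitting the same text twice runs fake_model only once, because the second task is served from the cache. In the real service, Celery, RabbitMQ and Redis take over the queueing, delivery and storage that this sketch does in memory.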

Example run

Below is the log of a container startup. The lines at the end of the log show a request being sent and a response with the model's prediction being received.

docker-compose up --build

[+] Running 5/5
 ✔ Network bagel                        Created                                                                                                                                        0.1s
 ✔ Container mlservice-rabbitmq-1       Created                                                                                                                                        0.2s
 ✔ Container mlservice-redis-1          Created                                                                                                                                        0.3s
 ✔ Container mlservice-celery-worker-1  Created                                                                                                                                        0.5s
 ✔ Container mlservice-mlapp-1          Created                                                                                                                                        0.1s
Attaching to celery-worker-1, mlapp-1, rabbitmq-1, redis-1
redis-1          | 1:C 12 Jun 2024 14:33:24.367 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-1          | 1:C 12 Jun 2024 14:33:24.367 # Redis version=7.0.15, bits=64, commit=00000000, modified=0, pid=1, just started
redis-1          | 1:C 12 Jun 2024 14:33:24.367 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis-1          | 1:M 12 Jun 2024 14:33:24.367 * monotonic clock: POSIX clock_gettime
redis-1          | 1:M 12 Jun 2024 14:33:24.370 * Running mode=standalone, port=6379.
redis-1          | 1:M 12 Jun 2024 14:33:24.370 # Server initialized
redis-1          | 1:M 12 Jun 2024 14:33:24.372 * Ready to accept connections
rabbitmq-1       | =INFO REPORT==== 12-Jun-2024::14:33:25.829673 ===
rabbitmq-1       |     alarm_handler: {set,{system_memory_high_watermark,[]}}
rabbitmq-1       | 2024-06-12 14:33:29.980553+00:00 [notice] <0.44.0> Application syslog exited with reason: stopped
rabbitmq-1       | 2024-06-12 14:33:29.990255+00:00 [notice] <0.254.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
rabbitmq-1       | 2024-06-12 14:33:29.991063+00:00 [notice] <0.254.0> Logging: configured log handlers are now ACTIVE
rabbitmq-1       | 2024-06-12 14:33:30.006369+00:00 [info] <0.254.0> ra: starting system quorum_queues
rabbitmq-1       | 2024-06-12 14:33:30.006479+00:00 [info] <0.254.0> starting Ra system: quorum_queues in directory: /var/lib/rabbitmq/mnesia/rabbit@rabbitmq3/quorum/rabbit@rabbitmq3
rabbitmq-1       | 2024-06-12 14:33:30.105111+00:00 [info] <0.268.0> ra system 'quorum_queues' running pre init for 0 registered servers
rabbitmq-1       | 2024-06-12 14:33:30.123278+00:00 [info] <0.269.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
rabbitmq-1       | 2024-06-12 14:33:30.147017+00:00 [notice] <0.274.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
rabbitmq-1       | 2024-06-12 14:33:30.176151+00:00 [info] <0.254.0> ra: starting system coordination
rabbitmq-1       | 2024-06-12 14:33:30.176222+00:00 [info] <0.254.0> starting Ra system: coordination in directory: /var/lib/rabbitmq/mnesia/rabbit@rabbitmq3/coordination/rabbit@rabbitmq3
rabbitmq-1       | 2024-06-12 14:33:30.178912+00:00 [info] <0.282.0> ra system 'coordination' running pre init for 0 registered servers
rabbitmq-1       | 2024-06-12 14:33:30.180710+00:00 [info] <0.283.0> ra: meta data store initialised for system coordination. 0 record(s) recovered
rabbitmq-1       | 2024-06-12 14:33:30.181040+00:00 [notice] <0.288.0> WAL: ra_coordination_log_wal init, open tbls: ra_coordination_log_open_mem_tables, closed tbls: ra_coordination_log_closed_mem_tables
rabbitmq-1       | 2024-06-12 14:33:30.185754+00:00 [info] <0.254.0> ra: starting system coordination
rabbitmq-1       | 2024-06-12 14:33:30.185815+00:00 [info] <0.254.0> starting Ra system: coordination in directory: /var/lib/rabbitmq/mnesia/rabbit@rabbitmq3/coordination/rabbit@rabbitmq3
rabbitmq-1       | 2024-06-12 14:33:30.330057+00:00 [info] <0.254.0> Waiting for Khepri leader for 30000 ms, 9 retries left
rabbitmq-1       | 2024-06-12 14:33:30.336808+00:00 [notice] <0.292.0> RabbitMQ metadata store: candidate -> leader in term: 1 machine version: 0
rabbitmq-1       | 2024-06-12 14:33:30.360140+00:00 [info] <0.254.0> Khepri leader elected
rabbitmq-1       | 2024-06-12 14:33:30.849821+00:00 [info] <0.254.0>
rabbitmq-1       | 2024-06-12 14:33:30.849821+00:00 [info] <0.254.0>  Starting RabbitMQ 3.13.3 on Erlang 26.2.5 [jit]
rabbitmq-1       | 2024-06-12 14:33:30.849821+00:00 [info] <0.254.0>  Copyright (c) 2007-2024 Broadcom Inc and/or its subsidiaries
rabbitmq-1       | 2024-06-12 14:33:30.849821+00:00 [info] <0.254.0>  Licensed under the MPL 2.0. Website: https://rabbitmq.com
rabbitmq-1       |
rabbitmq-1       |   ##  ##      RabbitMQ 3.13.3
rabbitmq-1       |   ##  ##
rabbitmq-1       |   ##########  Copyright (c) 2007-2024 Broadcom Inc and/or its subsidiaries
rabbitmq-1       |   ######  ##
rabbitmq-1       |   ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com
rabbitmq-1       |
rabbitmq-1       |   Erlang:      26.2.5 [jit]
rabbitmq-1       |   TLS Library: OpenSSL - OpenSSL 3.1.5 30 Jan 2024
rabbitmq-1       |   Release series support status: supported
rabbitmq-1       |
rabbitmq-1       |   Doc guides:  https://www.rabbitmq.com/docs
rabbitmq-1       |   Support:     https://www.rabbitmq.com/docs/contact
rabbitmq-1       |   Tutorials:   https://www.rabbitmq.com/tutorials
rabbitmq-1       |   Monitoring:  https://www.rabbitmq.com/docs/monitoring
rabbitmq-1       |   Upgrading:   https://www.rabbitmq.com/docs/upgrade
rabbitmq-1       |
rabbitmq-1       |   Logs: <stdout>
rabbitmq-1       |
rabbitmq-1       |   Config file(s): /etc/rabbitmq/conf.d/10-defaults.conf
rabbitmq-1       |
rabbitmq-1       |   Starting broker...2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  node           : rabbit@rabbitmq3
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  home dir       : /var/lib/rabbitmq
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  config file(s) : /etc/rabbitmq/conf.d/10-defaults.conf
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  cookie hash    : NR1KIONsk4av7K4nSVCNyw==
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  log(s)         : <stdout>
rabbitmq-1       | 2024-06-12 14:33:30.852303+00:00 [info] <0.254.0>  data dir       : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq3
... <more RabbitMQ output omitted> ...
rabbitmq-1       | 2024-06-12 14:33:38.214884+00:00 [info] <0.9.0> Time to start RabbitMQ: 12542 ms
mlapp-1          | INFO:     Started server process [7]
mlapp-1          | INFO:     Waiting for application startup.
mlapp-1          | 2024-06-12 14:33:50.545 | INFO     | src.services.lifespan:lifespan:8 - Service initialized
mlapp-1          | 2024-06-12 14:33:50.545 | INFO     | src.services.lifespan:lifespan:8 - Service initialized
mlapp-1          | INFO:     Application startup complete.
mlapp-1          | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
celery-worker-1  | /root/.cache/pypoetry/virtualenvs/ml_service_with_fastapi,_celery,_redis_and-9TtSrW0h-py3.10/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
celery-worker-1  | absolutely not recommended!
celery-worker-1  |
celery-worker-1  | Please specify a different user using the --uid option.
celery-worker-1  |
celery-worker-1  | User information: uid=0 euid=0 gid=0 egid=0
celery-worker-1  |
celery-worker-1  |   warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
celery-worker-1  |
celery-worker-1  |  -------------- celery@82849de377a4 v5.4.0 (opalescent)
celery-worker-1  | --- ***** -----
celery-worker-1  | -- ******* ---- Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.36 2024-06-12 14:34:13
celery-worker-1  | - *** --- * ---
celery-worker-1  | - ** ---------- [config]
celery-worker-1  | - ** ---------- .> app:         tasks:0x7f16af502ec0
celery-worker-1  | - ** ---------- .> transport:   amqp://guest:**@rabbitmq3:5672//
celery-worker-1  | - ** ---------- .> results:     redis://redis:6379/0
celery-worker-1  | - *** --- * --- .> concurrency: 8 (prefork)
celery-worker-1  | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
celery-worker-1  | --- ***** -----
celery-worker-1  |  -------------- [queues]
celery-worker-1  |                 .> celery           exchange=celery(direct) key=celery
celery-worker-1  |
celery-worker-1  |
celery-worker-1  | [tasks]
celery-worker-1  |   . analyze_sentiment
celery-worker-1  |
mlapp-1          | INFO:     172.18.0.1:52978 - "GET / HTTP/1.1" 404 Not Found
mlapp-1          | INFO:     172.18.0.1:52978 - "GET /favicon.ico HTTP/1.1" 404 Not Found
mlapp-1          | INFO:     172.18.0.1:60880 - "GET /docs HTTP/1.1" 200 OK
mlapp-1          | INFO:     172.18.0.1:44998 - "GET /docs HTTP/1.1" 200 OK
mlapp-1          | INFO:     172.18.0.1:41734 - "GET /docs HTTP/1.1" 200 OK
mlapp-1          | INFO:     172.18.0.1:56140 - "GET /docs HTTP/1.1" 200 OK
mlapp-1          | INFO:     172.18.0.1:56140 - "GET /openapi.json HTTP/1.1" 200 OK
rabbitmq-1       | 2024-06-12 14:38:09.818475+00:00 [info] <0.1053.0> accepting AMQP connection <0.1053.0> (172.18.0.5:56902 -> 172.18.0.3:5672)
rabbitmq-1       | 2024-06-12 14:38:09.822566+00:00 [info] <0.1053.0> connection <0.1053.0> (172.18.0.5:56902 -> 172.18.0.3:5672): user 'guest' authenticated and granted access to vhost '/'
celery-worker-1  | 2024-06-12 14:38:09.835 | INFO     | src.celery.worker:analyze_sentiment:12 - Starting prediction task af759dba-6938-417d-ae17-db9fd1ddc31f
celery-worker-1  | 2024-06-12 14:38:11.083 | INFO     | src.celery.worker:analyze_sentiment:17 - Input text: Hey, what's up? What's new?
celery-worker-1  | 2024-06-12 14:38:11.083 | INFO     | src.celery.worker:analyze_sentiment:18 - Predicted: {'label': 'curiosity', 'score': 0.533520519733429}
mlapp-1          | INFO:     172.18.0.1:53042 - "POST /predict/ HTTP/1.1" 200 OK
celery-worker-1  | 2024-06-12 14:38:11.083 | INFO     | src.celery.worker:analyze_sentiment:19 - Completing prediction task af759dba-6938-417d-ae17-db9fd1ddc31f
celery-worker-1  | 2024-06-12 14:39:14.406 | INFO     | src.celery.worker:analyze_sentiment:12 - Starting prediction task acb1aa92-d890-4fff-b951-137f55e089a7
celery-worker-1  | 2024-06-12 14:39:14.413 | INFO     | src.celery.worker:analyze_sentiment:17 - Input text: Damn! I'm hungry!
mlapp-1          | INFO:     172.18.0.1:41306 - "POST /predict/ HTTP/1.1" 200 OK
celery-worker-1  | 2024-06-12 14:39:14.413 | INFO     | src.celery.worker:analyze_sentiment:18 - Predicted: {'label': 'sadness', 'score': 0.35999131202697754}
celery-worker-1  | 2024-06-12 14:39:14.413 | INFO     | src.celery.worker:analyze_sentiment:19 - Completing prediction task acb1aa92-d890-4fff-b951-137f55e089a7
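The POST /predict/ requests seen in the log could be sent from a small client such as the sketch below. The JSON field name "text" and the host address http://localhost:8000 are assumptions (the log only shows the path and the container port); the actual request schema can be checked on the /docs page. The function names are hypothetical.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:8000"):
    """Build a POST request for the /predict/ endpoint without sending it."""
    # Assumption: the API expects a JSON body with a "text" field.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_prediction(text, base_url="http://localhost:8000"):
    """Send the request and decode the JSON response.

    Requires the containers to be running and the API port to be
    published on the host.
    """
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(request_prediction("Damn! I'm hungry!"))
```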