I configured emqtt for 1 million clients, but it fails after 500 clients

#emq

Question:

I am trying to configure EMQTT for 1 million clients, but it fails after 500 clients. I am testing with JMeter, using small messages of only 1 KB. The test case is set up for 1000 clients, each sending 1 message per second.

The machine has 16 GB of RAM.

Here is the CPU specification:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                3
On-line CPU(s) list:   0-2
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             3
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping:              1
CPU MHz:               2097.570
BogoMIPS:              4195.14
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
  

I tuned emqttd following this guide: https://emqx-enterprise-docs-en.readthedocs.io/en/latest/tune.html

Here is the error log:

2019-04-04 16:31:28.416 [error] <0.1485.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.416 [error] <0.1534.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.416 [error] <0.1490.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.416 [error] <0.1516.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.416 [error] <0.1506.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.418 [error] <0.1531.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.418 [error] <0.1510.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.418 [error] <0.1495.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:28.418 [error] <0.1536.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17300 emfile errors!!!
2019-04-04 16:31:38.483 [error] <0.1487.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.483 [error] <0.1496.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.483 [error] <0.1521.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.484 [error] <0.1519.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.484 [error] <0.1497.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.493 [error] <0.1501.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.493 [error] <0.1502.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.493 [error] <0.1481.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.493 [error] <0.1513.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.494 [error] <0.1514.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.497 [error] <0.1488.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.499 [error] <0.1524.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.499 [error] <0.1500.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.499 [error] <0.1518.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.499 [error] <0.1480.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1482.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1491.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1507.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1477.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1532.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1499.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1530.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.500 [error] <0.1504.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.501 [error] <0.1505.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.501 [error] <0.1528.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.501 [error] <0.1526.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.502 [error] <0.1512.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.502 [error] <0.1515.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1489.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1527.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1517.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1522.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1498.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.504 [error] <0.1483.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1503.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1486.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1493.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1494.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1475.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.507 [error] <0.1533.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1484.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1476.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1529.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1535.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1511.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1520.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.508 [error] <0.1537.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.509 [error] <0.1523.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.509 [error] <0.1478.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.513 [error] <0.1479.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.513 [error] <0.1509.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.513 [error] <0.1525.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.513 [error] <0.1508.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.513 [error] <0.1538.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1492.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1485.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1534.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1490.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1516.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.516 [error] <0.1506.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.518 [error] <0.1531.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.518 [error] <0.1495.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.518 [error] <0.1510.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
2019-04-04 16:31:38.518 [error] <0.1536.0> [error] acceptor on 0.0.0.0:1883 suspend 100(ms) for 17400 emfile errors!!!
  

Answer #1:

The emfile errors in the log mean that the emqx process has run out of file descriptors. Check the "Max open files" limit of your emqx process with cat /proc/<pid-of-emqx>/limits. If the limit is too low, go through the tuning guide again and restart emqx.
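To make that check scriptable, here is a minimal Python sketch that reads the same /proc file and extracts the soft and hard "Max open files" values. The function names are illustrative and not part of emqx, and you have to supply the PID yourself.

```python
def _to_int(value):
    # /proc/<pid>/limits prints "unlimited" when no limit is set
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len(label):].split()
            return _to_int(fields[0]), _to_int(fields[1])
    raise ValueError("no 'Max open files' row found")


def read_max_open_files(pid):
    """Read the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())
```

For example, `read_max_open_files(12345)` returns the two limits for PID 12345 (a placeholder). A soft limit of 1024 is a common Linux default, and it is far too low for a broker that holds many connections, because every client socket uses one file descriptor.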

If you run emqx under systemd, add LimitNOFILE and LimitNPROC to the <servicename>.service file:

[Service]
LimitNOFILE=1024000
LimitNPROC=1024000
  

See https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties for details.
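After editing the unit file, systemd needs a `systemctl daemon-reload` and a restart of the service before the new limits apply. The sketch below is one way to confirm what systemd actually holds for the unit, by parsing the output of `systemctl show`. The unit name `emqx.service` is an assumption; substitute your own `<servicename>.service`.

```python
import subprocess


def parse_systemctl_show(output):
    """Turn 'Key=Value' lines from `systemctl show` into a dict of strings."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            props[key.strip()] = value.strip()
    return props


def unit_limits(unit="emqx.service"):  # unit name is an assumption
    """Ask systemd for the LimitNOFILE and LimitNPROC values of a unit."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=LimitNOFILE,LimitNPROC"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_systemctl_show(result.stdout)
```

If `unit_limits()` still reports the old values, the unit file change was not picked up. The limits of an already running process stay as they were until the service is restarted.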