#rabbitmq #rabbitmq-cluster
Вопрос:
Я использую Rabbitmq 3.8.16 с клиентом Rabbitmq Vertx 3.9.4 для своего серверного приложения и заметил, что иногда клиент теряет соединение и не может правильно восстановиться (включено автоматическое восстановление). Это означает, что после восстановления он перестает потреблять сообщения, а в длительных очередях нет потребителей.
Это происходит случайным образом и с различными исключениями. Поскольку я использую клиент Vertx Rabbitmq, я не могу создавать разные каналы для каждого потребителя/издателя, у меня есть что-то вроде 10 издателей и 10 потребителей на одном узле (одно и то же соединение, один и тот же канал), иногда доходящее до 400 сообщений в секунду. Может ли это вызвать такого рода проблемы ??
Другая проблема может заключаться в том, что у меня есть очереди, представляющие подключения websocket к моему бэкэнду, эти подключения websocket недолговечны, поэтому он довольно часто объявляет, а затем удаляет очередь, представляющую этот сеанс пользователя. Это проблема для Rabbitmq ?
Из журнала Rabbitmq я обнаружил следующие исключения, которые были связаны с сбоем соединения:
Первый :
2021-09-08 08:14:15.321 [error] <0.24867.686> ** Generic server <0.24867.686> terminating
** Last message in was {'$gen_cast',{method,{'basic.publish',0,<<"NOTIFICATION_EXCHANGE">>,<<"BROADCAST_NOTIFICATION_ADDRESS">>,false,false},{content,60,none,<<0,0>>,rabbit_framing_amqp_0_9_1,[<<"{MY_MESSAGE}">>]},flow}}
** When Server state == {ch,{conf,running,rabbit_framing_amqp_0_9_1,1,<0.24846.686>,<0.24862.686>,<0.24846.686>,<<"172.20.24.177:60848 -> 172.20.28.73:5672">>,undefined,{user,<<"backend">>,[],[{rabbit_auth_backend_internal,none}]},<<"backend">>,<<"8a4a4f71-251a-4ab6-85f6-50d594db46b9">>,<0.24850.686>,[{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentication_failure_close">>,bool,true},{<<"basic.nack">>,bool,true},{<<"publisher_confirms">>,bool,true},{<<"consumer_cancel_notify">>,bool,true}],none,200,134217728,900000,#{},1000000000},{lstate,<0.24877.686>,false},none,4927,{2,{[{4926,<<"amq.ctag-ckMzQHTLitAIZ86qk0_R0Q">>,1631088855286,{{'backend_SERVICE_QUEUE','rabbit@rabbitmq-server-2.rabbitmq-nodes.rabbitmq'},321}}],[{4925,<<"amq.ctag-xDoqkK1VmPptIM9OG3YaPw">>,1631088855286,{{'backend_SERVICE_QUEUE','rabbit@rabbitmq-server-2.rabbitmq-nodes.rabbitmq'},321}}]}},#{'backend_9fb541c5-a9ae-4bbc-99a4-ff444316b3f1' => {resource,<<"backend">>,queue,<<"9fb541c5-a9ae-4bbc-99a4-ff444316b3f1">>},'backend_5f188762-0a66-47e3-bfa8-76aa11481cb9' => {resource,<<"backend">>,queue,<<"5f188762-0a66-47e3-bfa8-76aa11481cb9">>},'backend_e4a3cec7-3f6f-4b05-bdf2-ade8a087425d' => {resource,<<"backend">>,queue,<<"e4a3cec7-3f6f-4b05-bdf2-ade8a087425d">>},'backend_e7a9834a-7ddc-4690-a317-4f19433edafd' => {resource,<<"backend">>,queue,<<"e7a9834a-7ddc-4690-a317-4f19433edafd">>},'backend_6fdcc521-4ff1-4721-bf12-cefffe1181eb' => ...,...},...}
** Reason for termination ==
** {{error,noproc},[{rabbit_fifo_client,enqueue,3,[{file,"src/rabbit_fifo_client.erl"},{line,168}]},{rabbit_quorum_queue,deliver,3,[{file,"src/rabbit_quorum_queue.erl"},{line,754}]},{rabbit_amqqueue,'-deliver/3-fun-3-',4,[{file,"src/rabbit_amqqueue.erl"},{line,2239}]},{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},{rabbit_amqqueue,deliver,3,[{file,"src/rabbit_amqqueue.erl"},{line,2236}]},{rabbit_channel,deliver_to_queues,2,[{file,"src/rabbit_channel.erl"},{line,2204}]},{rabbit_channel,handle_method,3,[{file,"src/rabbit_channel.erl"},{line,1375}]},{rabbit_channel,handle_cast,2,[{file,"src/rabbit_channel.erl"},{line,643}]}]}
2021-09-08 08:14:15.321 [error] <0.24867.686> CRASH REPORT Process <0.24867.686> with 0 neighbours exited with reason: {error,noproc} in rabbit_fifo_client:enqueue/3 line 168 in gen_server2:terminate/3 line 1183
2021-09-08 08:14:15.322 [error] <0.24859.686> Supervisor {<0.24859.686>,rabbit_channel_sup} had child channel started with rabbit_channel:start_link(1, <0.24846.686>, <0.24862.686>, <0.24846.686>, <<"172.20.24.177:60848 -> 172.20.28.73:5672">>, rabbit_framing_amqp_0_9_1, {user,<<"backend">>,[],[{rabbit_auth_backend_internal,none}]}, <<"backend">>, [{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentica...">>,...},...], <0.24850.686>, <0.24877.686>) at <0.24867.686> exit with reason {error,noproc} in rabbit_fifo_client:enqueue/3 line 168 in context child_terminated
2021-09-08 08:14:15.322 [error] <0.24859.686> Supervisor {<0.24859.686>,rabbit_channel_sup} had child channel started with rabbit_channel:start_link(1, <0.24846.686>, <0.24862.686>, <0.24846.686>, <<"172.20.24.177:60848 -> 172.20.28.73:5672">>, rabbit_framing_amqp_0_9_1, {user,<<"backend">>,[],[{rabbit_auth_backend_internal,none}]}, <<"backend">>, [{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentica...">>,...},...], <0.24850.686>, <0.24877.686>) at <0.24867.686> exit with reason reached_max_restart_intensity in context shutdown
2021-09-08 08:14:15.323 [error] <0.24846.686> Error on AMQP connection <0.24846.686> (172.20.24.177:60848 -> 172.20.28.73:5672, vhost: 'backend', state: running), channel 1:
{{error,noproc},
[{rabbit_fifo_client,enqueue,3,
[{file,"src/rabbit_fifo_client.erl"},{line,168}]},
{rabbit_quorum_queue,deliver,3,
[{file,"src/rabbit_quorum_queue.erl"},{line,754}]},
{rabbit_amqqueue,'-deliver/3-fun-3-',4,
[{file,"src/rabbit_amqqueue.erl"},{line,2239}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_amqqueue,deliver,3,[{file,"src/rabbit_amqqueue.erl"},{line,2236}]},
{rabbit_channel,deliver_to_queues,2,
[{file,"src/rabbit_channel.erl"},{line,2204}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1375}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,643}]}]}
2021-09-08 08:14:15.323 [warning] <0.24846.686> Non-AMQP exit reason '{{error,noproc},[{rabbit_fifo_client,enqueue,3,[{file,"src/rabbit_fifo_client.erl"},{line,168}]},{rabbit_quorum_queue,deliver,3,[{file,"src/rabbit_quorum_queue.erl"},{line,754}]},{rabbit_amqqueue,'-deliver/3-fun-3-',4,[{file,"src/rabbit_amqqueue.erl"},{line,2239}]},{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},{rabbit_amqqueue,deliver,3,[{file,"src/rabbit_amqqueue.erl"},{line,2236}]},{rabbit_channel,deliver_to_queues,2,[{file,"src/rabbit_channel.erl"},{line,2204}]},{rabbit_channel,handle_method,3,[{file,"src/rabbit_channel.erl"},{line,1375}]},{rabbit_channel,handle_cast,2,[{file,"src/rabbit_channel.erl"},{line,643}]}]}'
Еще одно, что я заметил, это :
2021-09-02 14:59:43.260 [error] <0.17646.16> closing AMQP connection <0.17646.16> (172.20.23.201:35420 -> 172.20.36.197:5672):
missed heartbeats from client, timeout: 60s
2021-09-02 14:59:43.263 [info] <0.29592.28> Closing all channels from '172.20.23.201:35420 -> 172.20.36.197:5672' because it has been closed
И последнее :
2021-09-08 09:47:14.948 [warning] <0.7610.564> segment_writer: skipping segment as directory /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHES3OXCNJJBB4VW does not exist
2021-09-08 09:47:14.948 [warning] <0.7611.564> segment_writer: failed to open segment file /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHES0C5F3P5Y6WKO/00000001.segmenterror: enoent
2021-09-08 09:47:14.948 [warning] <0.199.564> segment_writer: failed to open segment file /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHES9RZ4RFC37MDG/00000001.segmenterror: enoent
2021-09-08 09:47:14.948 [warning] <0.7611.564> segment_writer: skipping segment as directory /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHES0C5F3P5Y6WKO does not exist
2021-09-08 09:47:14.948 [warning] <0.199.564> segment_writer: skipping segment as directory /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHES9RZ4RFC37MDG does not exist
2021-09-08 09:47:14.950 [warning] <0.7600.564> segment_writer: failed to open segment file /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHESAQL00HNNSPK4/00000001.segmenterror: enoent
2021-09-08 09:47:14.950 [warning] <0.7600.564> segment_writer: skipping segment as directory /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/quorum/rabbit@rabbitmq-server-1.rabbitmq-nodes.rabbitmq/ORCHESAQL00HNNSPK4 does not exist
На стороне клиента я заметил :
15:11:30.816 [vert.x-eventloop-thread-0]Fail to handle message
com.rabbitmq.client.AlreadyClosedException: connection is already closed due to connection error; cause: java.lang.IllegalStateException: Unsolicited delivery - see Channel.setDefaultConsumer to handle this case.
at com.rabbitmq.client.impl.AMQConnection.ensureIsOpen(AMQConnection.java:172)
at com.rabbitmq.client.impl.AMQConnection.createChannel(AMQConnection.java:550)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.createChannel(AutorecoveringConnection.java:165)
at io.vertx.rabbitmq.impl.RabbitMQClientImpl.forChannel(RabbitMQClientImpl.java:476)
at io.vertx.rabbitmq.impl.RabbitMQClientImpl.basicPublish(RabbitMQClientImpl.java:194)
at io.vertx.reactivex.rabbitmq.RabbitMQClient.basicPublish(RabbitMQClient.java:363)
at io.vertx.reactivex.rabbitmq.RabbitMQClient.lambda$rxBasicPublish$9(RabbitMQClient.java:376)
at io.vertx.reactivex.rabbitmq.RabbitMQClient$Lambda$699/0x00000000102c38c0.accept(Unknown Source)
at io.vertx.reactivex.impl.AsyncResultCompletable.subscribeActual(AsyncResultCompletable.java:44)
or
08:14:15.330 [AMQP Connection 10.100.56.15:5672] WARN com.rabbitmq.client.impl.ForgivingExceptionHandler - An unexpected connection driver error occured (Exception message: Connection reset)
08:14:15.332 [AMQP Connection 10.100.56.15:5672] INFO io.vertx.rabbitmq.impl.RabbitMQClientImpl - RabbitMQ connection shutdown! The client will attempt to reconnect automatically
com.rabbitmq.client.ShutdownSignalException: connection error; protocol method: #method<connection.close>(reply-code=541, reply-text=INTERNAL_ERROR, class-id=0, method-id=0)
at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:916)
at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:906)
at com.rabbitmq.client.impl.AMQConnection.handleConnectionClose(AMQConnection.java:844)
at com.rabbitmq.client.impl.AMQConnection.processControlCommand(AMQConnection.java:799)
at com.rabbitmq.client.impl.AMQConnection$1.processAsync(AMQConnection.java:242)
at com.rabbitmq.client.impl.AMQChannel.handleCompleteInboundCommand(AMQChannel.java:178)
at com.rabbitmq.client.impl.AMQChannel.handleFrame(AMQChannel.java:111)
at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:650)
at com.rabbitmq.client.impl.AMQConnection.access$300(AMQConnection.java:48)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:597)
Since those errors are not clear to my, does anybody has an idea why is this happening ??
Thank you for your help