Потребляемая программой память продолжает расти при использовании PyCurl в многопоточной среде

#python #memory-leaks #pycurl

#python #утечки памяти #pycurl

Вопрос:

Потребляемая программой Python память растет со временем, когда PyCurl используется в многопоточной среде для одновременных вызовов HTTP API.

Прикрепленный фрагмент кода Python демонстрирует проблему. Программа просто нажимает URL-адреса в нескольких потоках, используя PyCurl, и регистрирует использование памяти после каждой итерации.

Код

 '''
Run pycurl in multi-threaded environment and log memory usage.
'''

import os
import io
import sys
import time
import logging
import itertools
from multiprocessing.pool import ThreadPool

import pycurl
import psutil

# Number of threads for concurrent HTTP calls
THREAD_COUNT = 100

# URLs for testing
URLS = [x.strip() for x in open('urls.txt') if x.strip()]
URLS = itertools.cycle(URLS)

# Setup logger
LOGGER = logging.getLogger('TestPyCURL')
CONSOLE_HANDLER = logging.StreamHandler(sys.stdout)
LOG_FORMAT = '%(asctime)s - %(threadName)s - %(message)s'
FORMATTER = logging.Formatter(LOG_FORMAT, '%b %d %I:%M:%S')
CONSOLE_HANDLER.setFormatter(FORMATTER)
CONSOLE_HANDLER.setLevel(logging.DEBUG)
LOGGER.addHandler(CONSOLE_HANDLER)
LOGGER.setLevel(logging.DEBUG)


# Note current memory usage of program
process = psutil.Process(os.getpid())
last_usage = process.memory_info().rss
def log_memory_usage():
    '''
    Log change in memory usage of process.
    '''
    global last_usage
    usage = process.memory_info().rss
    change = usage - last_usage
    change_prcnt = change / last_usage * 100
    if usage == last_usage:
        slope = 'No change'
    elif usage > last_usage:
        slope = 'Increased'
    else:
        slope = 'Decreased'
    last_usage = usage
    LOGGER.info('Memory usage: Total: %s MB Change: %s, %s KB, %.3f %%',
                usage / 1000000, slope, change / 1000, change_prcnt)


def curl_request(url):
    '''
    Do HTTP request using Curl.
    '''
    buffer = io.BytesIO()
    curl = pycurl.Curl()
    curl.setopt(curl.URL, url)
    curl.setopt(curl.TIMEOUT, 10)
    curl.setopt(curl.CONNECTTIMEOUT, 5)
    curl.setopt(curl.WRITEDATA, buffer)
    try:
        curl.perform()
    except pycurl.error as exce:
        status = str(exce)
    else:
        status_code = curl.getinfo(curl.RESPONSE_CODE)
        status = status_code, len(buffer.getvalue())
    LOGGER.info('%s, %s', url, status)
    curl.close()
    buffer.close()


def main():
    '''
    Test pycurl.
    '''
    # Create pool of worker threads
    pool = ThreadPool(THREAD_COUNT)
    # Run forever
    while True:
        # Log change in memory usage of program
        log_memory_usage()
        # Sleep
        time.sleep(5)
        # Launch parallel HTTP requests
        urls = [next(URLS) for _ in range(0, THREAD_COUNT)]
        pool.map(curl_request, urls)


if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        pass

 

Тестовая среда

  • ОС: Ubuntu 20.04
  • Python: 3.8.10
  • PyCurl и LibCurl: PycURL/7.44.1 libcurl/7.68.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.21.0 ( libidn2 / 2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3

Использование памяти с течением времени регистрируется тестовой программой

 Dec 17 05:30:02 - MainThread - Memory usage: Total: 32.17408 MB Change: Increased, 14008.32 KB, 77.114 %
Dec 17 05:30:17 - MainThread - Memory usage: Total: 34.28352 MB Change: Increased, 2109.44 KB, 6.556 %
Dec 17 05:30:33 - MainThread - Memory usage: Total: 104.075264 MB Change: Increased, 69791.744 KB, 203.572 %
Dec 17 05:30:43 - MainThread - Memory usage: Total: 124.370944 MB Change: Increased, 20295.68 KB, 19.501 %
Dec 17 05:30:53 - MainThread - Memory usage: Total: 130.592768 MB Change: Increased, 6221.824 KB, 5.003 %
Dec 17 05:31:08 - MainThread - Memory usage: Total: 137.158656 MB Change: Increased, 6565.888 KB, 5.028 %
Dec 17 05:31:18 - MainThread - Memory usage: Total: 139.640832 MB Change: Increased, 2482.176 KB, 1.810 %
Dec 17 05:31:25 - MainThread - Memory usage: Total: 143.33952 MB Change: Increased, 3698.688 KB, 2.649 %
Dec 17 05:31:40 - MainThread - Memory usage: Total: 145.32608 MB Change: Increased, 1986.56 KB, 1.386 %
Dec 17 05:31:51 - MainThread - Memory usage: Total: 149.516288 MB Change: Increased, 4190.208 KB, 2.883 %
....
Dec 17 05:54:17 - MainThread - Memory usage: Total: 261.025792 MB Change: Decreased, -81.92 KB, -0.031 %
Dec 17 05:54:27 - MainThread - Memory usage: Total: 261.050368 MB Change: Increased, 24.576 KB, 0.009 %
Dec 17 05:54:39 - MainThread - Memory usage: Total: 261.087232 MB Change: Increased, 36.864 KB, 0.014 %
Dec 17 05:54:54 - MainThread - Memory usage: Total: 261.935104 MB Change: Increased, 847.872 KB, 0.325 %
Dec 17 05:55:10 - MainThread - Memory usage: Total: 261.869568 MB Change: Decreased, -65.536 KB, -0.025 %
Dec 17 05:55:25 - MainThread - Memory usage: Total: 261.55008 MB Change: Decreased, -319.488 KB, -0.122 %
Dec 17 05:55:40 - MainThread - Memory usage: Total: 261.844992 MB Change: Increased, 294.912 KB, 0.113 %
Dec 17 05:55:50 - MainThread - Memory usage: Total: 261.853184 MB Change: Increased, 8.192 KB, 0.003 %
Dec 17 05:56:00 - MainThread - Memory usage: Total: 261.709824 MB Change: Decreased, -143.36 KB, -0.055 %
Dec 17 05:56:15 - MainThread - Memory usage: Total: 261.91872 MB Change: Increased, 208.896 KB, 0.080 %
....
Dec 17 06:18:22 - MainThread - Memory usage: Total: 315.809792 MB Change: Increased, 8.192 KB, 0.003 %
Dec 17 06:18:34 - MainThread - Memory usage: Total: 315.850752 MB Change: Increased, 40.96 KB, 0.013 %
Dec 17 06:18:44 - MainThread - Memory usage: Total: 315.711488 MB Change: Decreased, -139.264 KB, -0.044 %
Dec 17 06:19:00 - MainThread - Memory usage: Total: 315.67872 MB Change: Decreased, -32.768 KB, -0.010 %
Dec 17 06:19:13 - MainThread - Memory usage: Total: 315.826176 MB Change: Increased, 147.456 KB, 0.047 %
Dec 17 06:19:19 - MainThread - Memory usage: Total: 315.744256 MB Change: Decreased, -81.92 KB, -0.026 %
Dec 17 06:19:25 - MainThread - Memory usage: Total: 315.793408 MB Change: Increased, 49.152 KB, 0.016 %
Dec 17 06:19:41 - MainThread - Memory usage: Total: 323.272704 MB Change: Increased, 7479.296 KB, 2.368 %
Dec 17 06:19:51 - MainThread - Memory usage: Total: 323.416064 MB Change: Increased, 143.36 KB, 0.044 %
Dec 17 06:20:06 - MainThread - Memory usage: Total: 323.293184 MB Change: Decreased, -122.88 KB, -0.038 %
....
Dec 17 06:43:19 - MainThread - Memory usage: Total: 344.764416 MB Change: Increased, 20.48 KB, 0.006 %
Dec 17 06:43:30 - MainThread - Memory usage: Total: 344.707072 MB Change: Decreased, -57.344 KB, -0.017 %
Dec 17 06:43:45 - MainThread - Memory usage: Total: 345.001984 MB Change: Increased, 294.912 KB, 0.086 %
Dec 17 06:43:55 - MainThread - Memory usage: Total: 344.870912 MB Change: Decreased, -131.072 KB, -0.038 %
Dec 17 06:44:05 - MainThread - Memory usage: Total: 344.887296 MB Change: Increased, 16.384 KB, 0.005 %
Dec 17 06:44:18 - MainThread - Memory usage: Total: 344.94464 MB Change: Increased, 57.344 KB, 0.017 %
Dec 17 06:44:28 - MainThread - Memory usage: Total: 345.018368 MB Change: Increased, 73.728 KB, 0.021 %
Dec 17 06:44:39 - MainThread - Memory usage: Total: 344.977408 MB Change: Decreased, -40.96 KB, -0.012 %
Dec 17 06:44:54 - MainThread - Memory usage: Total: 344.928256 MB Change: Decreased, -49.152 KB, -0.014 %
Dec 17 06:45:04 - MainThread - Memory usage: Total: 344.879104 MB Change: Decreased, -49.152 KB, -0.014 %
....
Dec 17 07:08:09 - MainThread - Memory usage: Total: 364.01152 MB Change: Increased, 3289.088 KB, 0.912 %
Dec 17 07:08:24 - MainThread - Memory usage: Total: 364.044288 MB Change: Increased, 32.768 KB, 0.009 %
Dec 17 07:08:39 - MainThread - Memory usage: Total: 364.191744 MB Change: Increased, 147.456 KB, 0.041 %
Dec 17 07:08:54 - MainThread - Memory usage: Total: 364.232704 MB Change: Increased, 40.96 KB, 0.011 %
Dec 17 07:09:05 - MainThread - Memory usage: Total: 364.208128 MB Change: Decreased, -24.576 KB, -0.007 %
Dec 17 07:09:19 - MainThread - Memory usage: Total: 364.249088 MB Change: Increased, 40.96 KB, 0.011 %
Dec 17 07:09:29 - MainThread - Memory usage: Total: 364.191744 MB Change: Decreased, -57.344 KB, -0.016 %
Dec 17 07:09:44 - MainThread - Memory usage: Total: 364.158976 MB Change: Decreased, -32.768 KB, -0.009 %
Dec 17 07:09:54 - MainThread - Memory usage: Total: 364.126208 MB Change: Decreased, -32.768 KB, -0.009 %
Dec 17 07:10:10 - MainThread - Memory usage: Total: 364.101632 MB Change: Decreased, -24.576 KB, -0.007 %
....
Dec 17 07:33:08 - MainThread - Memory usage: Total: 372.379648 MB Change: Increased, 49.152 KB, 0.013 %
Dec 17 07:33:23 - MainThread - Memory usage: Total: 372.477952 MB Change: Increased, 98.304 KB, 0.026 %
Dec 17 07:33:30 - MainThread - Memory usage: Total: 372.46976 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 07:33:36 - MainThread - Memory usage: Total: 372.453376 MB Change: Decreased, -16.384 KB, -0.004 %
Dec 17 07:33:51 - MainThread - Memory usage: Total: 372.330496 MB Change: Decreased, -122.88 KB, -0.033 %
Dec 17 07:34:07 - MainThread - Memory usage: Total: 372.30592 MB Change: Decreased, -24.576 KB, -0.007 %
Dec 17 07:34:22 - MainThread - Memory usage: Total: 372.256768 MB Change: Decreased, -49.152 KB, -0.013 %
Dec 17 07:34:37 - MainThread - Memory usage: Total: 372.215808 MB Change: Decreased, -40.96 KB, -0.011 %
Dec 17 07:34:47 - MainThread - Memory usage: Total: 372.269056 MB Change: Increased, 53.248 KB, 0.014 %
Dec 17 07:34:59 - MainThread - Memory usage: Total: 372.310016 MB Change: Increased, 40.96 KB, 0.011 %
....
Dec 17 07:56:53 - MainThread - Memory usage: Total: 372.55168 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 07:57:03 - MainThread - Memory usage: Total: 372.617216 MB Change: Increased, 65.536 KB, 0.018 %
Dec 17 07:57:13 - MainThread - Memory usage: Total: 372.453376 MB Change: Decreased, -163.84 KB, -0.044 %
Dec 17 07:57:21 - MainThread - Memory usage: Total: 372.59264 MB Change: Increased, 139.264 KB, 0.037 %
Dec 17 07:57:28 - MainThread - Memory usage: Total: 372.617216 MB Change: Increased, 24.576 KB, 0.007 %
Dec 17 07:57:44 - MainThread - Memory usage: Total: 372.584448 MB Change: Decreased, -32.768 KB, -0.009 %
Dec 17 07:57:54 - MainThread - Memory usage: Total: 372.46976 MB Change: Decreased, -114.688 KB, -0.031 %
Dec 17 07:58:00 - MainThread - Memory usage: Total: 372.461568 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 07:58:07 - MainThread - Memory usage: Total: 372.424704 MB Change: Decreased, -36.864 KB, -0.010 %
Dec 17 07:58:22 - MainThread - Memory usage: Total: 372.359168 MB Change: Decreased, -65.536 KB, -0.018 %
....
Dec 17 08:21:03 - MainThread - Memory usage: Total: 372.699136 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 08:21:14 - MainThread - Memory usage: Total: 372.764672 MB Change: Increased, 65.536 KB, 0.018 %
Dec 17 08:21:25 - MainThread - Memory usage: Total: 372.690944 MB Change: Decreased, -73.728 KB, -0.020 %
Dec 17 08:21:35 - MainThread - Memory usage: Total: 372.994048 MB Change: Increased, 303.104 KB, 0.081 %
Dec 17 08:21:50 - MainThread - Memory usage: Total: 372.87936 MB Change: Decreased, -114.688 KB, -0.031 %
Dec 17 08:22:00 - MainThread - Memory usage: Total: 372.805632 MB Change: Decreased, -73.728 KB, -0.020 %
Dec 17 08:22:15 - MainThread - Memory usage: Total: 372.79744 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 08:22:29 - MainThread - Memory usage: Total: 372.764672 MB Change: Decreased, -32.768 KB, -0.009 %
Dec 17 08:22:37 - MainThread - Memory usage: Total: 372.79744 MB Change: Increased, 32.768 KB, 0.009 %
Dec 17 08:22:45 - MainThread - Memory usage: Total: 372.690944 MB Change: Decreased, -106.496 KB, -0.029 %
....
Dec 17 08:44:46 - MainThread - Memory usage: Total: 381.915136 MB Change: Increased, 40.96 KB, 0.011 %
Dec 17 08:44:58 - MainThread - Memory usage: Total: 387.9936 MB Change: Increased, 6078.464 KB, 1.592 %
Dec 17 08:45:09 - MainThread - Memory usage: Total: 387.981312 MB Change: Decreased, -12.288 KB, -0.003 %
Dec 17 08:45:25 - MainThread - Memory usage: Total: 387.964928 MB Change: Decreased, -16.384 KB, -0.004 %
Dec 17 08:45:40 - MainThread - Memory usage: Total: 388.284416 MB Change: Increased, 319.488 KB, 0.082 %
Dec 17 08:45:55 - MainThread - Memory usage: Total: 388.440064 MB Change: Increased, 155.648 KB, 0.040 %
Dec 17 08:46:06 - MainThread - Memory usage: Total: 388.407296 MB Change: Decreased, -32.768 KB, -0.008 %
Dec 17 08:46:14 - MainThread - Memory usage: Total: 388.358144 MB Change: Decreased, -49.152 KB, -0.013 %
Dec 17 08:46:24 - MainThread - Memory usage: Total: 388.325376 MB Change: Decreased, -32.768 KB, -0.008 %
Dec 17 08:46:39 - MainThread - Memory usage: Total: 388.3008 MB Change: Decreased, -24.576 KB, -0.006 %
....
Dec 17 09:08:33 - MainThread - Memory usage: Total: 405.762048 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 09:08:41 - MainThread - Memory usage: Total: 405.753856 MB Change: Decreased, -8.192 KB, -0.002 %
Dec 17 09:08:48 - MainThread - Memory usage: Total: 405.938176 MB Change: Increased, 184.32 KB, 0.045 %
Dec 17 09:09:03 - MainThread - Memory usage: Total: 405.880832 MB Change: Decreased, -57.344 KB, -0.014 %
Dec 17 09:09:09 - MainThread - Memory usage: Total: 406.011904 MB Change: Increased, 131.072 KB, 0.032 %
Dec 17 09:09:15 - MainThread - Memory usage: Total: 405.970944 MB Change: Decreased, -40.96 KB, -0.010 %
Dec 17 09:09:27 - MainThread - Memory usage: Total: 408.100864 MB Change: Increased, 2129.92 KB, 0.525 %
Dec 17 09:09:38 - MainThread - Memory usage: Total: 408.010752 MB Change: Decreased, -90.112 KB, -0.022 %
Dec 17 09:09:53 - MainThread - Memory usage: Total: 408.223744 MB Change: Increased, 212.992 KB, 0.052 %
Dec 17 09:10:08 - MainThread - Memory usage: Total: 408.109056 MB Change: Decreased, -114.688 KB, -0.028 %
....
Dec 17 09:31:34 - MainThread - Memory usage: Total: 426.8032 MB Change: Increased, 114.688 KB, 0.027 %
Dec 17 09:31:50 - MainThread - Memory usage: Total: 426.76224 MB Change: Decreased, -40.96 KB, -0.010 %
Dec 17 09:31:59 - MainThread - Memory usage: Total: 426.778624 MB Change: Increased, 16.384 KB, 0.004 %
 

Может кто-нибудь подсказать, как правильно очистить память, чтобы избежать этой утечки памяти?

Комментарии:

1. Сообщается в репозитории PyCurl на GitHub github.com/pycurl/pycurl/issues/721

2. Вы действительно запускали свой процесс достаточно долго, чтобы быть уверенным, что он действительно протекает? Использование RSS в качестве показателя размера процесса — не очень хорошая идея, потому что процесс может быть довольно большим, не имея большого количества резидентных страниц, а набор резидентных страниц может уменьшиться без фактического сокращения процесса, когда, например, другие более активные процессы приводят к замене этих страниц. Попробуйте запустить свой процесс за одну ночь и посмотрите, используя VSS, который все еще несовершенен, но является лучшим показателем, чем RSS, продолжает ли ваш процесс расти.

3. Здесь прилагается тестовый код для воспроизведения поведения. Фактическое приложение — это постоянно работающая служба systemd, приложение вылетает через 5-7 дней из-за чрезмерного использования памяти. Потребление памяти подтверждено htop.

4. Я бы заподозрил, что если вы посмотрите VSZ своего тестового кода, он стабилизируется максимум через несколько часов и что ваш тестовый код не будет аварийно завершаться. Как долго вы когда-либо запускали свой тестовый код и не зависал ли он за это время? Кроме того, ваше фактическое приложение создает дамп ядра при сбое?

5. Не тестовый код, а фактическое приложение, использующее pycurl, является сервисом systemd. Я заметил, что потребление памяти приложением растет со временем, и в конечном итоге (почти через 5-7 дней) оно уничтожается ОС из-за чрезмерного использования оперативной памяти. Пожалуйста, посмотрите обсуждение здесь github.com/pycurl/pycurl/issues/721 .

Ответ №1:

Отвечая на мой собственный вопрос.

Пожалуйста, посмотрите обсуждение здесь https://github.com/pycurl/pycurl/issues/721

По-видимому, PyCurl плохо работает с многопоточностью Python, когда количество потоков достаточно велико, но отлично работает в одном потоке. Для сценариев, требующих одновременных вызовов API, рекомендуется использовать pycurl.CurlMulti

Рекомендация: https://github.com/pycurl/pycurl/issues/721#issuecomment-992632719

Тестовый код: https://github.com/pycurl/pycurl/issues/721#issuecomment-994475703