Airflow PythonVirtualenvOperator does not install pip packages

#python #pip #airflow #virtualenv

Question:

I am trying to use PythonVirtualenvOperator in Airflow 2.2.0, which runs on Kubernetes. The task fails, and the logs show that it cannot install packages, even ones as simple as numpy.

My code:

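from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator

with DAG(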
    dag_id='example_python_operator',
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['example'],
) as dag:

    def callable_virtualenv():
        print('Finished')

    virtualenv_task = PythonVirtualenvOperator(
        task_id="virtualenv_python",
        python_callable=callable_virtualenv,
        requirements=["numpy==1.21.4"],
        system_site_packages=False,
    )
 

The logs show that pip failed to install the packages. We use an http_proxy in our k8s cluster, but I checked with a BashOperator, and internet access seems to work fine from within operators.

Logs:

[2021-11-10, 11:05:01 UTC] {taskinstance.py:1412} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=example_python_operator
AIRFLOW_CTX_TASK_ID=virtualenv_python
AIRFLOW_CTX_EXECUTION_DATE=2021-11-10T11:05:00.288175+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-11-10T11:05:00.288175+00:00
[2021-11-10, 11:05:01 UTC] {process_utils.py:135} INFO - Executing cmd: /usr/local/bin/python -m virtualenv /tmp/venvosz7c8zz
[2021-11-10, 11:05:01 UTC] {process_utils.py:139} INFO - Output:
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO - created virtual environment CPython3.9.7.final.0-64 in 248ms
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO -   creator CPython3Posix(dest=/tmp/venvosz7c8zz, clear=False, no_vcs_ignore=False, global=False)
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO -   seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/***/.local/share/virtualenv)
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO -     added seed packages: pip==21.2.4, setuptools==58.2.0, wheel==0.37.0
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO -   activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
[2021-11-10, 11:05:02 UTC] {process_utils.py:135} INFO - Executing cmd: /tmp/venvosz7c8zz/bin/pip install numpy==1.21.4 lazy-object-proxy
[2021-11-10, 11:05:02 UTC] {process_utils.py:139} INFO - Output:
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO - ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO - WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.
[2021-11-10, 11:05:02 UTC] {process_utils.py:143} INFO - You should consider upgrading via the '/tmp/venvosz7c8zz/bin/python -m pip install --upgrade pip' command.
[2021-11-10, 11:05:02 UTC] {taskinstance.py:1686} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1324, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1443, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1499, in _execute_task
    result = execute_callable(context=context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 365, in execute
    return super().execute(context=serializable_context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 151, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 377, in execute_callable
    prepare_virtualenv(
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/python_virtualenv.py", line 99, in prepare_virtualenv
    execute_in_subprocess(pip_cmd)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/process_utils.py", line 147, in execute_in_subprocess
    raise subprocess.CalledProcessError(exit_code, cmd)
subprocess.CalledProcessError: Command '['/tmp/venvosz7c8zz/bin/pip', 'install', 'numpy==1.21.4', 'lazy-object-proxy']' returned non-zero exit status 1.
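
One thing worth checking: the pip command in the log has no `--user` flag, yet pip complains about a `--user` install. pip also reads options from `PIP_*` environment variables (and from its config files), so a `PIP_USER` variable set in the worker container could be what turns the flag on. That is only a guess about this environment, not something the logs confirm. Below is a minimal sketch for inspecting those variables from inside a task; the helper names `pip_env_overrides` and `user_install_forced` are hypothetical, and the truthy-value set is a best-effort approximation of how pip parses booleans.

```python
import os


def pip_env_overrides(environ=None):
    """Return the PIP_* environment variables that pip would treat as options."""
    environ = os.environ if environ is None else environ
    return {key: value for key, value in environ.items() if key.startswith("PIP_")}


def user_install_forced(environ=None):
    """Guess whether PIP_USER is set to a truthy value (assumed truthy set)."""
    value = pip_env_overrides(environ).get("PIP_USER", "")
    return value.strip().lower() in {"1", "true", "t", "yes", "y", "on"}


if __name__ == "__main__":
    # Run this inside the worker (for example from a plain PythonOperator)
    # to see what the pip subprocess would inherit.
    print("pip-related env vars:", pip_env_overrides())
    print("PIP_USER looks enabled:", user_install_forced())
```

If `PIP_USER` does turn out to be set, that would explain why the install fails only inside a virtualenv created with `system_site_packages=False`, where user site-packages are not visible.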
 

Can anyone suggest how to fix this, or what the problem might be?