#amazon-web-services #amazon-ec2 #crash-reports #amazon-linux-2
Question:
Yesterday we lost contact with 10 identically configured servers. After some investigation we concluded that the reboot following security updates had failed.
So far we have not managed to bring any of the servers back into an operational state, but we were fortunate enough to be able to reinstall the instances without losing data.
I'll paste the console log below. Can anyone help me identify the root cause, and perhaps give me some advice on whether there is a better way to configure the server so that recovery is easier (e.g. getting past the "Press Enter to continue" prompt, which seems to hang)?
The full log is too large for SO, so I put it in a pastebin and pasted an edited version below. I removed the escape sequences that color the output and some double newlines, but otherwise it is complete.
[ 0.000000] Linux version 4.14.200-155.322.amzn2.x86_64 (mockbuild@ip-10-0-1-230) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-10) (GCC)) #1 SMP Thu Oct 15 20:11:12 UTC 2020
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: Amazon EC2 t3.micro/, BIOS 1.0 10/16/2017
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] e820: last_pfn = 0x3e3fa max_arch_pfn = 0x400000000
[ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000000] Scanning 1 areas for low memory corruption
[ 0.000000] Using GB pages for direct mapping
[ 0.000000] RAMDISK: [mem 0x3433e000-0x36196fff]
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000000F8F80 000014 (v00 AMAZON)
[ 0.000000] ACPI: RSDT 0x000000003E3FE360 00003C (v01 AMAZON AMZNRSDT 00000001 AMZN 00000001)
[ 0.000000] ACPI: FACS 0x000000003E3FFF40 000040
[ 0.000000] ACPI: SSDT 0x000000003E3FF6C0 00087A (v01 AMAZON AMZNSSDT 00000001 AMZN 00000001)
[ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
[ 0.000000] e820: [mem 0x40000000-0xdfffffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on KVM
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Memory: 943540K/1019488K available (10252K kernel code, 1958K rwdata, 2780K rodata, 2088K init, 4240K bss, 75948K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] Kernel/User page tables isolation: enabled
[ 0.000000] ftrace: allocating 26683 entries in 105 pages
[ 0.004000] Hierarchical RCU implementation.
[ 0.004000] RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=2.
[ 0.004000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[ 0.004000] NR_IRQS: 524544, nr_irqs: 440, preallocated irqs: 16
[ 0.004000] Console: colour VGA 80x25
[ 0.004000] console [tty0] enabled
[ 0.004000] console [ttyS0] enabled
[ 0.004005] tsc: Detected 2500.000 MHz processor
[ 0.007582] Calibrating delay loop (skipped) preset value.. 5000.00 BogoMIPS (lpj=10000000)
[ 0.008002] pid_max: default: 32768 minimum: 301
[ 0.012006] ACPI: Core revision 20170728
[ 0.016560] ACPI: 2 ACPI AML tables successfully acquired and loaded
[ 0.020015] Security Framework initialized
[ 0.024002] SELinux: Initializing.
[ 0.028159] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.032082] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.036012] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.040006] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.044325] Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
[ 0.048003] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[ 0.052003] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[ 0.056003] Spectre V2 : Mitigation: Full generic retpoline
[ 0.060002] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 0.064002] Speculative Store Bypass: Vulnerable
[ 0.067720] TAA: Vulnerable: Clear CPU buffers attempted, no microcode
[ 0.068002] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[ 0.072086] Freeing SMP alternatives memory: 24K
[ 0.076807] smpboot: Max logical packages: 1
[ 0.080264] x2apic enabled
[ 0.084003] Switched APIC routing to physical x2apic.
[ 0.088000] ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 0.088000] smpboot: CPU0: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz (family: 0x6, model: 0x55, stepping: 0x4)
[ 0.088074] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
[ 0.092046] Hierarchical SRCU implementation.
[ 0.095857] NMI watchdog: Perf event create on CPU 0 failed with -2
[ 0.096002] NMI watchdog: Perf NMI watchdog permanently disabled
[ 0.100049] smp: Bringing up secondary CPUs ...
[ 0.103696] x86: Booting SMP configuration:
[ 0.104003] .... node #0, CPUs: #1
[ 0.004000] kvm-clock: cpu 1, msr 0:3e357041, secondary cpu clock
[ 0.106853] KVM setup async PF for cpu 1
[ 0.107214] kvm-stealtime: cpu 1, msr 3e1161c0
[ 0.112307] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[ 0.116006] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
[ 0.120007] smp: Brought up 1 node, 2 CPUs
[ 0.123417] smpboot: Total of 2 processors activated (10000.00 BogoMIPS)
[ 0.124320] devtmpfs: initialized
[ 0.126970] x86/mm: Memory block size: 128MB
[ 0.128137] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.132008] futex hash table entries: 512 (order: 3, 32768 bytes)
[ 0.136156] NET: Registered protocol family 16
[ 0.139769] cpuidle: using governor ladder
[ 0.140013] cpuidle: using governor menu
[ 0.143281] ACPI: bus type PCI registered
[ 0.144000] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 0.148144] PCI: Using configuration type 1 for base access
[ 0.156770] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[ 0.160017] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.164044] ACPI: Added _OSI(Module Device)
[ 0.168007] ACPI: Added _OSI(Processor Device)
[ 0.172007] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.176004] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.180007] ACPI: Interpreter enabled
[ 0.184011] ACPI: (supports S0 S4 S5)
[ 0.187094] ACPI: Using IOAPIC for interrupt routing
[ 0.188018] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.300750] ACPI: Enabled 16 GPEs in block 00 to 0F
[ 0.308023] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 0.312007] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
[ 0.316010] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[ 0.320007] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[ 0.328324] acpiphp: Slot [3] registered
[ 0.420040] acpiphp: Slot [31] registered
[ 0.424003] PCI host bridge to bus 0000:00
[ 0.536451] pci 0000:00:03.0: vgaarb: setting as boot VGA device
[ 0.540000] pci 0000:00:03.0: vgaarb: VGA device added: decodes=io mem,owns=io mem,locks=none
[ 0.548009] pci 0000:00:03.0: vgaarb: bridge control possible
[ 0.551996] vgaarb: loaded
[ 0.556090] EDAC MC: Ver: 3.0.0
[ 0.559140] PCI: Using ACPI for IRQ routing
[ 0.560280] NetLabel: Initializing
[ 0.563268] NetLabel: domain hash size = 128
[ 0.568019] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 0.571902] NetLabel: unlabeled traffic allowed by default
[ 0.576145] clocksource: Switched to clocksource kvm-clock
[ 0.586755] VFS: Disk quotas dquot_6.6.0
[ 0.590090] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.594562] pnp: PnP ACPI init
[ 0.597855] pnp: PnP ACPI: found 5 devices
[ 0.608231] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 0.614881] NET: Registered protocol family 2
[ 0.618324] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.622749] TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
[ 0.626965] TCP: Hash tables configured (established 8192 bind 8192)
[ 0.631170] UDP hash table entries: 512 (order: 2, 16384 bytes)
[ 0.635163] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[ 0.639358] NET: Registered protocol family 1
[ 0.642779] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 0.646797] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.651113] pci 0000:00:03.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 0.657825] Unpacking initramfs...
[ 0.734208] Freeing initrd memory: 31076K
[ 0.737636] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x240939f1bb2, max_idle_ns: 440795263295 ns
[ 0.745181] Scanning for low memory corruption every 60 seconds
[ 0.750602] audit: initializing netlink subsys (disabled)
[ 0.754606] audit: type=2000 audit(1603879247.564:1): state=initialized audit_enabled=0 res=1
[ 0.754917] Initialise system trusted keyrings
[ 0.764927] Key type blacklist registered
[ 0.768266] workingset: timestamp_bits=36 max_order=18 bucket_order=0
[ 0.773861] zbud: loaded
[ 0.905903] Key type asymmetric registered
[ 0.909292] Asymmetric key parser 'x509' registered
[ 0.912915] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[ 0.918972] io scheduler noop registered (default)
[ 0.922543] io scheduler cfq registered
[ 0.925904] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
[ 0.964594] crc32c_combine: 8373 self tests passed
[ 0.968628] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 1.000785] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 1.007649] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[ 1.014310] i8042: Warning: Keylock active
[ 1.018572] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 1.022414] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 1.026284] rtc_cmos 00:00: RTC can wake from S4
[ 1.030475] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
[ 1.034755] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
[ 1.038955] hidraw: raw HID events driver (C) Jiri Kosina
[ 1.042936] NET: Registered protocol family 17
[ 1.046622] mce: Using 32 MCE banks
[ 1.049627] sched_clock: Marking stable (1049607566, 0)->(1755024155, -705416589)
[ 1.056014] registered taskstats version 1
[ 1.059279] Loading compiled-in X.509 certificates
[ 1.064832] Loaded X.509 cert 'Build time autogenerated kernel key: 121ffea65ca15230f4a21fe7e5b65abaabaa433c'
[ 1.072013] zswap: loaded using pool lzo/zbud
[ 1.075526] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
[ 1.079746] ima: Allocated hash algorithm: sha1
[ 1.083589] rtc_cmos 00:00: setting system clock to 2020-10-28 09:59:31 UTC (1603879171)
[ 1.091820] Freeing unused kernel memory: 2088K
[ 1.116102] Write protecting the kernel read-only data: 16384k
[ 1.120697] Freeing unused kernel memory: 2016K
[ 1.126528] Freeing unused kernel memory: 1316K
[ 1.160972] systemd[1]: Inserted module 'autofs4'
[ 1.176133] NET: Registered protocol family 10
[ 1.181508] Segment Routing with IPv6
[ 1.184828] systemd[1]: Inserted module 'ipv6'
[ 1.189116] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.193763] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.198171] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.205354] systemd[1]: systemd 219 running in system mode. ( PAM AUDIT SELINUX IMA -APPARMOR SMACK SYSVINIT UTMP LIBCRYPTSETUP GCRYPT GNUTLS ACL XZ LZ4 -SECCOMP BLKID ELFUTILS KMOD IDN)
[ 1.217384] systemd[1]: Detected virtualization kvm.
[ 1.221077] systemd[1]: Detected architecture x86-64.
[ 1.224774] systemd[1]: Running in initial RAM disk.
Welcome to Amazon Linux 2 dracut-033-535.amzn2.1.3 (Initramfs)
[ 1.230712] systemd[1]: No hostname configured.
[ 1.234213] systemd[1]: Set hostname to <localhost>.
[ 1.237934] systemd[1]: Initializing machine ID from KVM UUID.
[ OK ] Reached target Swap.
[ 1.265844] systemd[1]: Reached target Swap.
[ 1.269312] systemd[1]: Starting Swap.
[ OK ] Created slice Root Slice.
[ 1.274036] systemd[1]: Created slice Root Slice.
[ OK ] Listening on Journal Socket.
[ OK ] Reached target Timers.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Reached target Local File Systems.
[ OK ] Listening on udev Control Socket.
[ OK ] Created slice System Slice.
Starting Setup Virtual Console...
Starting Journal Service...
Starting Create list of required st... nodes for the current kernel...
Starting Apply Kernel Variables...
[ OK ] Reached target Slices.
[ OK ] Listening on udev Kernel Socket.
[ OK ] Reached target Sockets.
Starting dracut cmdline hook...
[ OK ] Started Setup Virtual Console.
[ OK ] Started Create list of required sta...ce nodes for the current kernel.
[ OK ] Started Apply Kernel Variables.
Starting Create Static Device Nodes in /dev...
[ OK ] Started Create Static Device Nodes in /dev.
[ OK ] Started Journal Service.
[ OK ] Started dracut cmdline hook.
Starting dracut pre-udev hook...
[ 1.390579] device-mapper: uevent: version 1.0.3
[ 1.394255] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
[ OK ] Started dracut pre-udev hook.
Starting udev Kernel Device Manager...
[ OK ] Started udev Kernel Device Manager.
Starting dracut pre-trigger hook...
[ OK ] Started dracut pre-trigger hook.
Starting udev Coldplug all Devices...
[ OK ] Started udev Coldplug all Devices.
Starting Show Plymouth Boot Screen...
[ OK ] Reached target System Initialization.
Starting dracut initqueue hook...
[ 1.534629] nvme nvme0: pci function 0000:00:04.0
[ OK ] Started Show Plymouth Boot Screen.
[ OK ] Reached target Paths.
[ OK ] Reached target Basic System.
[ 1.543815] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[ 1.546543] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 1.556607] nvme nvme1: pci function 0000:00:1f.0
[ 1.557854] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
[ 1.576394] AVX2 version of gcm_enc/dec engaged.
[ 1.580503] AES CTR mode by8 optimization enabled
[ 1.601321] alg: No test for pcbc(aes) (pcbc-aes-aesni)
[ 1.776495] nvme0n1: p1 p128
[ 1.908576] random: fast init done
[ OK ] Found device /dev/disk/by-uuid/a1e1011e-e38f-408e-878b-fed395b47ad6.
Starting File System Check on /dev/...e-e38f-408e-878b-fed395b47ad6...
[ OK ] Started File System Check on /dev/d...11e-e38f-408e-878b-fed395b47ad6.
[ OK ] Started dracut initqueue hook.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
Starting dracut pre-mount hook...
[ OK ] Started dracut pre-mount hook.
Mounting /sysroot...
[ 2.235770] SGI XFS with ACLs, security attributes, no debug enabled
[ 2.242333] XFS (nvme0n1p1): Mounting V5 Filesystem
[ 4.142597] XFS (nvme0n1p1): Ending clean mount
[ OK ] Mounted /sysroot.
[ OK ] Reached target Initrd Root File System.
Starting Reload Configuration from the Real Root...
[ OK ] Started Reload Configuration from the Real Root.
[ OK ] Reached target Initrd File Systems.
[ OK ] Reached target Initrd Default Target.
Starting dracut pre-pivot and cleanup hook...
[ OK ] Started dracut pre-pivot and cleanup hook.
Starting Cleaning Up and Shutting Down Daemons...
[ OK ] Stopped Cleaning Up and Shutting Down Daemons.
[ OK ] Stopped target Timers.
[ OK ] Stopped dracut pre-pivot and cleanup hook.
Stopping dracut pre-pivot and cleanup hook...
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped target Remote File Systems (Pre).
[ OK ] Stopped target Initrd Default Target.
Starting Plymouth switch root service...
[ OK ] Stopped dracut pre-mount hook.
Stopping dracut pre-mount hook...
[ OK ] Stopped dracut initqueue hook.
Stopping dracut initqueue hook...
[ OK ] Stopped target Basic System.
[ OK ] Stopped target Sockets.
[ OK ] Stopped target System Initialization.
[ OK ] Stopped target Swap.
[ OK ] Stopped target Local File Systems.
[ OK ] Stopped Apply Kernel Variables.
Stopping Apply Kernel Variables...
[ OK ] Stopped target Local Encrypted Volumes.
[ OK ] Stopped udev Coldplug all Devices.
Stopping udev Coldplug all Devices...
[ OK ] Stopped dracut pre-trigger hook.
Stopping dracut pre-trigger hook...
Stopping udev Kernel Device Manager...
[ OK ] Stopped target Slices.
[ OK ] Stopped target Paths.
[ OK ] Stopped udev Kernel Device Manager.
[ OK ] Stopped Create Static Device Nodes in /dev.
Stopping Create Static Device Nodes in /dev...
[ OK ] Stopped Create list of required sta...ce nodes for the current kernel.
Stopping Create list of required st... nodes for the current kernel...
[ OK ] Stopped dracut pre-udev hook.
Stopping dracut pre-udev hook...
[ OK ] Stopped dracut cmdline hook.
Stopping dracut cmdline hook...
[ OK ] Closed udev Kernel Socket.
[ OK ] Closed udev Control Socket.
Starting Cleanup udevd DB...
[ OK ] Started Cleanup udevd DB.
[ OK ] Reached target Switch Root.
[ 4.553875] systemd-journald[667]: Received SIGTERM from PID 1 (systemd).
[ OK ] Started Plymouth switch root service.
Starting Switch Root...
[ 4.885212] systemd: 30 output lines suppressed due to ratelimiting
[ 5.925390] SELinux: Disabled at runtime.
[ 5.980115] audit: type=1404 audit(1603879176.396:2): selinux=0 auid=4294967295 ses=4294967295
[ 6.083250] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 6.106470] systemd[1]: Inserted module 'ip_tables'
Welcome to Amazon Linux 2
[ OK ] Stopped Switch Root.
[ OK ] Stopped Journal Service.
Starting Journal Service...
[ OK ] Reached target Swap.
[ OK ] Listening on Delayed Shutdown Socket.
Mounting Huge Pages File System...
[ OK ] Stopped target Switch Root.
[ OK ] Stopped target Initrd Root File System.
[ OK ] Created slice system-getty.slice.
[ OK ] Listening on udev Control Socket.
[ OK ] Listening on Device-mapper event daemon FIFOs.
[ OK ] Created slice User and Session Slice.
Starting Create list of required st... nodes for the current kernel...
[ OK ] Listening on LVM2 poll daemon socket.
[ OK ] Stopped target Initrd File Systems.
[ OK ] Listening on udev Kernel Socket.
Mounting Debug File System...
[ OK ] Reached target Slices.
[ OK ] Listening on LVM2 metadata daemon socket.
Mounting POSIX Message Queue File System...
[ OK ] Created slice system-selinuxx2dpol...gratex2dlocalx2dchanges.slice.
Starting Monitoring of LVM2 mirrors... dmeventd or progress polling...
[ OK ] Created slice system-serialx2dgetty.slice.
Starting Read and set NIS domainname from /etc/sysconfig/network...
[ OK ] Listening on /dev/initctl Compatibility Named Pipe.
[ OK ] Set up automount Arbitrary Executab...ats File System Automount Point.
Starting Remount Root and Kernel File Systems...
[ OK ] Started Journal Service.
[ OK ] Mounted Debug File System.
[ OK ] Mounted POSIX Message Queue File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Started Create list of required sta...ce nodes for the current kernel.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Started Read and set NIS domainname from /etc/sysconfig/network.
Starting udev Coldplug all Devices...
Starting Configure read-only root support...
Starting Relabel kernel modules early in the boot, if needed...
Starting Create Static Device Nodes in /dev...
Starting Flush Journal to Persistent Storage...
[ OK ] Started Relabel kernel modules early in the boot, if needed.
Starting Load Kernel Modules...
[ 7.047237] systemd-journald[1398]: Received request to flush runtime journal from PID 1
[ 7.069936] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.2.10g
[ 7.084119] ena: ena device version: 0.10
[ 7.089001] ena: ena controller version: 0.0.1 implementation version 1
[ OK ] Started Configure read-only root support.
Starting Load/Save Random Seed...
[ OK ] Started Load/Save Random Seed.
[ 7.156042] ena 0000:00:05.0: LLQ is not supported Fallback to host mode policy.
[ OK ] Started udev Coldplug all Devices.
Starting udev Wait for Complete Device Initialization...
[ 7.181318] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem febf4000, mac addr 0a:cf:65:4e:dd:ff
[ OK ] Started Load Kernel Modules.
Starting Apply Kernel Variables...
[ OK ] Started LVM2 metadata daemon.
Starting LVM2 metadata daemon...
[ OK ] Started Apply Kernel Variables.
[ OK ] Started Create Static Device Nodes in /dev.
Starting udev Kernel Device Manager...
[ OK ] Started Flush Journal to Persistent Storage.
[ OK ] Started udev Kernel Device Manager.
[ OK ] Found device /dev/ttyS0.
[ 7.776329] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 7.783413] ACPI: Power Button [PWRF]
[ 7.786723] input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input4
[ 7.793032] ACPI: Sleep Button [SLPF]
Starting Relabel kernel modules early in the boot, if needed...
[ OK ] Created slice system-ec2netx2difup.slice.
[ OK ] Started Relabel kernel modules early in the boot, if needed.
[ 7.888784] input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input5
[ 7.904661] mousedev: PS/2 mouse device common for all mice
[ OK ] Started udev Wait for Complete Device Initialization.
Starting Activation of DM RAID sets...
[ OK ] Started Activation of DM RAID sets.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Started Monitoring of LVM2 mirrors,...ng dmeventd or progress polling.
[ OK ] Reached target Local File Systems (Pre).
[ 59.305661] random: crng init done
[ 59.308921] random: 7 urandom warning(s) missed due to ratelimiting
[ TIME ] Timed out waiting for device dev-sdf.device.
[DEPEND] Dependency failed for /home/storage.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
[DEPEND] Dependency failed for Migrate local... structure to the new structure.
Starting Preprocess NFS configuration...
[ OK ] Reached target Timers.
[ OK ] Reached target Network (Pre).
[ OK ] Reached target Login Prompts.
[ OK ] Reached target Cloud-init target.
Starting Initial hibernation setup job...
Starting Initial cloud-init job (metadata service crawler)...
[ OK ] Reached target Network.
[ OK ] Reached target Paths.
[ OK ] Reached target Sockets.
Starting Create Volatile Files and Directories...
[ OK ] Started Emergency Shell.
Starting Emergency Shell...
[ OK ] Reached target Emergency Mode.
Starting Tell Plymouth To Write Out Runtime Data...
[ OK ] Started Preprocess NFS configuration.
[ OK ] Started Create Volatile Files and Directories.
Mounting RPC Pipe File System...
Starting Security Auditing Service...
Starting RPC bind service...
[ 97.160193] RPC: Registered named UNIX socket transport module.
[ 97.160194] RPC: Registered udp transport module.
[ 97.160194] RPC: Registered tcp transport module.
[ 97.160195] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ OK ] Mounted RPC Pipe File System.
[ OK ] Reached target rpc_pipefs.target.
[ OK ] Reached target NFS client services.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
[ OK ] Started Tell Plymouth To Write Out Runtime Data.
[ OK ] Started RPC bind service.
[ OK ] Started Security Auditing Service.
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started Update UTMP about System Boot/Shutdown.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
[ 99.871085] hibinit-agent[1855]: Traceback (most recent call last):
[ 99.871339] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 496, in <module>
[ 99.871592] hibinit-agent[1855]: main()
[ 99.872080] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 435, in main
[ 99.872516] hibinit-agent[1855]: if not hibernation_enabled(config.state_dir):
[ 99.873017] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 390, in hibernation_enabled
[ 99.873487] hibinit-agent[1855]: imds_token = get_imds_token()
[ 99.873793] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 365, in get_imds_token
[ 99.875332] hibinit-agent[1855]: response = requests.put(token_url, headers=request_header)
[ 99.877065] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 121, in put
[ 99.877230] hibinit-agent[1855]: return request('put', url, data=data, **kwargs)
[ 99.877959] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
[ 99.878225] hibinit-agent[1855]: response = session.request(method=method, url=url, **kwargs)
[ 99.878614] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 486, in request
[ 99.879747] hibinit-agent[1855]: resp = self.send(prep, **send_kwargs)
[ 99.880157] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 598, in send
[ 99.884411] hibinit-agent[1855]: r = adapter.send(request, **kwargs)
[ 99.884728] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 419, in send
[ 99.892094] hibinit-agent[1855]: raise ConnectTimeout(e, request=request)
[ 99.892377] hibinit-agent[1855]: requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7efc029fa390>: Failed to establish a new connection: [Errno 101] Network is unreachable',))
[FAILED] Failed to start Initial hibernation setup job.
See 'systemctl status hibinit-agent.service' for details.
[ 101.215791] cloud-init[1856]: Cloud-init v. 19.3-3.amzn2 running 'init' at Wed, 28 Oct 2020 10:01:11 0000. Up 101.18 seconds.
[ 101.264707] cloud-init[1856]: ci-info: Net device info
[ 101.264940] cloud-init[1856]: ci-info: -------- ------- ----------- ----------- ------- -------------------
[ 101.272469] cloud-init[1856]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
[ 101.274166] cloud-init[1856]: ci-info: -------- ------- ----------- ----------- ------- -------------------
[ 101.274497] cloud-init[1856]: ci-info: | eth0 | False | . | . | . | 0a:cf:65:4e:dd:ff |
[ 101.284890] cloud-init[1856]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[ 101.286727] cloud-init[1856]: ci-info: | lo | True | ::1/128 | . | host | . |
[ 101.286986] cloud-init[1856]: ci-info: -------- ------- ----------- ----------- ------- -------------------
[ 101.291933] cloud-init[1856]: ci-info: Route IPv6 info
[ 101.292215] cloud-init[1856]: ci-info: ------- ------------- --------- ----------- -------
[ 101.294122] cloud-init[1856]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ 101.294383] cloud-init[1856]: ci-info: ------- ------------- --------- ----------- -------
[ 101.294543] cloud-init[1856]: ci-info: ------- ------------- --------- ----------- -------
Welcome to emerg
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.
Answer 1:
OK, shortly after posting we figured it out. It seems the mount point had changed (I expect because of the Linux kernel update), and we had not used the nofail option in /etc/fstab as described in the AWS Knowledge Center. This caused the server to hang at boot.
Going forward we will also make sure to mount by UUID, so that we do not depend on the device naming in /dev/.
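The two precautions above (nofail on non-root mounts, UUID instead of a /dev/ path) can be checked mechanically. The following is a minimal sketch, not a tool from the thread: the function name audit_fstab and the sample fstab text are hypothetical. The root UUID, /dev/sdf and /home/storage are taken from the console log; the ext4 type and the mount options are assumptions made for illustration.

```python
def audit_fstab(text):
    """Return (mount_point, problem) pairs for risky fstab entries.

    Root and swap entries are skipped: the root file system must mount
    for the boot to proceed anyway, so nofail is not expected there.
    """
    findings = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry; ignored in this sketch
        spec, mount_point, fs_type, options = fields[:4]
        if mount_point == "/" or fs_type == "swap":
            continue
        if "nofail" not in options.split(","):
            findings.append((mount_point, "missing nofail"))
        if spec.startswith("/dev/"):
            findings.append((mount_point, "device path instead of UUID/LABEL"))
    return findings


# Hypothetical fstab content; the second line mirrors the failing mount
# in the log (dev-sdf.device -> /home/storage), file system type assumed.
SAMPLE = """\
UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 / xfs defaults,noatime 1 1
/dev/sdf /home/storage ext4 defaults 0 2
"""

if __name__ == "__main__":
    for mount_point, problem in audit_fstab(SAMPLE):
        print(f"{mount_point}: {problem}")
```

For the sample text, /home/storage is flagged on both counts. On a real server the input would be the contents of /etc/fstab, and an entry such as `UUID=<volume-uuid> /home/storage ext4 defaults,nofail 0 2` (UUID placeholder to be filled in from blkid) would pass the check.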
Comments:
1. This happened to me, and we were using UUID naming. In our case we had moved a file system from one VM to another and forgotten to remove the /etc/fstab entry. Oops. It seems nofail should be my default policy for everything in /etc/fstab that is not the root fs. Everything else should be fixable over ssh, which requires … ssh to actually start.
Answer 2:
I think I have narrowed this down to the ec2-utils package. We had the same problem with devices not being mounted correctly, which we initially thought was related to the ENA or NVMe driver. Once we ran yum update, it was resolved.
When the ec2-utils package is downgraded to version ec2-utils-1.2-2.amzn2, the problem returns. It appears to affect only Nitro-based instances. To fix it, you can temporarily boot as a t2 or another older instance type and update the package.
Comments:
1. Yes, this may well be the root cause of our outage. Looking through our yum history for ec2-utils, I see that we received 1.2-2 on the morning of our outage, and that 1.2-3, which appears to fix the problem, was released two days later. Good catch.