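Notes on getting stuck with a network install
I had a CentOS network install CD, so I booted the PE750 (PowerEdge 750) from it.
At the stage where the installer fetches data from the server, it stopped with:
The CentOS installation tree in that directory does not seem to match your boot media.
and would not go any further.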
http://bugs.centos.org/view.php?id=2937
describes the same symptom. Apparently the install fails because the md5sum differs, so I downloaded
http://ftp2.riken.jp/Linux/centos/5/isos/i386/CentOS-5.2-i386-netinstall.iso
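burned it to a CD, and tried again; this time the install completed without problems.
It looks like the network install CD I had was for 5.1, which is why it failed.
To avoid repeating the mistake, I wrote the burn date and "centos 5.2" on the CD with a marker.
So this means I had not done a single install since 5.2 was released: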
http://lists.centos.org/pipermail/centos-announce/2008-June/014999.html
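Before burning a freshly downloaded image it may be worth checking it against the checksum list published next to it. The following is a minimal sketch, not part of the original procedure: the function name verify_iso is hypothetical, and it assumes the mirror provides a checksum file in the usual md5sum output format (for example a file named md5sum.txt, which is an assumption here).

```shell
# verify_iso ISO SUMFILE
# Succeeds only if the MD5 of ISO matches the entry for its file name in SUMFILE.
# SUMFILE is assumed to be in `md5sum` output format ("<hash>  <name>").
verify_iso() {
    iso=$1
    sumfile=$2
    name=$(basename "$iso")
    # Look up the expected hash; a leading "*" (binary-mode marker) is ignored.
    expected=$(awk -v n="$name" '{ f = $2; sub(/^\*/, "", f); if (f == n) { print $1; exit } }' "$sumfile")
    if [ -z "$expected" ]; then
        echo "no entry for $name in $sumfile" >&2
        return 2
    fi
    actual=$(md5sum "$iso" | awk '{ print $1 }')
    [ "$actual" = "$expected" ]
}
```

Usage would be something like: verify_iso CentOS-5.2-i386-netinstall.iso md5sum.txt && echo OK. This only confirms that the download is intact; it does not tell you whether the boot CD and the install tree are the same release, which is what went wrong above.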
PowerEdge SC1435 BIOS changes
2007/10/10 1.2.12
This BIOS adds support for Quad-Core AMD Opteron(R) Processor of 2000 series.
Fixes:
Fixed potential issue where an extra ECC error may appear in the System
Event Log after a MBE event occurs.
Enhancements:
Added support for Quad-Core AMD Opteron(R) processors of 2000 series.
Enabled the DIMM address parity check for Quad-Core AMD Opteron(R)
processors of 2000 series.
Added the AMD Memory Optimizer Technology option in BIOS setup for
Quad-Core AMD Opteron(R) processors.
Updated BMC-BIOS Binary Option ROM to version 1.05.
Added support for OROMs that need more than 1MB of OROM space.
Added AMD Virtualization support for Dual-Core AMD Opteron(R) processors.
PowerEdge 750 BIOS changes
2004/12/07 A03
Added Intel(R) Celeron(R) Processor (533 Mhz) with 256K Cache support.
Added Intel(R) Pentium(R) 4 with 1M Cache Processor E0 Stepping Microcode(Patch ID=09).
Added Intel(R) Pentium(R) 4 with 1M Cache Processor D0 Stepping Microcode(Patch ID=13).
Added Intel(R) Pentium(R) 4 with 512K Cache Processor D1 Stepping Microcode(Patch ID=2E).
Increased Read timeout for certain optical drives.
2005/03/28 A04
Added Intel(R) Pentium(R) 4 with 1M Cache Processor E0 Stepping Microcode(Patch ID=12).
Added Intel(R) Pentium(R) 4 with 1M Cache Processor D0 Stepping Microcode(Patch ID=14).
2005/08/25 A05
Added Intel(R) Pentium(R) 4 with 1M Cache Processor E0 Stepping Microcode(Patch ID=17).
Added Intel(R) Pentium(R) 4 with 1M Cache Processor D0 Stepping Microcode(Patch ID=17).
Added Intel(R) Pentium(R) 4 with 1M Cache Processor C0 Stepping Microcode(Patch ID=0C).
2006/02/06 A06
Added Intel(R) Pentium(R) 4 with 1M Cache Processor G1 Stepping Microcode (Patch ID=03).
Added ATA hard disk Security Freeze feature.
Fixed the Unreported IO failure in HCT 12.1 with WS03 SP1.
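Downloads are available from the link below.
Installing lm_sensors on Debian etch
When I removed the CPU, the plastic clip retainer underneath it broke, and the machine was down for a day.
It seems heat had made the plastic brittle...
The CPU temperature rise was logged in /var/log/messages, so a careful read of the logwatch reports would have caught it.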
Sep 9 22:30:14 debian kernel: CPU1: Temperature/speed normal
Sep 9 22:30:14 debian kernel: CPU0: Temperature/speed normal
Sep 9 22:30:25 debian kernel: CPU0: Temperature/speed normal
Sep 9 22:30:25 debian kernel: CPU1: Temperature/speed normal
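Rather than relying on reading logwatch closely, the thermal messages can be pulled out of the log directly. A minimal sketch: the function names are hypothetical, and the patterns are simply the kernel message texts that appear in the log excerpts in these notes.

```shell
# thermal_events FILE
# Print every kernel thermal message (overheat, throttling, back to normal) in FILE.
thermal_events() {
    grep -E 'kernel: CPU[0-9]+: (Temperature above threshold|Running in modulated clock mode|Temperature/speed normal)' "$1"
}

# thermal_alert_count FILE
# Print how many "Temperature above threshold" lines FILE contains.
thermal_alert_count() {
    grep -c 'Temperature above threshold' "$1"
}
```

For example, thermal_events /var/log/messages lists the events, and thermal_alert_count /var/log/messages gives a number that could be checked from a daily job.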
Installing lm_sensors. The trap is that the package name is lm-sensors (with a hyphen).
apt-get install lm-sensors
sensors-detect
In sensors-detect, accept the defaults, save the configuration, then reboot.
$ sensors
w83627thf-isa-0290
Adapter: ISA adapter
VCore:     +1.35 V  (min =  +0.70 V, max =  +1.87 V)
+12V:     +12.10 V  (min = +10.64 V, max =  +7.11 V)       ALARM
+3.3V:     +3.34 V  (min =  +2.22 V, max =  +3.86 V)
+5V:       +5.07 V  (min =  +1.71 V, max =  +5.81 V)
-12V:      -7.01 V  (min =  -6.85 V, max = -14.25 V)       ALARM
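To see at a glance which readings are flagged, the sensors output can be filtered for ALARM lines. A small sketch; the function name sensor_alarms is hypothetical and it assumes the flag appears at the end of the line, as in the output above. In that output the 12V maximum is lower than its minimum, so those ALARM flags may come from unconfigured limits rather than a real fault; that is a guess, not something verified here.

```shell
# sensor_alarms
# Read `sensors` output on stdin and print the label of every reading flagged ALARM.
sensor_alarms() {
    # Keep only lines ending in ALARM, then strip everything from the first ":" on.
    awk '/ALARM[ \t]*$/ { sub(/:.*/, ""); print }'
}
```

Usage: sensors | sensor_alarms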
I confirmed that readings can be obtained on the other Debian etch machines as well, and added them to the munin configuration.
ln -s /usr/share/munin/plugins/sensors_ /etc/munin/plugins/sensors_fan
ln -s /usr/share/munin/plugins/sensors_ /etc/munin/plugins/sensors_temp
ln -s /usr/share/munin/plugins/sensors_ /etc/munin/plugins/sensors_volt
/etc/init.d/munin-node stop
/etc/init.d/munin-node start
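When the same setup has to be repeated on several machines, the three links can be created in a loop that is safe to run more than once. A sketch only: the function name is hypothetical, and the default directories are the ones used in the commands above.

```shell
# link_sensor_plugins [SRC_DIR] [DEST_DIR]
# Link the wildcard plugin SRC_DIR/sensors_ as sensors_fan, sensors_temp and
# sensors_volt in DEST_DIR. Existing links are replaced (ln -sf), so rerunning is harmless.
link_sensor_plugins() {
    src=${1:-/usr/share/munin/plugins}
    dest=${2:-/etc/munin/plugins}
    for kind in fan temp volt; do
        ln -sf "$src/sensors_" "$dest/sensors_$kind" || return 1
    done
}
```

After linking, munin-node still has to be restarted as shown above. Running munin-run sensors_temp by hand is one way to check that a plugin returns values before waiting for the graphs.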
■
An in-house Debian machine started spitting out errors I had never seen before:
Message from syslogd@debian at Wed Sep 10 21:30:13 2008 ...
debian kernel: CPU1: Temperature above threshold
Message from syslogd@debian at Wed Sep 10 21:30:13 2008 ...
debian kernel: CPU0: Running in modulated clock mode
Message from syslogd@debian at Wed Sep 10 21:30:13 2008 ...
debian kernel: CPU1: Running in modulated clock mode
Message from syslogd@debian at Wed Sep 10 21:30:13 2008 ...
debian kernel: CPU0: Temperature above threshold
The fans and other parts were getting clogged with dust, so I cleaned them.
Notes on a dead HDD (/dev/hda), IDE
Detection by Logwatch
--------------------- Kernel Begin ------------------------

WARNING:  Kernel Errors Present
   EXT3-fs error (device ide0(3,7 ...:  384 Time(s)
   end_request: I/O error, dev 03:07 (hda) ...:  280 Time(s)
   hda: dma_intr: error=0x40 { Uncorrect ...:  280 Time(s)
   hda: dma_intr: status=0x51 { DriveReady SeekComplete Error } ...:  280 Time(s)

---------------------- Kernel End -------------------------
Mail sent to root by smartmontools
Subject: SMART error (OfflineUncorrectableSector) detected on host: example.jp

This email was generated by the smartd daemon running on:

   host name: example.jp
  DNS domain: my.domain
  NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/hda, 1 Offline uncorrectable sectors

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.
smartmontools Health check error
Subject: SMART error (Health) detected on host: example.jp

This email was generated by the smartd daemon running on:

   host name: example.jp
  DNS domain: my.domain
  NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/hda, FAILED SMART self-check. BACK UP DATA NOW!

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.
Last SMART values before the drive was removed (taken after fsck)
# /usr/sbin/smartctl -a /dev/hda
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model:     ST3120026AS
Serial Number:    3JT46EGC
Firmware Version: 3.18
User Capacity:    120,034,123,776 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Wed Jul 23 06:15:37 2008 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  85) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   054   050   006    Pre-fail  Always       -       153397814
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000f   086   060   030    Pre-fail  Always       -       449964338
  9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       18942
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       7
194 Temperature_Celsius     0x0022   044   053   000    Old_age   Always       -       44
195 Hardware_ECC_Recovered  0x001a   054   050   000    Old_age   Always       -       153397814
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       127
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       127
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 2006 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2006 occurred at disk power-on lifetime: 18941 hours (789 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 c2 32 ff e0  Error: UNC at LBA = 0x00ff32c2 = 16724674

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  24 00 02 c2 32 ff e0 00      21:20:57.513  READ SECTOR(S) EXT
  24 00 18 88 33 ff e0 00      21:20:57.510  READ SECTOR(S) EXT
  24 00 80 08 33 ff e0 00      21:20:57.506  READ SECTOR(S) EXT
  24 00 02 c2 32 ff e0 00      21:20:57.502  READ SECTOR(S) EXT
  24 00 44 c4 32 ff e0 00      21:20:57.499  READ SECTOR(S) EXT

Error 2005 occurred at disk power-on lifetime: 18941 hours (789 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 c2 32 ff e0  Error: UNC at LBA = 0x00ff32c2 = 16724674

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  24 00 02 c2 32 ff e0 00      21:20:57.513  READ SECTOR(S) EXT
  24 00 44 c4 32 ff e0 00      21:20:57.510  READ SECTOR(S) EXT
  24 00 50 b8 32 ff e0 00      21:20:57.506  READ SECTOR(S) EXT
  24 00 18 a0 58 fb e0 00      21:20:57.502  READ SECTOR(S) EXT
  24 00 18 b0 57 fb e0 00      21:20:57.499  READ SECTOR(S) EXT

Error 2004 occurred at disk power-on lifetime: 18941 hours (789 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 0a c2 32 ff e0  Error: UNC at LBA = 0x00ff32c2 = 16724674

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  24 00 50 b8 32 ff e0 00      21:20:57.513  READ SECTOR(S) EXT
  24 00 18 a0 58 fb e0 00      21:20:57.510  READ SECTOR(S) EXT
  24 00 18 b0 57 fb e0 00      21:20:57.506  READ SECTOR(S) EXT
  24 00 18 c0 56 fb e0 00      21:20:57.502  READ SECTOR(S) EXT
  24 00 18 d8 55 fb e0 00      21:20:57.499  READ SECTOR(S) EXT

Error 2003 occurred at disk power-on lifetime: 18941 hours (789 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 c6 32 f7 e0  Error: UNC at LBA = 0x00f732c6 = 16200390

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  24 00 02 c6 32 f7 e0 00      21:20:12.376  READ SECTOR(S) EXT
  24 00 02 c2 32 f7 e0 00      21:20:52.262  READ SECTOR(S) EXT
  24 00 04 c0 32 f7 e0 00      21:20:48.435  READ SECTOR(S) EXT
  24 00 18 88 33 f7 e0 00      21:20:41.941  READ SECTOR(S) EXT
  24 00 80 08 33 f7 e0 00      21:20:41.937  READ SECTOR(S) EXT

Error 2002 occurred at disk power-on lifetime: 18941 hours (789 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 c2 32 f7 e0  Error: UNC at LBA = 0x00f732c2 = 16200386

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  24 00 02 c2 32 f7 e0 00      21:20:12.376  READ SECTOR(S) EXT
  24 00 04 c0 32 f7 e0 00      21:20:12.361  READ SECTOR(S) EXT
  24 00 18 88 33 f7 e0 00      21:20:48.435  READ SECTOR(S) EXT
  24 00 80 08 33 f7 e0 00      21:20:41.941  READ SECTOR(S) EXT
  34 00 08 c0 32 bb e0 00      21:20:41.937  WRITE SECTORS(S) EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     18938         188166888
# 2  Short offline       Completed: read failure       90%     18914         188166888
# 3  Short offline       Completed: read failure       90%     18891         188166888
# 4  Short offline       Completed: read failure       90%     18867         188166888
# 5  Extended offline    Completed: read failure       90%     18866         261165
# 6  Short offline       Completed without error       00%     18843         -
# 7  Short offline       Completed without error       00%     18820         -
# 8  Short offline       Completed without error       00%     18796         -
# 9  Short offline       Completed without error       00%     18773         -
#10  Short offline       Completed without error       00%     18749         -
#11  Short offline       Completed without error       00%     18726         -
#12  Short offline       Completed without error       00%     18702         -
#13  Extended offline    Completed without error       00%     18702         -
#14  Short offline       Completed without error       00%     18678         -
#15  Short offline       Completed without error       00%     18655         -
#16  Short offline       Completed without error       00%     18632         -
#17  Short offline       Completed without error       00%     18608         -
#18  Short offline       Completed without error       00%     18584         -
#19  Short offline       Completed without error       00%     18561         -
#20  Short offline       Completed without error       00%     18537         -
#21  Extended offline    Completed without error       00%     18537         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
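In a dump like the one above, the attributes that matter most for this failure are Current_Pending_Sector and Offline_Uncorrectable (both 127 here). The following sketch pulls a raw value out of the attribute table; the function names are hypothetical, and it assumes the ten-column table layout that smartctl 5.36 printed above, with RAW_VALUE as a plain number in the tenth column.

```shell
# smart_raw NAME
# Read a smartctl attribute table on stdin and print the RAW_VALUE of attribute NAME.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# has_bad_sectors
# Read the same table on stdin; exit 0 if the pending or offline-uncorrectable
# sector count is non-zero, exit 1 otherwise.
has_bad_sectors() {
    awk '($2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable") && $10 + 0 > 0 { bad = 1 }
         END { exit (bad ? 0 : 1) }'
}
```

Usage: /usr/sbin/smartctl -A /dev/hda | smart_raw Current_Pending_Sector prints the count, and /usr/sbin/smartctl -A /dev/hda | has_bad_sectors && echo "back up now" turns it into a simple check.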
Backups really do matter.
Automatic Snort rule updates (oinkmaster)
https://www.snort.org/reg-bin/userprefs.cgi
Register as a user there, log in, and have an oinkcode issued.
apt-get install oinkmaster
vi /etc/oinkmaster.conf
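Set the url line, replacing changeoinkcode with the oinkcode you were issued, then create the backup directory:
url = http://www.snort.org/pub-bin/oinkmaster.cgi/changeoinkcode/snortrules-snapshot-2.3.tar.gz
mkdir /etc/snort/bak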
/usr/sbin/oinkmaster -C /etc/oinkmaster.conf -o /etc/snort/rules -b /etc/snort/bak
Run the command above from cron and the rules will be updated automatically.
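One way to set up the cron job is a file under /etc/cron.d. The sketch below only writes such a file; the function name, the file name, the 03:30 daily schedule and running as root are all assumptions, and only the oinkmaster command line itself comes from the notes above.

```shell
# write_oinkmaster_cron FILE
# Write a cron.d-style entry (with a user field) that runs the rule update
# once a day at 03:30. Schedule and user are illustrative assumptions.
write_oinkmaster_cron() {
    cat > "$1" <<'EOF'
# m h dom mon dow user command
30 3 * * * root /usr/sbin/oinkmaster -C /etc/oinkmaster.conf -o /etc/snort/rules -b /etc/snort/bak
EOF
}
```

Usage: write_oinkmaster_cron /etc/cron.d/oinkmaster. Files in /etc/cron.d need the user field shown here; in a personal crontab edited with crontab -e that field must be left out.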