Hi,
I bought a NUC, and it worked fine for about a week. Now, however, it freezes up if I leave it alone for a while. The last thing I see in dmesg is appended below. I found the following topic regarding 525 120 GB SSD overheating issues.
To verify, I ran smartctl -A /dev/sda | grep Temp and found that under heavy disk load (endlessly copying a 3 GB file), the temperature goes up to 91 °C.
190 Airflow_Temperature_Cel 0x0022 093 105 000 Old_age Always - 93 (Min/Max 21/105)
Could overheating be my problem? The full smartctl -x output is at the end of this post.
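To watch this over time rather than eyeballing grep output, something like the following Python sketch could pull the raw temperature and the Min/Max pair out of an attribute line in the format shown above. The names parse_temperature_line, is_too_hot and WARN_CELSIUS are my own, and the 70 °C limit is an assumed placeholder, not a value from the drive's datasheet.

```python
import re

# Assumed warning limit in degrees Celsius; a placeholder, not a datasheet value.
WARN_CELSIUS = 70

# One row of `smartctl -A` output: ID, name, flag, value, worst, thresh,
# type, updated, when_failed, raw value, optional "(Min/Max a/b)" suffix.
ATTR_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<failed>\S+)\s+"
    r"(?P<raw>\d+)(?:\s+\(Min/Max\s+(?P<min>\d+)/(?P<max>\d+)\))?"
)


def parse_temperature_line(line):
    """Return id, name and raw value (plus min/max if present), or None."""
    m = ATTR_RE.match(line)
    if m is None:
        return None
    result = {
        "id": int(m.group("id")),
        "name": m.group("name"),
        "raw": int(m.group("raw")),
    }
    if m.group("min") is not None:
        result["min"] = int(m.group("min"))
        result["max"] = int(m.group("max"))
    return result


def is_too_hot(parsed, limit=WARN_CELSIUS):
    """True if the current or the recorded maximum temperature exceeds limit."""
    hottest = max(parsed["raw"], parsed.get("max", parsed["raw"]))
    return hottest > limit
```

Feeding it the line above gives a raw value of 93 with a recorded Min/Max of 21/105, so is_too_hot returns True for any limit below 105.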
dmesg when crashing:
[55925.270031] ata1: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
[55925.270158] ata1: irq_stat 0x00400040, connection status changed
[55925.270244] ata1: SError: { PHYRdyChg 10B8B DevExch }
[55925.270319] ata1: hard resetting link
[55925.992782] ata1: SATA link down (SStatus 0 SControl 300)
[55930.987241] ata1: hard resetting link
[55931.306960] ata1: SATA link down (SStatus 0 SControl 300)
[55931.306989] ata1: limiting SATA link speed to 1.5 Gbps
[55936.301463] ata1: hard resetting link
[55936.621066] ata1: SATA link down (SStatus 0 SControl 310)
[55936.621084] ata1.00: disabled
[55936.621114] ata1: EH complete
[55936.621139] sd 0:0:0:0: rejecting I/O to offline device
[55936.621221] sd 0:0:0:0: [sda] killing request
[55936.621318] ata1.00: detaching (SCSI 0:0:0:0)
[55936.621668] Aborting journal on device dm-0-8.
[55936.621772] Buffer I/O error on device dm-0, logical block 12615680
[55936.621851] lost page write due to I/O error on dm-0
[55936.621867] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[55936.624745] EXT2-fs (sda1): previous I/O error to superblock detected
[55936.624745]
[55936.625205] Buffer I/O error on device dm-0, logical block 0
[55936.625309] lost page write due to I/O error on dm-0
[55936.625318] EXT4-fs error (device dm-0): ext4_journal_start_sb:349: Detected aborted journal
[55936.625461] EXT4-fs (dm-0): Remounting filesystem read-only
[55936.625534] EXT4-fs (dm-0): previous I/O error to superblock detected
[55936.625626] Buffer I/O error on device dm-0, logical block 0
[55936.625699] lost page write due to I/O error on dm-0
[55936.626463] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[55936.626513] sd 0:0:0:0: [sda]
[55936.626517] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[55936.626521] sd 0:0:0:0: [sda] Stopping disk
[55936.626535] sd 0:0:0:0: [sda] START_STOP FAILED
[55936.626539] sd 0:0:0:0: [sda]
[55936.626541] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
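The sequence in this log (an exception on ata1, three hard resets, three link-down reports, then ata1.00 being disabled) can be summarised with a short script. This is only a sketch that assumes the bracketed-timestamp dmesg format shown above; summarize_ata_events is a name I made up, and the substring matching is tuned to these particular messages rather than to every libata message.

```python
import re

# "[55925.270031] message" -> timestamp in seconds since boot, then the message.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")


def summarize_ata_events(lines, port="ata1"):
    """Count link resets and link-down reports for one ATA port."""
    resets = 0
    link_down = 0
    first_ts = None
    disabled_ts = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        ts = float(m.group("ts"))
        msg = m.group("msg")
        # Accept "ata1: ..." and "ata1.00: ..." but not "ata10: ...".
        if not msg.startswith((port + ":", port + ".")):
            continue
        if first_ts is None:
            first_ts = ts
        if "hard resetting link" in msg:
            resets += 1
        elif "SATA link down" in msg:
            link_down += 1
        elif msg.endswith(": disabled"):
            disabled_ts = ts
    seconds_to_disable = None
    if first_ts is not None and disabled_ts is not None:
        seconds_to_disable = disabled_ts - first_ts
    return {
        "resets": resets,
        "link_down": link_down,
        "seconds_to_disable": seconds_to_disable,
    }
```

Run over the lines above, it should report 3 resets and 3 link-down events, with roughly 11 seconds between the first ata1 message and the device being disabled.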
Full smartctl -x output:
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.8.0-19-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: INTEL SSDMCEAC120B3
Serial Number: CVLI3022059Z120E
LU WWN Device Id: 5 001517 803d9862e
Firmware Version: LLKi
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ACS-2 (unknown minor revision code: 0xffff)
Local Time is: Thu May 9 08:49:36 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 2930) seconds.
Offline data collection
capabilities: (0x7f) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Abort Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0021) SCT Status supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
5 Reallocated_Sector_Ct -O--CK 100 100 000 - 0
9 Power_On_Hours -O--CK 100 100 000 - 383 (172 140 0)
12 Power_Cycle_Count -O--CK 100 100 000 - 21
170 Unknown_Attribute PO--CK 100 100 010 - 0
171 Unknown_Attribute -O--CK 100 100 000 - 0
172 Unknown_Attribute -O--CK 100 100 000 - 0
174 Unknown_Attribute -O--CK 100 100 000 - 19
183 Runtime_Bad_Block -O--CK 100 100 000 - 0
184 End-to-End_Error PO--CK 100 100 090 - 0
187 Reported_Uncorrect -O--CK 100 100 050 - 0
190 Airflow_Temperature_Cel -O---K 093 105 000 - 93 (Min/Max 21/105)
192 Power-Off_Retract_Count -O--CK 100 100 000 - 19
199 UDMA_CRC_Error_Count -O--CK 100 100 000 - 0
225 Load_Cycle_Count -O--CK 100 100 000 - 3922
226 Load-in_Time -O--CK 100 100 000 - 65535
227 Torq-amp_Count -O--CK 100 100 000 - 4
228 Power-off_Retract_Count -O--CK 100 100 000 - 65535
232 Available_Reservd_Space PO--CK 100 100 010 - 0
233 Media_Wearout_Indicator -O--CK 100 100 000 - 0
241 Total_LBAs_Written -O--CK 100 100 000 - 3922
242 Total_LBAs_Read -O--CK 100 100 000 - 176
249 Unknown_Attribute PO--C- 100 100 000 - 79
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
GP/S Log at address 0x00 has 1 sectors [Log Directory]
GP/S Log at address 0x04 has 1 sectors [Device Statistics log]
SMART Log at address 0x06 has 1 sectors [SMART self-test log]
GP Log at address 0x07 has 1 sectors [Extended self-test log]
SMART Log at address 0x09 has 1 sectors [Selective self-test log]
GP Log at address 0x10 has 1 sectors [NCQ Command Error log]
GP/S Log at address 0x11 has 1 sectors [SATA Phy Event Counters]
GP/S Log at address 0x80 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x81 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x82 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x83 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x84 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x85 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x86 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x87 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x88 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x89 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8a has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8b has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8c has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8d has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8e has 16 sectors [Host vendor specific log]
GP/S Log at address 0x8f has 16 sectors [Host vendor specific log]
GP/S Log at address 0x90 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x91 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x92 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x93 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x94 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x95 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x96 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x97 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x98 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x99 has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9a has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9b has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9c has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9d has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9e has 16 sectors [Host vendor specific log]
GP/S Log at address 0x9f has 16 sectors [Host vendor specific log]
GP/S Log at address 0xb7 has 16 sectors [Device vendor specific log]
GP/S Log at address 0xe0 has 1 sectors [SCT Command/Status]
GP/S Log at address 0xe1 has 1 sectors [SCT Data Transfer]
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log not supported
SMART Extended Self-test Log Version: 0 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 367 -
Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Error unknown SCT Temperature History Format Version (0), should be 2.
Warning: device does not support SCT Error Recovery Control command
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 10 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 8 Device-to-host register FISes sent due to a COMRESET
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0010 2 0 R_ERR response for host-to-device data FIS, non-CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x0013 2 0 R_ERR response for host-to-device non-data FIS, non-CRC
0x0002 2 0 R_ERR response for data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
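The only non-zero counters in the table above are 0x0009 (10) and 0x000a (8). To pick those out automatically, a rough Python sketch along these lines could filter the "SATA Phy Event Counters" section. It assumes the four-column layout shown here (ID, size, value, description), and nonzero_phy_counters is a hypothetical helper name, not part of smartmontools.

```python
import re

# "0x0009 2 10 Transition from drive PhyRdy to drive PhyNRdy"
COUNTER_RE = re.compile(r"^\s*(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def nonzero_phy_counters(text):
    """Return (id, value, description) for every counter whose value is not 0."""
    found = []
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m is None:
            continue  # header or unrelated line
        counter_id, _size, value, description = m.groups()
        if int(value) != 0:
            found.append((counter_id, int(value), description))
    return found
```

Called on the full smartctl -x text, it skips the header row and every zero-valued counter, leaving just the PhyRdy-to-PhyNRdy transitions and the COMRESET count.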