
Antminer Common Error Codes: Complete Analysis and Troubleshooting Methods

In Bitcoin mining, the stable operation of Bitmain Antminers directly determines hashrate output and revenue. Error codes that appear during operation are direct indicators of hardware, environmental, or configuration problems. This article organizes common error codes by fault type and details their causes and tiered troubleshooting steps, to help miners identify issues quickly and minimize downtime.

1. Basic Operations: Obtaining Core Fault Information from Logs

All error codes are recorded in the kernel log of the mining machine, which serves as the primary basis for troubleshooting. To obtain the logs, follow these steps:

  1. Access the backend management interface using the mining machine’s IP address;
  2. Click “System” and find “Kernel Log”;
  3. Copy the log text and filter for entries containing “ERROR,” starting with the earliest one (when multiple faults are logged, the first is often the root cause).

Key Tip: Do not blindly restart the mining machine. A forced restart under certain faults (such as a short circuit or low temperature) can cause irreversible damage. Use the logs to identify the type of problem first.
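
The filtering in step 3 can be scripted once the log text has been copied out. The following is a minimal Python sketch, assuming the log has been pasted into a string. The function name `error_lines` and the non-error line in the sample are illustrative; the error strings are the ones quoted in this article.

```python
def error_lines(log_text: str) -> list[str]:
    """Return the log lines that contain "ERROR", in their original order."""
    return [line.strip() for line in log_text.splitlines() if "ERROR" in line]


# Sample built from log strings quoted in this article; the first line
# is a made-up non-error entry for illustration.
sample_log = """\
fan check started
ERROR_FAN_LOST: fan 1 speed 0 rpm.
Sweep error string = P:1. ERROR_TEMP_TOO_HIGH: Overmax temperature.
"""

errors = error_lines(sample_log)
# The earliest error is often the root cause, so look at it first.
first_fault = errors[0] if errors else None
print(first_fault)
```

In this sample the fan fault is logged before the high-temperature fault, which matches the usual cause-and-effect order described later in this article.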

2. Temperature-Related Error Codes: Key Warnings for Environmental Adaptation

Temperature is the lifeline of a mining machine. Excessively high or low temperatures will trigger protection mechanisms. These error codes are highly common across all models of mining machines.

1. ERROR_TEMP_TOO_HIGH (High Temperature Protection)

Code Meaning: The core temperature of the mining machine has exceeded the safety threshold (usually ≥85°C), triggering automatic shutdown protection. This is common across the entire mining machine series, including the S19, S21, and KS5.

Typical log: Sweep error string = P:1. ERROR_TEMP_TOO_HIGH: Overmax temperature.

Core Causes:

  • Blocked heat dissipation channels (dust, catkins, or insects accumulate on the heat sink);
  • Excessive air inlet temperature in the equipment room (above the recommended upper limit of 35°C);
  • Fan stalling or insufficient speed, resulting in reduced cooling efficiency.

Troubleshooting Steps:

  • Emergency Power Off: Prevent continued high temperatures from burning the chip. Wait until the machine cools down to below 40°C before operating again.
  • Dust Cleaning and Inspection: Use compressed air to clean dust from the heat sink and fans, ensuring that ventilation gaps are clear.
  • Environmental Optimization: Reduce the air inlet temperature in the equipment room (install industrial air conditioning, if possible) and ensure that the exhaust ducts are unobstructed.
  • Fan Test: Check the log for ERROR_FAN_LOST (fan lost). If so, troubleshoot the fan fault (see below).
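
The temperature figures in this section can be gathered into a small checklist. This is a sketch only: the three constants are the values quoted above, actual limits vary by model and firmware, and the function names are hypothetical.

```python
# Thresholds quoted in this section; actual limits vary by model and firmware.
CHIP_PROTECT_C = 85    # protection usually triggers at or above this
INLET_LIMIT_C = 35     # recommended upper limit for inlet air
RESTART_BELOW_C = 40   # cool down below this before operating again


def high_temp_findings(chip_c: float, inlet_c: float) -> list[str]:
    """List which of the high-temperature conditions above apply."""
    findings = []
    if chip_c >= CHIP_PROTECT_C:
        findings.append("chip temperature at or above protection threshold")
    if inlet_c > INLET_LIMIT_C:
        findings.append("inlet air above recommended limit")
    return findings


def safe_to_restart(chip_c: float) -> bool:
    """True once the machine has cooled below the restart temperature."""
    return chip_c < RESTART_BELOW_C


print(high_temp_findings(chip_c=88, inlet_c=37))
print(safe_to_restart(45))
```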

2. ERROR_TEMP_TOO_LOW (Low Temperature Protection)

  • Code Meaning: The ambient temperature is below the startup threshold (usually ≤ -20°C), and the mining machine cannot start normally. This often occurs in winter in high-latitude machine rooms.
  • Typical Log: Sweep error string = P:2. ERROR_TEMP_TOO_LOW: temp too low!
  • Core Cause: Low temperature causes abnormal conductivity in the chip circuit, and forced startup can easily cause a short circuit.
  • Troubleshooting Steps:
    1. Do Not Start: Do not attempt to start the machine below -20°C.
    2. Ambient Heating: Raise the machine room temperature to above 0°C using a heater or air conditioner.
    3. Preheat the Miner: Apply localized heat to the machine to ensure that core components reach the specified temperature before starting.
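
The low-temperature steps above amount to a simple startup gate. The sketch below assumes the two figures quoted in this section (-20°C and 0°C). Treating the range between them as "keep heating" is an interpretation of step 2, not a documented firmware rule, and the function name is hypothetical.

```python
NO_START_AT_OR_BELOW_C = -20  # startup threshold quoted above
HEAT_UNTIL_ABOVE_C = 0        # room temperature target from step 2


def startup_advice(ambient_c: float) -> str:
    """Map an ambient temperature reading to the advice in this section."""
    if ambient_c <= NO_START_AT_OR_BELOW_C:
        return "do not start"
    if ambient_c <= HEAT_UNTIL_ABOVE_C:
        # Assumption: between the two thresholds, continue heating first.
        return "keep heating"
    return "ok to start"


for reading in (-25, -5, 10):
    print(reading, startup_advice(reading))
```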

3. Hash Board and Chip Error Codes: The Main Cause of Hashrate Loss

Hash boards and ASIC chips are the core computing units of mining machines. Related errors directly lead to hash power drops or shutdowns, and are particularly common in high-end models like the S19 and KS5.

1. Chip Missing Errors (0 Chips/Low Chip)

  • Code Meaning: The mining machine did not detect the designed number of ASIC chips. This error is categorized as “0 chips for the entire machine” or “Low chips per board.”
  • Typical logs:
    1. 0 chips for the entire machine: Chain 0 only found 0 ASICs, will power off hash board 0;
    2. Low chips per board: Chain 0 only found 6 ASICs, will power off hash board 0, or: Chain 1, ASIC 54, nonce 455 < 85% avg 541.
  • Core causes:
    1. Hash board short circuit or loose signal cable;
    2. Unstable power supply (voltage fluctuations causing chip activation failure);
    3. ASIC chip damage or signal transmission interruption (e.g., abnormal RO/RX signals).
  • Troubleshooting steps:
| Fault Type | Step 1 (Basic Troubleshooting) | Step 2 (Advanced Testing) | Step 3 (Professional Handling) |
| --- | --- | --- | --- |
| 0 Chips in the Entire Machine | Power off and check if the hashboard is short-circuited (use a multimeter to test for continuity). | If no short circuit occurs, restart the mining machine. If a short circuit occurs, return the machine for repair. | Return to the factory to replace the faulty hashboard. |
| Missing Chips on a Single Board | Reseat the hashboard cable and replace it for testing. | Replace the power supply and check for proper grounding. | Use a test fixture to check the chip signal voltage and replace any damaged chips. |
| Hashboard Missing | Check if the cables connecting the hashboard to the control board are securely plugged in. | Replace the hashboards in a cross-connection to rule out any issues with the slots. | Return to the factory to inspect the control board signal interface. |
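
The chip-related log lines quoted above follow two recognizable patterns, so they can be classified with regular expressions. This is a rough sketch based only on the three sample lines in this section; real firmware may word these messages differently, and the function name and returned dictionary keys are hypothetical.

```python
import re
from typing import Optional

# Patterns derived from the sample log lines in this section.
CHAIN_RE = re.compile(r"Chain (\d+) only found (\d+) ASICs")
NONCE_RE = re.compile(r"Chain (\d+), ASIC (\d+), nonce (\d+) < (\d+)% avg (\d+)")


def classify_chip_line(line: str) -> Optional[dict]:
    """Classify a chip-related log line, or return None if it does not match."""
    m = CHAIN_RE.search(line)
    if m:
        chain, found = int(m.group(1)), int(m.group(2))
        kind = "0 chips" if found == 0 else "low chips"
        return {"kind": kind, "chain": chain, "found": found}
    m = NONCE_RE.search(line)
    if m:
        chain, asic, nonce, pct, avg = map(int, m.groups())
        # The chip is flagged because its nonce count is below pct% of the average.
        return {"kind": "low nonce", "chain": chain, "asic": asic,
                "nonce": nonce, "cutoff": avg * pct / 100}
    return None


print(classify_chip_line("Chain 0 only found 6 ASICs, will power off hash board 0"))
print(classify_chip_line("Chain 1, ASIC 54, nonce 455 < 85% avg 541"))
```

For the last sample line, 85% of the 541 average is 459.85, so a nonce count of 455 falls just below the cutoff and ASIC 54 on chain 1 is flagged.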

2. HAS_BOARDS_INCOMPLETE (Hashboard Incomplete)

  • Meaning: The miner detected fewer hashboards than the designed value (e.g., an S19 will report an error if one hashboard is missing). Some new firmware will force a shutdown.
  • Typical log: has boards incomplete… shutting down…
  • Core cause: Poorly connected or damaged hashboard cables. Some firmware versions do not allow running with missing boards.
  • Troubleshooting steps:
    1. After powering off, reseat the hashboard cables and make sure the connectors are free of oxidation.
    2. Replace the cables with new ones to rule out cable aging.
    3. If temporary operation is required, try flashing an older firmware version (compatibility must be confirmed), but the faulty board must be repaired as soon as possible.

4. Power Supply Error Codes: The Underlying Guarantee of Stable Operation

Power supply anomalies are the primary cause of cascading failures, and error messages differ between air-cooled and liquid-cooled models.

1. ERROR_POWER_LOST (Power Loss/Abnormal)

  • Code Meaning: Power voltage fluctuation, interruption, or abnormal status. Log descriptions differ between air-cooled and liquid-cooled models.
  • Typical logs:
    ◦ Air-cooled models: ERROR_POWER_LOST: Power voltage rise or drop, pls check!
    ◦ Liquid-cooled models: Chain avg vol dropped from 1990 to 2.45
  • Core Causes:
    ◦ Loose power connector or damaged power cord;
    ◦ Poor contact due to loose screws on the power supply copper bar;
    ◦ A fault in the power supply itself (such as a power protection trigger in the APW9/APW12 series).
  • Troubleshooting steps:
    1. Check the power control cable and copper bar screws to ensure they are securely fastened.
    2. Use a multimeter to test the power supply output voltage and compare it to the miner’s specifications (e.g., the S19 requires a stable 12V output).
    3. Swap in a backup power supply of the same model and test again. If the fault is confirmed, repair or replace the power supply.
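
The liquid-cooled log line and the multimeter check in step 2 can both be expressed in a few lines. In the sketch below, the regular expression is derived from the single sample line quoted above, the 12 V nominal value is the S19 figure from step 2, and the 5% tolerance is an illustrative assumption rather than a Bitmain specification. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the liquid-cooled sample line in this section.
VOL_DROP_RE = re.compile(
    r"Chain avg vol dropped from (\d+(?:\.\d+)?) to (\d+(?:\.\d+)?)"
)


def parse_vol_drop(line: str) -> Optional[Tuple[float, float]]:
    """Return (before, after) from a voltage-drop log line, or None."""
    m = VOL_DROP_RE.search(line)
    if not m:
        return None
    return float(m.group(1)), float(m.group(2))


def within_tolerance(measured_v: float, nominal_v: float = 12.0,
                     tolerance: float = 0.05) -> bool:
    """Compare a multimeter reading to the nominal output.

    The 5% default tolerance is an assumption for illustration only.
    """
    return abs(measured_v - nominal_v) <= nominal_v * tolerance


print(parse_vol_drop("Chain avg vol dropped from 1990 to 2.45"))
print(within_tolerance(12.3), within_tolerance(10.9))
```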

5. Fan and Cooling System Error Codes: The Key to Temperature Balance

Fan failure directly triggers high-temperature protection, and fans are among the most failure-prone components in a mining machine.

ERROR_FAN_LOST (Fan Lost/Abnormal Speed)

  • Code Meaning: The fan is not running or the speed is below the threshold (usually <1500 rpm). This is common in multi-fan models (such as the S19’s four-fan system).
  • Typical log: ERROR_FAN_LOST: fan 1 speed 0 rpm.
  • Core Cause: Loose fan cable, damaged motor, or faulty control board interface.
  • Troubleshooting Steps:
    1. Check that the fan cable is securely plugged in, is not broken, and has no oxidized connectors.
    2. Swap in a fan with the same specifications and test. If normal operation is restored, the original fan is faulty.
    3. If the error persists after replacing the fan, upgrade the firmware or replace the control board.
    4. For multi-fan models, perform a factory reset to eliminate firmware compatibility issues.
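
The fan log line quoted above carries both the fan number and its speed, so it can be parsed and compared against the threshold. This is a sketch assuming the log format of the single sample line in this section and the 1500 rpm figure given above. The function name and returned keys are hypothetical.

```python
import re
from typing import Optional

# Pattern derived from the sample log line in this section.
FAN_RE = re.compile(r"ERROR_FAN_LOST: fan (\d+) speed (\d+) rpm")
MIN_RPM = 1500  # threshold quoted above; may differ by model and firmware


def parse_fan_error(line: str) -> Optional[dict]:
    """Extract the fan number and speed from an ERROR_FAN_LOST line."""
    m = FAN_RE.search(line)
    if not m:
        return None
    fan, rpm = int(m.group(1)), int(m.group(2))
    return {
        "fan": fan,
        "rpm": rpm,
        "stopped": rpm == 0,          # not spinning: check cable or motor first
        "below_threshold": rpm < MIN_RPM,
    }


print(parse_fan_error("ERROR_FAN_LOST: fan 1 speed 0 rpm."))
```

A reading of 0 rpm points toward a disconnected cable or dead motor (steps 1 and 2), while a nonzero reading below the threshold suggests a worn fan or an obstructed airflow path.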

 
