doc:howto:generic.failsafe [2013/02/14 07:04]
buntalo
doc:howto:generic.failsafe [2016/04/24 18:49] (current)
stilez
 +====== OpenWrt Failsafe ======
 +OpenWrt SquashFS-Images have a built-in failsafe mode. Booting into failsafe mode bypasses all configuration located on the [[doc:​techref:​filesystems#​JFFS2]] partition (the writable '​overlay'​ filesystem),​ and instead uses a basic set of hard coded defaults located on the [[doc:​techref:​filesystems#​SquashFS|SquashFS]] partition (that is the read-only partition containing the router OS). 
  
 +Failsafe mode **can** be used to fix a router which cannot be accessed in the usual ways because of a problem with configuration such as locked out users, locked out network connections,​ broken startup scripts, broken packages or configurations,​ full JFFS2 storage (or other JFFS2 content). It normally **cannot** fix more fundamental problems such as '​[[doc:​howto:​generic.debrick|hard bricking]]'​ or issues with the hardware, kernel or squashFS images that prevent the router booting properly or making connections at the hardware level.
 +
 +Failsafe mode can be triggered using three special procedures while the router boots - waiting for a flashing LED and pressing a button, waiting (with a packet sniffer) for a special broadcast packet and pressing a button, or watching for a boot message (on the serial port) and pressing a key ("​f"​) on the serial keyboard. Usually watching for a flashing LED is easiest. Whichever trigger you use, the router enters failsafe mode and you can access the command line with telnet (always possible) or a serial keyboard. The procedures are described here, as well as useful tips once you get into failsafe mode.  ​
 +
 +Once failsafe mode is triggered, the router will boot with a network address of **''192.168.1.1/24''** on the **''eth0''** network interface, and with only essential services running. If your device has multiple network interfaces (eth0, eth1, ...), eth0 is usually the interface connected to the [[doc:hardware:switch]] (exceptions are rare). Using telnet or a serial connection you can mount the JFFS2 partition with the command ''**mount_root**'' and diagnose or fix the problems on the JFFS2 partition.
 +
 +For more information,​ [[doc:​techref:​flash.layout#​partitioning.of.squashfs-images|OpenWrt Flash Layout]] explains why OpenWrt failsafe is possible, and [[doc/​techref/​process.boot|Boot Process]] explains how it works (basically OpenWrt contains an additional boot up stage, called preinit).
 +
 +**→ [[doc:​howto:​generic.debrick]]**
 +
 +===== Prerequisites =====
 +  * <color red>Your device must have **at least one configurable hardware button to use flashing LED or broadcast packet triggering**</color>. Any configurable button will work (except the main power on/off!), even if it is labelled as a "reset" button, a "wifi on/off" button or something else. If your router has any button that you can physically press and release, it's likely to be configurable. Check if there's specific info about failsafe mode for your [[toh:|device]] and make sure everything still works as expected every time you update!
 +  * Your device must have an OpenWrt firmware with a [[doc:techref:filesystems#SquashFS|SquashFS-Image]] partition flashed to it. Failsafe cannot be implemented on a system based on JFFS2-Images.
 +  * The hardware ports you will use for commands when you are in failsafe mode must work (either a valid ethernet port or valid serial connection)
 +  * You need a device (PC, notebook, etc.) connected to the router by network or serial cable. For a serial connection, you must be able to send keystrokes to the router.
 +  * The boot process must be able to succeed. Failsafe mode can be used to fix any problem in the [[doc:​techref:​filesystems#​JFFS2]] partition (because it doesn'​t need this to enter failsafe mode), but it needs the kernel partition and the [[doc:​techref:​filesystems#​SquashFS|SquashFS]] partition to be able to support the boot process, so that...
 +    * ...the boot process is able to get as far as required to register the pressing of the button
 +    * ...the minimal required binaries and the firmware'​s default configuration files are available and the boot process can successfully enter failsafe mode (these are all on the SquashFS)
 +
 +
 +===== How to trigger failsafe mode =====
 +
 +You can trigger failsafe mode in three ways:
 +  - Watching the router LEDs for flashing during boot, and pressing any hardware button when seen **(standard and often easiest)** ​
 +  - Using a packet sniffer to watch for a special broadcast packet during boot, and pressing any hardware button when seen
 +  - Using a serial connection, watching for a message during boot, then pressing the "​f"​ key on your serial keyboard
 +
 +===== Triggering by pressing any hardware button during boot =====
 +
 +==== Stage 1: Router and Computer preparation ====
 +
 +  * Power off the router
 +  * Unplug the WAN port and, if needed, any other ports. If two ports would end up with the same IP address (for example the WAN and LAN addresses, or two LAN ports), the address collision prevents you from entering failsafe mode until the colliding port(s) are unplugged.
 +  * Set your computer'​s IP to ''​192.168.1.2'',​ subnet mask ''​255.255.255.0''​. The router will be reached at ''​192.168.1.1''​ when failsafe mode is running. (You may also use any other IP in the range ''​192.168.1.2-254''​.)
 +  * Connect the computer to the router. You may need to check which port to use, especially if you plan to watch for a broadcast packet to trigger failsafe (see below)
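
On a Linux computer, the static address from Stage 1 could be set roughly as follows. This is a minimal sketch, not an official procedure: the function name, the ''DRY_RUN'' switch and the interface name ''eth0'' are assumptions for illustration, and the ''ip'' commands need root privileges.

```shell
# Sketch: give a Linux PC a static address on the failsafe subnet.
# failsafe_pc_setup and DRY_RUN are hypothetical helpers, not part of OpenWrt.
failsafe_pc_setup() {
    iface="${1:-eth0}"          # wired interface facing the router (assumed name)
    addr="${2:-192.168.1.2}"    # any free address in 192.168.1.2-254
    run="${DRY_RUN:+echo}"      # DRY_RUN=1 prints the commands instead of running them
    $run ip addr add "$addr/24" dev "$iface"
    $run ip link set "$iface" up
}
```

Setting ''DRY_RUN=1'' before calling ''failsafe_pc_setup enp3s0'' prints the two commands without changing anything, which is a cheap way to check the interface name first.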
 +
 +==== Stage 2: Enter failsafe mode ====
 +
 +  * To detect when failsafe mode can be triggered, there are two options: ​
 +    * Look for a bootup LED blink pattern. On many routers this is the easiest and most convenient option.
 +    * Use a packet sniffer on any computer to listen for a special broadcast packet on UDP port 4919 during boot, then press the front button on the router when this packet is seen. 
 +
 +As soon as the LED blink pattern or the network broadcast message is seen, press the device button. If your device has multiple buttons, any button should work: OpenWrt is configured so that pressing any button during preinit triggers booting into failsafe mode. If one button does not work, try another. It can also help to press the button repeatedly until the blink speeds up, the "success" broadcast packet arrives, or other evidence of successfully triggering failsafe mode is seen.
 +
 +=== Stage 2 option 1: Entering failsafe mode using a blinking LED on the router ===
 +
 +On many routers, OpenWrt will start to blink a "​SYS"​ LED (may be "​Power",​ may be other) on the front of the router when it is in its early boot cycle. Since [[https://​dev.openwrt.org/​changeset/​44056|r44056]] there are three different LED blinking speeds for most of the routers (in trunk and CC15.05):
 +  * first a moderate 0.1 second blinking rhythm during the two seconds in which the router waits for the user to trigger failsafe mode
 +  * then either
 +    * a slow 0.2 second blink if the failsafe was not triggered and the normal boot continues
 +    * a rapid 0.05 second blink if the user pressed a button and failsafe mode was triggered
 +
 +Steps:
 +  * Power on the router.
 +  * As soon as this blink pattern is seen, press any hardware button of the router.  At least [[toh:tp-link:tl-wdr4300#failsafe.mode|one TP-Link router]] seems to respond better to repeatedly clicking the button //before// the SYS LED starts to blink, until the SYS LED lights with the rapid-flash pattern.
 +  * The LED will change to faster blink pattern, indicating the router is now in failsafe mode.
 +
 +Some routers only have one hardware button, the reset button, which is often on the back of the unit (often labeled "Reset" or "WPS/Reset"). It may be a visible (external) button, or it may be recessed behind a hole. If it is in a hole, you need a paper clip or similar tool to operate it. Please do not use a nail to press the button in the hole!
 +
 +=== Stage 2 option 2: Entering failsafe using the broadcast packet ===
 +
 +The exact steps will depend on the device you are using to watch for the broadcast packet. Details are given below for Linux and Windows. ​ Most *nix/BSD/OS X/​Android/​Mac should be very similar to Linux (often identical). ​ For many other devices and systems the same steps should be possible (but details not provided).
 +
 +You will need to be sure the router is connected to the device/PC, the cable is working, the device'​s firewall will not block the packet, and that network LEDs or other diagnostics you may have, show the connection is working. You may also need to temporarily disable the firewall on your device or open a port on it - take care and secure it again after!
 +
 +Steps:
 +
 +  * You will need some packet sniffing/​packet capture software: ​
 +    * ''**Linux (also most *nix/BSD/OS X/Android/Mac):**'' Software is often built in or very easy to download: the GUI ''**wireshark**'', the console ''**tcpdump**'' or ''**cshark**'', or another sniffer. These are all very common open source tools, free and available on most platforms, so if you do not have one you should be able to download it for your device in the usual way (or use any other packet sniffer you like).
 +    * ''​**Windows:​**''​ You can use the [[http://​downloads.openwrt.org/​people/​florian/​recvudp/​recvudp-win32.zip|recvudp.exe]] utility software, or any other packet sniffer. There are also Windows versions of some of the above software as well.
 +  * Start watching for packets. ​ The exact commands or menu options are different for each sniffer, so you need to find how to do this on your software. The packet will be sent to ''​**destination address 192.168.1.255 port UDP 4919**''​. So for example, in a terminal and using tcpdump, with the router connected to port eth0, you would enter the command ''​**tcpdump -Ani eth0 port 4919 and udp**''​
 +  * Turn the router power off, wait a few seconds, and then back on. 
 +  * Look in the sniffer for the early boot network broadcast packet to be shown (Could take up to 30-40 seconds on some routers). ​ The packet will also show the message that it is waiting for your click on a button. The screenshots below show what this can look like.
 +  * When you see the packet/​message,​ press any configurable hardware key on the router. If needed press several times. Often you will know you have succeeded because the router will send a second broadcast "​success"​ packet when failsafe mode is triggered, and you will see this in the packet sniffer as well (also shown in the screenshots below). ​ But not all versions do this.
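
The capture step above can be wrapped in a small helper. This is only a sketch around the ''tcpdump'' command already quoted in the steps; the function name, the ''DRY_RUN'' switch and the default interface ''eth0'' are assumptions, and capturing normally requires root.

```shell
# Sketch: wait for the failsafe broadcast on UDP port 4919.
# watch_failsafe_packet and DRY_RUN are hypothetical names for illustration.
watch_failsafe_packet() {
    iface="${1:-eth0}"   # capture interface connected to the router (assumed name)
    # -A prints the payload as ASCII, -n skips name resolution, -i picks the interface
    ${DRY_RUN:+echo} tcpdump -Ani "$iface" port 4919 and udp
}
```

Start the helper first, then power-cycle the router; the capture keeps running until it is stopped with ctrl-C.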
 +
 +<​html>​
 +<table class="​inline"​ style="​width:​70%;​ margin-left:​15%">​
 +  <tr>
 +    <td style="​border-left:​6px solid #f57900; vertical-align:​middle">​
 +      <img src="/​_media/​meta/​icons/​tango/​48px-emblem-question.svg.png"​ alt=""​ style="​float:​left;​ margin-right:​0.5em"​ />
 +      <​strong>​Unverified Information!</​strong><​br />
 +      As of Jan 11, 2013, this page did not specify which port to listen on. On the TL-WR1043ND, it is the WAN port. If you find a contradictory example, please remove or adapt this note.
 +    </td>
 +  </tr>
 +</​table>​
 +</​html>​
 +
 +== Screenshots of typical packet sniffer using broadcast packet method ==
 +**Broadcast packet and success packet under Linux (broadcast packet is the first part only!):**
 +
 +Run ''​wireshark'',​ ''​cshark''​ or ''​tcpdump''​\\
 +{{:​doc:​howto:​linux-failsafe.png|}}
 +
 +**Broadcast packet and success packet under Windows (broadcast packet is the first line only!):**
 +
 +Monitoring the special packet in a program ''​recvudp.exe''​.\\
 +{{:​media:​failsafe2.png|}}
 +
 +==== Stage 3: Log into the router using failsafe ====
 +
 +Indications of failsafe mode:
 +  * Once in failsafe mode, a network broadcast confirmation message may appear (not always; the TL-WR1043ND, for example, sends none).
 +  * On some router models (e.g. TP-LINK models), the SYS LED blinks very quickly.
 +
 +If you are using a **trunk** snapshot, revision 46809 or newer, **ssh** to 192.168.1.1 from the computer and log in as root (no password required). The host key will be randomly generated. You can pass ''​-o "​UserKnownHostsFile /​dev/​null"​ -o "​StrictHostKeyChecking no"''​ to ssh if you want to allow a different host key temporarily.
 +
 +If you are using a release image, **telnet** (//not// SSH) to 192.168.1.1 from the computer. There is no username or password required.
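
The two login variants can be summarised in one helper. This is a sketch under stated assumptions: the function name and the ''DRY_RUN'' switch are made up for illustration, and the ssh options are the ones quoted above, written in the equivalent ''key=value'' form.

```shell
# Sketch: log into a router that is in failsafe mode.
# failsafe_login and DRY_RUN are hypothetical names for illustration.
failsafe_login() {
    mode="${1:-telnet}"        # "ssh" for trunk r46809 or newer, "telnet" for release images
    host="${2:-192.168.1.1}"   # failsafe address of the router
    case "$mode" in
        ssh)
            # The host key is random in failsafe mode, so do not record or check it
            ${DRY_RUN:+echo} ssh -o UserKnownHostsFile=/dev/null \
                -o StrictHostKeyChecking=no "root@$host"
            ;;
        telnet)
            ${DRY_RUN:+echo} telnet "$host"
            ;;
        *)
            echo "usage: failsafe_login ssh|telnet [host]" >&2
            return 2
            ;;
    esac
}
```

Skipping host key checks is only reasonable here because the PC is cabled directly to the router on an otherwise isolated link.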
 +
 +Now go to section [[#When you are in failsafe mode]]
 +
 +===== Remarks =====
 +
 +  * If the router does not boot into failsafe mode despite clicking the button, it may be a timing problem: the brief window in which OpenWrt looks for a button press was missed. If so, immediately after turning the router on, click the button rapidly and keep clicking for about 60 seconds so as not to miss the window.
 +  * If your router has a very long boot time (such as [[toh/d-link/dir-300#with.manual.step.by.step.guide|DIR-300 A]]), you may need to keep clicking for longer.
 +
 +===== Serial connection triggering by keyboard key combination in a serial console =====
 +  - Unplug the router'​s power cord.
 +  - Connect the router'​s WAN port directly to your PC.
 +  - Configure your PC with a static IP address between 192.168.1.2 and 192.168.1.254, e.g. 192.168.1.2 (a gateway and DNS are not required).
 +  - Plug in the power.
 +  - Connect via serial.
 +  - Wait until the following message appears: Press the [f] key and hit [enter] to enter failsafe mode
 +  - Press "​f"​ and the "​enter"​ key
 +  - You should be able to **telnet** (//not// SSH) to the router at 192.168.1.1 now (no username and password)
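
For the "connect via serial" step, a Linux PC with a USB serial adapter might open the console with ''screen''. This is a sketch only: the device path ''/dev/ttyUSB0'' and the speed ''115200'' are assumptions that vary per adapter and router (check the device page), and the function name and ''DRY_RUN'' switch are hypothetical.

```shell
# Sketch: open the router's serial console from a Linux PC.
# The device path and baud rate are assumptions; check your device's documentation.
serial_console() {
    dev="${1:-/dev/ttyUSB0}"   # serial adapter device (assumed)
    baud="${2:-115200}"        # line speed (assumed, differs between routers)
    ${DRY_RUN:+echo} screen "$dev" "$baud"
}
```

With the console open before power is applied, the boot messages scroll past and the prompt for the "f" key can be answered directly in the same window.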
 +
 +===== When you are in failsafe mode =====
 +
 +==== Login message ====
 +You will see a message the same as or similar to this (using OpenWrt 12.09):
 +
 +|<​code>​
 + === IMPORTANT ============================
 +  Use '​passwd'​ to set your login password
 +  this will disable telnet and enable SSH
 + ​------------------------------------------
 +
 +
 +BusyBox v1.19.4 (2013-03-14 11:28:31 UTC) built-in shell (ash)
 +Enter '​help'​ for a list of built-in commands.
 +
 +  _______ ​                    ​________ ​       __
 + ​| ​      ​|.-----.-----.-----.| ​ |  |  |.----.| ​ |_
 + ​| ​  ​- ​  ​|| ​ _  |  -__|     ​|| ​ |  |  ||   ​_|| ​  _|
 + ​|_______|| ​  ​__|_____|__|__||________||__| ​ |____|
 +          |__| W I R E L E S S   F R E E D O M
 + ​-----------------------------------------------------
 + ​ATTITUDE ADJUSTMENT (12.09, r36088)
 + ​-----------------------------------------------------
 +  * 1/4 oz Vodka      Pour all ingredients into mixing
 +  * 1/4 oz Gin        tin with ice, strain into glass.
 +  * 1/4 oz Amaretto
 +  * 1/4 oz Triple sec
 +  * 1/4 oz Peach schnapps
 +  * 1/4 oz Sour mix
 +  * 1 splash Cranberry juice
 + ​-----------------------------------------------------
 +root@(none):/#​
 +</​code>​|
 +
 +Additional note ([[https://​dev.openwrt.org/​changeset/​42985/​trunk|r42985]]):​
 +
 +|<​code>​
 +
 +================= FAILSAFE MODE active ================
 +special commands:
 +
 +    firstboot reset settings to factory defaults
 +    mount_root mount root-partition with config files 
 +
 +after mount_root:
 +
 +    passwd change root's password
 +    /etc/config directory with config files 
 +
 +for more help see:
 +​http://​wiki.openwrt.org/​doc/​howto/​generic.failsafe
 +=======================================================
 +</​code>​|
 +
 +
 +==== The file systems in failsafe mode ====
 +OpenWrt uses an overlay file system (JFFS2) which overlays the default router files on the [[doc:​techref:​flash.layout|SquashFS]] partition. Deleting a file from the JFFS2 effectively "​resets"​ the JFFS2 file version back to default, because the original file will be seen on the SquashFS (if it existed).
 +
 +The root file system in failsafe mode is only the SquashFS partition; the JFFS2 is not present. To access the JFFS2 read/write in failsafe mode you must mount it manually. Enter the command ''**[[https://dev.openwrt.org/browser/trunk/package/base-files/files/sbin/mount_root|mount_root]]**'' to do this.
 +
 +Once the JFFS2 file system is mounted read/write, you can view/​edit/​delete/​fix the files which are changed from the default firmware. ​ Any files that are changed will be accessible at **''/​overlay/​*''​** (or **''/​overlay/​upper/​*''​** on some routers). ​
 +
 +The core config files are usually at **''/​overlay/​etc/​config/​*''​** (or **''/​overlay/​upper/​etc/​config/​*''​**) and have names such as "​network",​ "​firewall"​ etc. Other copies may exist in the /rom subdirectory and the router'​s UI code may exist in subdirectories such as /lua
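
Because the config directory sits at one of two paths depending on the router, a small helper can report which one exists after ''mount_root'' has been run. This is a sketch: the function name and the ''OVERLAY_ROOT'' override (useful for trying it off-device) are assumptions, not OpenWrt features.

```shell
# Sketch: find where the changed config files live after mount_root.
# overlay_config_dir and OVERLAY_ROOT are hypothetical names for illustration.
overlay_config_dir() {
    root="${OVERLAY_ROOT:-/overlay}"
    # Check the "upper" layout first, then the plain layout
    for dir in "$root/upper/etc/config" "$root/etc/config"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    echo "no config directory under $root (was mount_root run?)" >&2
    return 1
}
```

On the router, something like ''ls "$(overlay_config_dir)"'' would then list the config files that differ from the firmware defaults.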
 +
 +==== Useful commands and procedures ====
 +
 +  * In case you forgot your password, you need to set a new one. Type: ''​**passwd**''​
 +  * In case you forgot the router's IP address, get it with ''**uci get network.lan.ipaddr**''
 +  * The ''​**uci**''​ command reads and writes the router'​s main configuration files, and is also the main command line tool for modifying the configuration. So it has a lot of helpful commands for troubleshooting and fixing config-related problems. See [[doc:​uci|The UCI System]].
 +  * If the problem is full JFFS2 file system (too many packages etc) then delete some or wipe the JFFS2, see below
 +  * If you are not very familiar with Linux, many commands have a ''**--help**'' option (for example: ''grep --help'') which can suggest the options you need. Often you only need basic commands to get started, such as ''**mv**'' (move/rename), ''**cp**'' (copy), ''**rm**'' (remove/delete), ''**find . -name "*XYZ*"**'' (find all files below the current dir with XYZ in the name; the quotes stop the shell from expanding the pattern itself), ''**cd**'' and ''**ls**'' (change and list current directory), ''**cat**'' (view file), ''**less**'' (view file with page up/down, use "q" to finish), ''**grep**'' (show matching lines/text only), and so on.  If a command "hangs" or takes too long, ''**ctrl-C**'' will often kill it and return to a command line.
 +  * If you are done with failsafe mode use ''​**reboot**''​ to reboot.
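
As an illustration of the ''uci'' items above, the following sketch reads the configured LAN address and optionally sets a new one. The function name is hypothetical; ''uci get'', ''uci set'' and ''uci commit'' are the standard subcommands, and ''mount_root'' should be run first so that the change lands on the JFFS2 partition.

```shell
# Sketch: show the LAN address, or set a new one when an argument is given.
# show_or_set_lan_ip is a hypothetical helper; run mount_root beforehand.
show_or_set_lan_ip() {
    if [ -n "$1" ]; then
        uci set network.lan.ipaddr="$1" &&   # stage the new address
        uci commit network                   # write it to the network config file
    fi
    uci get network.lan.ipaddr               # print the configured address
}
```

Calling it without an argument only reads the value; the new address takes effect after the next ''reboot''.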
 +
 +==== Changing or resetting some config by editing files ====
 +Run the command ''​**mount_root**''​ and then edit or delete such files as you need. To reset all of the JFFS2 (OpenWrt version of "​factory reset"​) see the next section.
 +
 +The core config files are usually at **''/overlay/etc/config/*''** (or **''/overlay/upper/etc/config/*''**) and have names such as "network", "firewall" etc., which you can search for using the **''find -name''** command (see above). If you know your error is (say) some network switch or VLAN issue, then you can edit or delete the network config file and reboot. The router will keep all other settings; an edited file keeps your edits, and a deleted file goes back to its default.
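
Resetting a single config file could look like the sketch below, which copies the overlay version aside before removing it. The function name, the ''OVERLAY_ROOT'' and ''BACKUP_DIR'' overrides are assumptions for illustration; note that ''/tmp'' lives in RAM on the router, so the backup is lost on reboot unless it is copied elsewhere.

```shell
# Sketch: back up and remove one overlay config file so the default shows through.
# reset_config_file, OVERLAY_ROOT and BACKUP_DIR are hypothetical names.
reset_config_file() {
    name="$1"                           # e.g. network or firewall
    root="${OVERLAY_ROOT:-/overlay}"
    backup="${BACKUP_DIR:-/tmp}"        # RAM on the router: gone after reboot
    for file in "$root/upper/etc/config/$name" "$root/etc/config/$name"; do
        if [ -f "$file" ]; then
            cp "$file" "$backup/$name.bak" && rm "$file" || return 1
            echo "removed $file (backup in $backup/$name.bak)"
            return 0
        fi
    done
    echo "no overlay copy of $name found" >&2
    return 1
}
```

For example, ''reset_config_file network'' followed by ''reboot'' would bring the network settings back to the firmware defaults while leaving the other config files alone.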
 +
 +==== Wiping JFFS2 file system ('​Factory reset' to default config) ====
 +
 +In case you filled up the entire JFFS2 file system by installing too big/too many packages, clean the entire JFFS2 partition. All settings will be reset and all installed packages are removed. (OpenWrt equivalent of a factory reset). Run ''​**mount_root**''​ first (see above) to mount the JFFS2 partition. Once the JFFS2 partition is mounted for read/write, use any of these commands to erase the files on it, which resets the router:  ​
 +  * ''​**firstboot**''​
 +  * ''​**mtd -r erase rootfs_data**''​ (this will reboot the device as part of the process)
 +  * **''​rm -r /overlay/* ''​** (or /​overlay/​upper/​* on some routers)
 +
 +NOTE: there is a [[https://dev.openwrt.org/ticket/20168|bug report]] that ''firstboot'' or ''mtd -r erase rootfs_data'' may sometimes not work and "hang". If that happens, the files can be deleted using the "rm..." method. The overlay is "on top" of the SquashFS, so deleting overlay files just leaves the original SquashFS files showing.
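
The "rm" fallback can be sketched as a helper that empties the overlay while keeping the mount point itself. This is an illustration only: the function name and the ''OVERLAY_ROOT'' override are assumptions, and unlike a plain ''rm -r /overlay/*'' the loop also removes hidden (dot) entries. It deletes all settings and installed packages, so use it only when a full reset is intended.

```shell
# Sketch of the "rm" fallback: empty the overlay but keep the directory itself.
# wipe_overlay and OVERLAY_ROOT are hypothetical names for illustration.
wipe_overlay() {
    root="${OVERLAY_ROOT:-/overlay}"
    if [ ! -d "$root" ]; then
        echo "$root not found (was mount_root run?)" >&2
        return 1
    fi
    # Visit regular entries and dot entries (but never "." or "..")
    for entry in "$root"/* "$root"/.[!.]*; do
        # An unmatched pattern stays literal; skip it
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        rm -rf "$entry"
    done
    return 0
}
```

After it finishes, ''reboot'' starts the router with the default configuration from the SquashFS.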
 +
 +===== Flash new firmware in failsafe mode =====
 +From a Windows desktop:
 +
 +1) First download netcat for Windows. I used the following site:
 +https://joncraton.org/blog/46/netcat-for-windows/
 +
 +2) Copy the firmware file (e.g. "flash.bin") to the directory where netcat is, open a command prompt window, and go to that directory.
 +
 +Then run this command:
 +<​code>​ nc -l -p 3333 < flash.bin</​code>​
 +
 +From a Linux desktop to the router:
 +
 +On the Linux machine:
 +<​code>​cat yourfirmware.bin | pv -b | nc -l -p 3333</​code>​
 +  * ''pv'' shows progress; ''nc'' (netcat) listens on port 3333 and sends the firmware to whoever connects
 +
 +On the router, via telnet (replace 192.168.1.2 with the IP address of the Linux machine):
 +<code>nc 192.168.1.2 3333 > /tmp/yourfirmware.bin</code>
 +
 +Install the firmware, keeping the current settings:
 +<​code>​sysupgrade /​tmp/​yourfirmware.bin</​code>​
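
Since netcat does not verify the transfer, it may be worth comparing checksums before running ''sysupgrade''. The sketch below is a suggestion, not part of the original procedure: the function name is hypothetical, and it assumes ''sha256sum'' is available on both machines (''md5sum'' could be substituted if it is not).

```shell
# Sketch: compare a file's SHA-256 with the value computed on the sending PC.
# verify_firmware is a hypothetical helper; sha256sum availability is assumed.
verify_firmware() {
    file="$1"       # e.g. /tmp/yourfirmware.bin on the router
    expected="$2"   # hash printed by sha256sum on the PC
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    if [ "$actual" = "$expected" ]; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH: got $actual" >&2
        return 1
    fi
}
```

Run ''sha256sum yourfirmware.bin'' on the PC, pass the resulting hash as the second argument on the router, and only flash when the helper reports a match.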
 +
 +
 +===== Notes =====
 +  * the article [[doc:​techref:​process.boot]] may help you better understand when ''​failsafe''​ "kicks in" once activated