NAS542 stuck in a boot loop
footstep
Posts: 6
Our little daughter fell in love with pressing the power button on the NAS542. I don't know how many times she did this, but since then the NAS no longer boots properly. After some time I hear the hard drives spin up, and then they suddenly stop. This repeats indefinitely until I cut the power.
After contacting Zyxel Support, I received the files to create a recovery stick. Unfortunately this didn't work, so I created another stick to gain telnet access to the NAS. Based on my investigation, the script is failing on the following command:
CURR_BOOTFROM=`${FW_PATH}/sbin/info_printenv curr_bootfrom | awk -F"=" '{print $2}'`
When running this command manually, I receive the following error:
/firmware/sbin/info_printenv curr_bootfrom
envfs: wrong magic on /dev/mtd2
Zyxel Support told me to send the NAS in for repair, but I'm wondering whether there's another way to fix this issue. Does anyone have a clue?
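Why the boot script trips over this can be seen by running the same awk extraction on sample text. This is only a sketch: the function name `value_of_line` is made up for illustration, and the "healthy" sample line assumes that info_printenv prints `name=value`, which the `awk -F"="` call in the script implies but which is not confirmed here.

```shell
#!/bin/sh
# value_of_line: print the part after "=" of each input line, using the
# same awk call as the boot script (the function name is made up here).
value_of_line() {
    awk -F"=" '{print $2}'
}

# What a healthy box would presumably print (sample line, an assumption).
good='curr_bootfrom=2'
# What this box prints instead.
bad='envfs: wrong magic on /dev/mtd2'

for line in "$good" "$bad"; do
    CURR_BOOTFROM=$(printf '%s\n' "$line" | value_of_line)
    if [ -n "$CURR_BOOTFROM" ]; then
        echo "CURR_BOOTFROM=$CURR_BOOTFROM"
    else
        echo "CURR_BOOTFROM is empty for input: $line"
    fi
done
```

The error line contains no "=", so `$2` is empty. Whether the tool writes the error to stdout or stderr, the variable ends up empty either way, and everything in the script that depends on CURR_BOOTFROM gets an empty value.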
0
Accepted Solution
-
I had to erase the flash first using: /sbin/flash_erase /dev/mtd2 0 0. Then I rewrote it using the file you provided, and now /firmware/sbin/info_printenv finally returns readable output.
I'm now looking into updating the firmware to correct the checksums in the env partition.
Thank you very much for your help.
0
All Replies
-
Hm. I think I know what went wrong. The NAS has a mechanism to protect against bad flashes, which is called 'double flash'. All flash partitions which are updated by a firmware flash exist in two copies (halves). You flash the half which is not 'current', and when that succeeds the 'curr_bootfrom' variable is updated, so that the box boots the other half on the next boot. When the flashing fails, this flag is not updated.
But there is more. The boot script also detects whether the last boot failed; if so, it switches that variable and reboots, to fall back on the previous firmware and protect you against bad firmware.
Now I think your daughter has triggered the 'last boot failed' safety and cut the power while the 'curr_bootfrom' variable was being updated, corrupting the u-boot environment.
I *think* you can repair that magic by simply updating some variable, for instance that 'curr_bootfrom'. On my NAS540 the value is 2:
/firmware/sbin/info_setenv curr_bootfrom 2
It is possible that it will drop you into the previous firmware, of course. It is also possible that there is some severe damage in the u-boot environment, keeping it from booting.
The environment on my 540 is:
/firmware/sbin$ info_printenv
ip=dhcp
eth0.serverip=192.168.1.70
kernel_loc=nand
rootfs_loc=nand
uloaderimage=microloader-c2kevm.bin
bareboximage=barebox-c2kevm.bin
mfg_kernel_img=uImage_MFG
mfg_rootfs_img=rootfs_ubi.img_MFG
rootfs_type=ubifs
rootfsimage=root.$rootfs_type-128k
kernelimage_type=uimage
kernelimage=uImage
spi_parts=256k(uloader)ro,512k(barebox)ro,256k(env)
spi_device=spi0.0
nand_device=comcertonand
nand_parts=10M(config),10M(kernel1),110M(rootfs1),10M(kernel2),110M(rootfs2),-(reserved)
rootfs_mtdblock_nand=2
autoboot_timeout=3
usb3_internal_clk=yes
bootargs=console=ttyS0,115200n8, init=/etc/preinit pcie_gen1_only=yes
bootargs=$bootargs mac_addr=$eth0.ethaddr,$eth1.ethaddr,$eth2.ethaddr
next_bootfrom=2
curr_bootfrom=2
kernel_mtd_1=4
sysimg_mtd_1=5
kernel_mtd_2=6
sysimg_mtd_2=7
MODEL_ID=B103
fwversion_1=V5.04(AATB.0)
fwversion_2=V5.11(AATB.2)
revision_1=46843
revision_2=49397
modelid_1=B103
modelid_2=B103
core_checksum_1=32768bcdcd9677274d4af1c02f41dda6
core_checksum_2=0eaa12517d117ff7dd2f68502b7f961d
zld_checksum_1=dbdacfd6dd97dad4787d514f7cdaa496
zld_checksum_2=44485b00ede541d4f27db02f0da490f9
romfile_checksum_1=8D7D
romfile_checksum_2=28C8
img_checksum_1=2dbaf250ef4e9574d28a0340379f831a
img_checksum_2=83d14a443096a8284b07e3f3a91b1673
serial_number=S140Z45007917
ethaddr=5C:F4:AB:5C:58:FC
eth2addr=5C:F4:AB:5C:58:FD
change_boot_part=0
As you can see, it is not possible to know all values, as the environment contains md5sums of the installed firmware blobs. I don't know what happens if these don't match. (Well, I do know for img_checksum_X: on each boot the box will pull a fresh copy of the on-disk installed firmware from flash.) It also contains the MAC addresses. Not all variables are important, but at least 'nand_parts' and 'spi_parts' are. Without them the box can't boot.
Would ZyXEL repair this under warranty?
0 -
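The 'double flash' bookkeeping described in this post can be sketched in shell. The sample values are copied from the NAS540 environment listed above; the helper names `env_get` and `other_half` are made up for illustration and are not part of the firmware, and the real boot script may well do this differently.

```shell
#!/bin/sh
# Sample lines copied from the NAS540 environment dump in this thread.
ENV_DUMP='next_bootfrom=2
curr_bootfrom=2
kernel_mtd_1=4
sysimg_mtd_1=5
kernel_mtd_2=6
sysimg_mtd_2=7'

# env_get NAME: print the value of NAME from the sample dump (hypothetical helper).
env_get() {
    printf '%s\n' "$ENV_DUMP" | awk -F"=" -v name="$1" '$1 == name {print $2}'
}

# other_half N: print the half that is NOT N (hypothetical helper).
other_half() {
    case "$1" in
        1) echo 2 ;;
        2) echo 1 ;;
        *) echo "unexpected bootfrom value: '$1'" >&2; return 1 ;;
    esac
}

curr=$(env_get curr_bootfrom)
spare=$(other_half "$curr")

echo "running half: $curr (kernel mtd$(env_get "kernel_mtd_$curr"), system image mtd$(env_get "sysimg_mtd_$curr"))"
echo "flash target: $spare (kernel mtd$(env_get "kernel_mtd_$spare"), system image mtd$(env_get "sysimg_mtd_$spare"))"
```

With an unreadable environment `env_get curr_bootfrom` would return nothing, and `other_half` would reject the empty value, which mirrors why the box cannot decide which half to boot.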
Thanks for your answer. Based on your other helpful entries in this forum, I think I've reached the right person :-)
Unfortunately I cannot set any values, because the command gives me the same error:
/ $ /firmware/sbin/info_setenv cur_bootfrom 2
envfs: wrong magic on /dev/mtd2
Based on the recovery script, /dev/mtd2 is the barebox env partition. The script also includes a section to rewrite this partition, but support was unable to provide me with the required barebox_env file. I also don't know whether that would do more harm than good.
As I bought the NAS back in 2016, I don't think it's still under warranty.
0 -
At least I've found a workaround to boot the NAS (without any disks at the moment):
- Boot it with your universal_usb_key_func-2015-10-12 (network and telnet)
- Connect by telnet
- Change to root using su
- Use vi to change the following lines in /etc/init.d/rcS
#ubiattach -m ${IMG_MTD} -d ${IMG_MTD}
#mount -t ${NAND_FS_TYPE} -o ro ubi${IMG_MTD}:ubi_rootfs${CURR_BOOTFROM} ${NAND_PATH}
ubiattach -m 5 -d 5
mount -t ubifs -o ro ubi5:ubi_rootfs1 /firmware/mnt/nand
- Remove the USB stick
- Run /etc/init.d/rcS
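The manual vi edit in the steps above could probably be scripted. The following is a minimal sketch, not a tested recovery tool: `patch_rcs` is a hypothetical helper, the volume name is built from the script's own `ubi_rootfs${CURR_BOOTFROM}` template, and the demo runs on a scratch file that contains only the two lines quoted above rather than on the real /etc/init.d/rcS.

```shell
#!/bin/sh
# patch_rcs FILE MTD HALF: print FILE with the two variable-driven lines
# commented out and hard-coded replacements added (hypothetical helper).
patch_rcs() {
    awk -v mtd="$2" -v half="$3" '
        /^[[:space:]]*#/ { print; next }   # leave existing comments alone
        index($0, "ubiattach -m ${IMG_MTD}") {
            print "#" $0
            print "ubiattach -m " mtd " -d " mtd
            next
        }
        index($0, "ubi_rootfs${CURR_BOOTFROM}") {
            print "#" $0
            print "mount -t ubifs -o ro ubi" mtd ":ubi_rootfs" half " /firmware/mnt/nand"
            next
        }
        { print }
    ' "$1"
}

# Demo on a scratch file holding only the two lines from the post.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
ubiattach -m ${IMG_MTD} -d ${IMG_MTD}
mount -t ${NAND_FS_TYPE} -o ro ubi${IMG_MTD}:ubi_rootfs${CURR_BOOTFROM} ${NAND_PATH}
EOF
patch_rcs "$tmp" 5 1
rm -f "$tmp"
```

On the NAS the output would have to be written to a new file and inspected before it replaces rcS. The values 5 and 1 are the ones used in the workaround; a box whose working firmware sits in the other half would need different ones.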
0 -
I PM'd you a download link to a dump of my nand partition (or at least I think I did; the forum software is confusing), which I created with
nanddump /dev/mtd2 | gzip >nas540.mtd2.gz
I think you should be able to write it with
cat nas540.mtd2.gz | gzip -d | nandwrite /dev/mtd2
If that fails, things get worse, as without any environment the box won't boot at all. So if you hesitate, I think it should be possible to automate your workaround.
If you flash this and the box boots with it, you'll have to change modelid_1 and modelid_2 (the 542 has B403) and perform an update to get the checksums right. Further, it's neat to change the MAC addresses to what they should be, but it's not necessary. Odds are low that your NAS will ever be on the same LAN as mine.
But re-reading your comment, I think you mean that the barebox_env file could be part of an update blob. If that is true, I can extract it. Where did you read that, and in which script? It seems a bit strange to me, as the MAC addresses are also stored in the barebox env, but possibly they are backed up before overwriting.
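Put together with the erase step reported elsewhere in this thread (/sbin/flash_erase /dev/mtd2 0 0 was needed before the write took effect), the whole sequence might look like the sketch below. It is a dry run by default and only prints the commands. The helper names `run` and `restore_env` and the backup file name are made up for illustration; the three commands themselves are the ones quoted in the thread, with "0 0" being, as far as I know, mtd-utils' way of saying "from offset 0 to the end of the partition".

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 would execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD: print or execute one command line (hypothetical helper).
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        sh -c "$*"
    fi
}

# restore_env DEVICE IMAGE: back up, erase, then write the env partition.
restore_env() {
    dev=$1      # e.g. /dev/mtd2
    image=$2    # e.g. nas540.mtd2.gz
    # 1. Keep a copy of whatever is on the partition now.
    run "nanddump $dev | gzip > $(basename "$dev").before.gz" || return 1
    # 2. Erase the whole partition.
    run "/sbin/flash_erase $dev 0 0" || return 1
    # 3. Write the donor image.
    run "cat $image | gzip -d | nandwrite $dev" || return 1
}

restore_env /dev/mtd2 nas540.mtd2.gz
```

Stopping at the first failed step matters here: erasing without being able to write afterwards would leave the box with no environment at all.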
0 -
I took the risk and wrote your mtd2 dump to my NAS. It didn't fix the wrong magic error, but at least the NAS still boots using the USB stick.
0
-
Good afternoon. I currently have the same problem. Could you post the mtd2 dump?
0 -
I PM'ed you a download link.
0 -
Dear Mijzelf, many thanks for the file. Your script has now started working, but it is not clear to me what is meant here about the model. I'm running it on a NAS542.
It currently stops at this message:
+ file_model=B403
+ echo -n 'board_model=(B103), file_model=(B403) ... '
board_model=(B103), file_model=(B403) ... + '[' xB103 == xB403 ']'
+ echo 'NOT equal! /firmware/sbin/mrd_model -s B403'
NOT equal! /firmware/sbin/mrd_model -s B403
+ error_exit
0 -
See above. The u-boot environment contains the board_model. My nand dump is from a NAS540, which has B103. Apparently you have a 542, board_model B403. So to flash 542 firmware you have to change the u-boot environment.
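To see which variables in the donor environment still carry the NAS540 model ID, the printenv output can be filtered by value. A minimal sketch: `vars_with_value` is a hypothetical helper, and the sample lines are copied from the NAS540 dump earlier in this thread. On the box the input would come from /firmware/sbin/info_printenv instead.

```shell
#!/bin/sh
# vars_with_value VALUE: read name=value lines on stdin and print the
# names whose value equals VALUE (hypothetical helper).
vars_with_value() {
    awk -F"=" -v want="$1" '$2 == want {print $1}'
}

# Sample lines copied from the NAS540 environment dump in this thread.
SAMPLE='curr_bootfrom=2
MODEL_ID=B103
fwversion_1=V5.04(AATB.0)
modelid_1=B103
modelid_2=B103'

echo "variables still set to B103:"
printf '%s\n' "$SAMPLE" | vars_with_value B103
```

Each listed name could then be changed with info_setenv in the same form used earlier in the thread (name, then value). Whether all of them have to be set to B403 for the 542 firmware check to pass is not confirmed here; the thread only names modelid_1 and modelid_2 explicitly.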
0