First, some short background on NameNodes, fsimage and edit logs, and on how the NameNode works in a Hadoop cluster.
The NameNode stores modifications to the file system as a log appended to a native file system file, edits.
When a NameNode starts up, it reads HDFS state from an image file, fsimage, and then applies edits from the edits log file.
It then writes new HDFS state to the fsimage and starts normal operation with an empty edits file.
FsImage is a file stored on the OS filesystem that contains the complete directory structure (namespace) of HDFS, including the mapping of each file to its blocks. Which DataNode holds which block is not persisted in the FsImage; the NameNode rebuilds that in memory from the block reports sent by the DataNodes.
The EditLog is a transaction log that records the changes in the HDFS file system, that is, any action performed on the HDFS cluster such as the addition of a new block,
replication, deletion, etc. It holds the changes made since the last FsImage was created;
checkpointing then merges those changes into the FsImage file to create a new FsImage file.
When the NameNode starts, the latest FsImage file is loaded into memory,
and the EditLog is then replayed on top of it if the FsImage file does not contain up-to-date information.
The NameNode keeps the metadata in memory in order to serve multiple client requests as fast as possible.
Otherwise, the NameNode would have to read the metadata from disk for every operation, and every operation would pay for extra disk seek time.
So, to summarize:
Persistence of HDFS metadata broadly consists of two categories of files:
fsimage
Contains the complete state of the file system at a point in time. Every file system modification is assigned a unique, monotonically increasing transaction ID. An fsimage file represents the file system state after all modifications up to a specific transaction ID.
edits file
Contains a log that lists each file system change (file creation, deletion or modification) that was made after the most recent fsimage.
Checkpointing
is the process of merging the content of the most recent fsimage with all edits recorded after that fsimage, to create a new fsimage. Checkpointing is triggered automatically by configuration policies or manually by HDFS administration commands.
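To make the summary concrete, here is a minimal toy model in Python of how a snapshot, an edit log and a checkpoint relate to each other. It is only a sketch and not real HDFS code: the class `ToyNameNode`, its methods and the function `should_checkpoint` are made up for illustration. The default thresholds are what I understand to be the stock values of `dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns`; please check them against your own hdfs-site.xml.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy model of fsimage + edit log (hypothetical, not real HDFS code)."""
    image: dict = field(default_factory=dict)   # persisted namespace snapshot
    image_txid: int = 0                         # last txid included in the snapshot
    edits: list = field(default_factory=list)   # (txid, op, path) logged after the snapshot

    def log_edit(self, op: str, path: str) -> int:
        """Append a change to the edit log under the next transaction ID."""
        txid = (self.edits[-1][0] if self.edits else self.image_txid) + 1
        self.edits.append((txid, op, path))
        return txid

    def load_namespace(self) -> dict:
        """Startup: load the snapshot, then replay every edit newer than it."""
        namespace = dict(self.image)
        for txid, op, path in self.edits:
            if txid <= self.image_txid:
                continue  # already contained in the snapshot
            if op == "create":
                namespace[path] = True
            elif op == "delete":
                namespace.pop(path, None)
        return namespace

    def checkpoint(self) -> None:
        """Merge snapshot + edits into a new snapshot and empty the log."""
        new_image = self.load_namespace()
        if self.edits:
            self.image_txid = self.edits[-1][0]
        self.image = new_image
        self.edits = []


def should_checkpoint(seconds_since_last, txns_since_last,
                      period_s=3600, txns_limit=1_000_000):
    """Toy trigger policy: either threshold being reached starts a checkpoint."""
    return seconds_since_last >= period_s or txns_since_last >= txns_limit
```

The point of the model is that `load_namespace` has to walk every edit newer than the snapshot, so the longer no checkpoint happens, the longer a restart takes.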
That covers the background on the NameNode and the edit logs.
Now let's talk about our cluster (it is based on HDP version 2.6.5).
In the folder /var/hadoop/hdfs/namenode/current of each NameNode, we have the following fsimage files:
fsimage_0000000000000031788 100% 104KB 104.1KB/s 00:00
fsimage_0000000000000031788.md5 100% 62 0.1KB/s 00:00
fsimage_0000000000000041641 100% 104KB 104.1KB/s 00:00
fsimage_0000000000000041641.md5 100% 62 0.1KB/s 00:00
and also the edit logs:
.
.
.
-rw-r--r-- 1 hdfs hadoop 328138542 Jan 23 12:37 edits_0000000022056979997-0000000022059239786
-rw-r--r-- 1 hdfs hadoop 301415558 Jan 23 13:07 edits_0000000022059239787-0000000022061345588
-rw-r--r-- 1 hdfs hadoop 311747850 Jan 23 13:37 edits_0000000022061345589-0000000022063490851
-rw-r--r-- 1 hdfs hadoop 12 Jan 23 13:37 seen_txid
-rw-r--r-- 1 hdfs hadoop 330301440 Jan 24 07:10 edits_0000000022063490852-0000000022065448335
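To put a number on how far the edit logs are ahead of the newest fsimage, the file names themselves can be parsed. The sketch below assumes the naming seen in the listings above (`fsimage_<txid>` and `edits_<first txid>-<last txid>`); the function name `checkpoint_lag` is made up for illustration, and it only counts the segments it is given, so elided files are not included.

```python
import re

# Patterns follow the file names shown in the listings above.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")       # skips the .md5 companions
EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")     # finalized segments only


def checkpoint_lag(filenames):
    """Return (newest fsimage txid, last edit txid,
    segments newer than the fsimage, transactions to replay)."""
    image_txids = []
    segments = []
    for name in filenames:
        m = FSIMAGE_RE.match(name)
        if m:
            image_txids.append(int(m.group(1)))
            continue
        m = EDITS_RE.match(name)
        if m:
            segments.append((int(m.group(1)), int(m.group(2))))
    if not image_txids:
        raise ValueError("no fsimage file found")
    latest_image = max(image_txids)
    # A segment still matters if it ends after the newest snapshot.
    pending = [(s, e) for s, e in segments if e > latest_image]
    last_edit = max((e for _, e in segments), default=latest_image)
    # Count only the transactions that are newer than the snapshot.
    to_replay = sum(e - max(s, latest_image + 1) + 1 for s, e in pending)
    return latest_image, last_edit, len(pending), to_replay
```

It could be fed with `os.listdir("/var/hadoop/hdfs/namenode/current")` on a NameNode host; a large number of pending segments would suggest that checkpoints have not been completing for a while.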
Now we start both NameNodes.
In the NameNode logs we see that the NameNode replays each of the edit logs (so if, for example, we have 1965 edit logs, the NameNode replays all of them one by one).
Example:
2020-01-27 06:20:37,306 INFO namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2072759/2282427 transactions completed. (91%)
2020-01-27 06:20:38,307 INFO namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2214991/2282427 transactions completed. (97%)
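For anyone who wants to measure the replay speed from such log lines, here is a small sketch. It assumes the exact line format shown in the two lines above; the function name `replay_rate` is made up for illustration.

```python
import re
from datetime import datetime

# Matches the "replaying edit log" progress lines in the format shown above.
PROGRESS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"replaying edit log: (\d+)/(\d+) transactions completed"
)


def replay_rate(line_a, line_b):
    """Transactions per second between two progress lines of one segment."""
    parsed = []
    for line in (line_a, line_b):
        m = PROGRESS_RE.match(line)
        if m is None:
            raise ValueError("not a replay progress line: " + line)
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        parsed.append((ts, int(m.group(2)), int(m.group(3))))
    (t0, done0, total0), (t1, done1, total1) = parsed
    seconds = (t1 - t0).total_seconds()
    if seconds <= 0 or total0 != total1:
        raise ValueError("lines must be in time order and from the same segment")
    return (done1 - done0) / seconds
```

For the two lines above this works out to roughly 142,000 transactions per second over that one-second window, which is only a single sample and may not be representative of the whole replay.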
So the NameNodes only came up completely, in the active/standby state, after replaying all 1965 edit logs,
and this took almost 17 hours.
After restarting both NameNodes, we expected the fsimage files to be up to date.
For example:
-rw-r--r-- 1 hdfs hadoop 445716 Jan 31 08:11 fsimage_0000000000000132222
-rw-r--r-- 1 hdfs hadoop 62 Jan 31 08:11 fsimage_0000000000000132222.md5
But in our case, after both NameNodes restarted, this is what we got (the fsimage was not updated; its timestamp is still Jan 03):
-rw-r--r-- 1 hdfs hadoop 445716 Jan 03 07:11 fsimage_0000000000000132222
-rw-r--r-- 1 hdfs hadoop 62 Jan 03 07:11 fsimage_0000000000000132222.md5
So we can see that the fsimage was not updated, even though both NameNodes started completely (after 17 hours) and are in the active/standby state.
Any suggestions as to why the fsimage was not updated to the current time?