One of the best solutions is likely based on a linear programming relaxation or direct integer programming. For the latter, the branching and backtracking are implicit, and you won't have to manage them yourself.
I have seen it solved in two ways using this technique. We can slightly improve your bounding algorithm as well.
The textbook method
Using binary variables $x_{ij}$ representing $f(i) = j$, you can define continuous variables $f(i) = \sum_{j=1}^n j x_{ij}$.
Add constraints $\sum_i x_{ij} = 1$ (each position $j$ receives exactly one node) and $\sum_j x_{ij} = 1$ (each node $i$ receives exactly one position).
For each edge $e = (i, j)$, the cost $c_e$ is bounded by two constraints: $c_e \geq f(i) - f(j)$ and $c_e \geq f(j) - f(i)$. Together they give $c_e \geq |f(i) - f(j)|$, and you want to minimize $\sum_e c_e$.
You can give this problem to an integer programming solver, which will happily do the backtracking for you and more. Alternatively, you can solve the linear programming relaxation yourself at each node of the search, but that is only worthwhile as a learning exercise, since the solver already does it internally.
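Collected into a single model, the pieces above read as follows. This is only a restatement of the constraints already given; the names $E$ (the edge set) and $n$ (the number of nodes) are assumed here for illustration.

$$
\begin{aligned}
\min \quad & \sum_{e \in E} c_e \\
\text{s.t.} \quad & \sum_{i=1}^n x_{ij} = 1 && \text{for every position } j \\
& \sum_{j=1}^n x_{ij} = 1 && \text{for every node } i \\
& f(i) = \sum_{j=1}^n j \, x_{ij} && \text{for every node } i \\
& c_e \geq f(i) - f(j), \quad c_e \geq f(j) - f(i) && \text{for every edge } e = (i, j) \in E \\
& x_{ij} \in \{0, 1\}
\end{aligned}
$$

Dropping the last line to $0 \leq x_{ij} \leq 1$ gives the linear programming relaxation.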
Another relaxation
Instead of using the positions as binary variables, you can use binary variables representing $f(i) < f(j)$, i.e. node $i$ goes before node $j$. This is better suited if you want to do the branching yourself and don't specify those variables explicitly.
With this approach, you can sometimes have a faster problem to solve at each node: here it can be solved as a simpler minimum-cost flow problem, as shown by this paper. I wouldn't recommend it unless you are willing to invest much time and research into your problem.
Other techniques
For branch-and-bound, any lower bound on your cost function will do.
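To make that concrete, here is a minimal branch-and-bound sketch in which the bound is a pluggable function. Everything in it is an assumption for illustration: the names `branch_and_bound` and `trivial_bound` are hypothetical, nodes are numbered `0..n-1`, labels are `1..n`, the graph has no self-loops, and the default bound is a placeholder rather than your own bounding algorithm.

```python
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]
Bound = Callable[[List[Edge], Dict[int, int]], int]


def trivial_bound(edges: List[Edge], pos: Dict[int, int]) -> int:
    """Exact cost for fully placed edges, 1 for every other edge.

    Valid because two distinct nodes always get labels at least 1 apart
    (assumes no self-loops).
    """
    total = 0
    for u, v in edges:
        if u in pos and v in pos:
            total += abs(pos[u] - pos[v])
        else:
            total += 1
    return total


def branch_and_bound(n: int, edges: List[Edge],
                     lower_bound: Bound = trivial_bound):
    """Place nodes 0..n-1 on labels 1..n, minimizing sum of |f(u) - f(v)|."""
    best_cost = float("inf")
    best: Dict[int, int] = {}
    pos: Dict[int, int] = {}
    free = set(range(1, n + 1))

    def recurse(node: int) -> None:
        nonlocal best_cost, best
        if node == n:
            # Complete assignment: evaluate the true cost.
            cost = sum(abs(pos[u] - pos[v]) for u, v in edges)
            if cost < best_cost:
                best_cost, best = cost, dict(pos)
            return
        for label in sorted(free):  # sorted() copies, so mutating free is safe
            pos[node] = label
            free.remove(label)
            # Explore only if this partial assignment can still beat the best.
            if lower_bound(edges, pos) < best_cost:
                recurse(node + 1)
            free.add(label)
            del pos[node]

    recurse(0)
    return best_cost, best
```

Any function that never overestimates the cost of the best completion can be passed as `lower_bound`; a tighter one only prunes more of the tree.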
For small problems your bounding approach is perfectly fine.
You could make it tighter: for each edge with an unplaced node, pick the best possible free label to estimate its cost. Several edges may claim the same label for different nodes, so the estimate is optimistic, but it remains a valid lower bound and is better than estimating their cost as 0.
There are many possible variations on this scheme: pick the best free label for each unlabeled node (irrespective of overlaps) or consider overlaps only inside small groups of nodes.
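One of these variations could be sketched as below. It is a hedged illustration, not a tuned implementation: the function name `tighter_bound` and its signature are hypothetical, labels are assumed to be `1..n`, `pos` maps already placed nodes to their labels, and the graph is assumed to have no self-loops. Each edge is estimated independently, so overlaps between edges are ignored.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def tighter_bound(edges: List[Edge], pos: Dict[int, int], n: int) -> int:
    """Lower bound on the total cost of any completion of `pos`."""
    used = set(pos.values())
    free = [label for label in range(1, n + 1) if label not in used]
    # Smallest distance between two distinct free labels (free is sorted).
    min_gap = min((b - a for a, b in zip(free, free[1:])), default=0)

    total = 0
    for u, v in edges:
        pu, pv = pos.get(u), pos.get(v)
        if pu is not None and pv is not None:
            # Both endpoints placed: the cost is known exactly.
            total += abs(pu - pv)
        elif pu is None and pv is None:
            # Neither placed: they will take two distinct free labels.
            total += min_gap
        else:
            # One endpoint placed: the other takes its closest free label.
            p = pu if pu is not None else pv
            total += min(abs(p - q) for q in free)
    return total
```

For example, with `n = 5`, nodes 0 and 1 placed on labels 1 and 2, and the single edge `(0, 2)`, the closest free label to 1 is 3, so the bound is 2 instead of the 0 you get by ignoring the edge.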