Levenshtein distance and dynamic time warping

Question

I am not sure how to draw a parallel between the Wagner–Fischer algorithm and the DTW algorithm. In both cases we want to find the distance for each index combination (i, j).

In Wagner–Fischer, we initialise the boundary with the number of insertions needed to build each prefix of one string from the empty string.

let wagnerFischer (s: string) (t: string) =
   let m, n = s.Length, t.Length
   let d = Array2D.create (m + 1) (n + 1) 0

   for i = 0 to m do d.[i, 0] <- i
   for j = 0 to n do d.[0, j] <- j    

   for j = 1 to n do
       for i = 1 to m do
          d.[i, j] <- List.min [
                           d.[i-1, j  ] + 1; 
                           d.[i  , j-1] + 1; 
                           d.[i-1, j-1] + if s.[i-1] = t.[j-1] then 0 else 1; ]
   printfn "edit matrix \n %A" d 
   d.[m,n]

In DTW, we initialise the boundary to +infinity because we don't want to 'skip' any element of either sequence; every item must always be matched with another item.

What I don't see is what changes between DTW and the WF algorithm that prevents us from updating the distance in a homogeneous way. In DTW we systematically add the cost, whereas in the WF algorithm the update is non-homogeneous: each of the three cases adds a different term.

I understand both algorithms, but I don't see the connection between these differences in the cost-function update. Any ideas for understanding the difference intuitively?

let sequencebacktrack (s: 'a seq) (t:'a seq) (cost:'a->'a->double) (boundary:int->double)  =
   let m, n = s |> Seq.length, t |> Seq.length
   let d = Array2D.create (m + 1) (n + 1) 0.

   for i = 0 to m do d.[i, 0] <- boundary(i)
   for j = 0 to n do d.[0, j] <- boundary(j)

   t |> Seq.iteri( fun j tj ->
            s |> Seq.iteri( fun i si -> 
                        d.[1+i, 1+j] <- cost tj si + List.min [d.[1+i-1, 1+j  ]; 
                                                               d.[1+i  , 1+j-1]; 
                                                               d.[1+i-1, 1+j-1]; ] ))
   printfn "edit matrix \n %A" d 
   d.[m,n]
// does not work
let wagnerFischer2 (s: string) (t: string) =
   sequencebacktrack s t (fun a b -> if a = b then 0. else 1.) (id >> double)

let b = wagnerFischer2 "ll" "la"

score 5 · Answer 1 · answered Nov 06 '12 at 12:43

There is in fact a whole family of related algorithms. In a modern context, I believe "time warping" would be called sequence alignment. Depending on whether you want to match complete strings or find optimal substrings, you get Needleman–Wunsch or Smith–Waterman, respectively.

In your latter algorithm the costs seem to vary: one can assign different costs to deleting and inserting a character, as well as to changing one character into another. Your first algorithm seems to fix these costs to $1$ for all three possible changes.
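
To make the comparison concrete, here is a minimal sketch (not a definitive implementation) of a single recurrence in which the three moves are priced separately. The names `align`, `stepCost`, `diagCost`, `levenshtein` and `dtw` are hypothetical and do not come from the question. Under these assumptions, Levenshtein distance is the case where a vertical or horizontal move always costs 1 and only the diagonal move looks at the characters, while DTW is the case where all three moves pay the same local distance and the boundary is +infinity.

```fsharp
// Sketch only: `align`, `stepCost` and `diagCost` are illustrative names.
let align (s: 'a[]) (t: 'a[])
          (stepCost: 'a -> 'a -> float)   // price of a vertical or horizontal move
          (diagCost: 'a -> 'a -> float)   // price of a diagonal move
          (boundary: int -> float) : float =
    let m, n = s.Length, t.Length
    // d.[0, 0] stays at 0.0; the rest of the first row/column comes from `boundary`.
    let d = Array2D.create (m + 1) (n + 1) 0.0
    for i in 1 .. m do d.[i, 0] <- boundary i
    for j in 1 .. n do d.[0, j] <- boundary j
    for i in 1 .. m do
        for j in 1 .. n do
            let a = s.[i - 1]
            let b = t.[j - 1]
            let up = d.[i - 1, j] + stepCost a b
            let left = d.[i, j - 1] + stepCost a b
            let diag = d.[i - 1, j - 1] + diagCost a b
            d.[i, j] <- List.min [ up; left; diag ]
    d.[m, n]

// Levenshtein: insert/delete always cost 1, substitution costs 0 or 1,
// and the boundary counts the insertions needed from the empty string.
let levenshtein (s: string) (t: string) =
    align (s.ToCharArray()) (t.ToCharArray())
          (fun _ _ -> 1.0)
          (fun a b -> if a = b then 0.0 else 1.0)
          (fun k -> float k)

// DTW: every move pays the same local distance, and the +infinity
// boundary forbids leaving any element unmatched.
let dtw (s: float[]) (t: float[]) =
    let dist (a: float) (b: float) = abs (a - b)
    align s t dist dist (fun _ -> infinity)

printfn "%A" (levenshtein "ll" "la")                        // 1.0
printfn "%A" (dtw [| 1.0; 2.0; 3.0 |] [| 1.0; 2.0; 2.0; 3.0 |])  // 0.0
```

Seen this way, the "homogeneous" DTW update is just the special case `stepCost = diagCost`. Passing the 0/1 character cost for all three moves, as `wagnerFischer2` does, makes an insertion or deletion free whenever the two current characters happen to be equal, which is not what edit distance charges.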
