
How should the “simultaneous” subscript and superscript operations in the expression below be read?

$$ W^{T}_{(k,\cdot)} \tag 1$$

There are two ways to read the statement:

  1. Take row $k$ of matrix $W$ and then transpose it into a column.
  2. Transpose matrix $W$ and then take row $k$ of the result.

The expression is taken from the article "word2vec Parameter Learning Explained". From the context I assume the first reading is correct, but is there any source (textbook, etc.) that states or confirms this by defining the order in which subscript and superscript operations are applied?
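To make the difference between the two readings concrete, here is a minimal sketch in plain Python. The 3×2 matrix `W`, the helper functions `transpose` and `row`, and the 0-based index `k` are all made up for illustration and are not taken from the article.

```python
# Hypothetical 3x2 matrix W (3 rows, 2 columns); the values are made up.
W = [[1, 2],
     [3, 4],
     [5, 6]]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def row(m, k):
    """Return row k (0-based) of a matrix stored as a list of rows."""
    return list(m[k])


k = 1

# Reading 1: take row k of W, then transpose it into a column.
# The result has 2 entries (one per column of W); stored as a flat list.
reading_1 = row(W, k)

# Reading 2: transpose W first, then take row k of the result.
# That is column k of W, so the result has 3 entries.
reading_2 = row(transpose(W), k)

print(reading_1)  # [3, 4]
print(reading_2)  # [2, 4, 6]
```

For a non-square matrix the two readings do not even have the same length, and with `k = 2` the second reading would fail here because the transpose has only two rows.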

Bill Dubuque
  • 3
    To answer your question: in general it is on the author to be clear so that there is no ambiguity. Here your 1. and 2. both seem to be reasonable interpretations of what the author, Xin Rong, meant [I didn't go through the link], so it was on the author to state explicitly whether he meant your 1. or your 2. – Mike Jun 13 '24 at 23:32
  • 2
    It's a bit like the notation $x/y \cdot z$. Do I mean $x/(y \cdot z)$ or rather $(x/y)\cdot z$? I am sure that somewhere out there is a precise rule for how the operations are ordered, but I think most of us forget it. I should just add the right brackets so that there is no confusion, i.e., write one of $x/(y \cdot z)$ or $(x/y)\cdot z$, depending on which one I meant. – Mike Jun 13 '24 at 23:36
  • @Mike, thank you for the input. So, are you saying that in math there is no rule giving a priority to these operations, such as "superscript first, then subscript", and nothing like operator associativity (https://en.wikipedia.org/wiki/Operator_associativity) as we have in programming languages? For example, the wiki page above says "In order to reflect normal usage, addition, subtraction, multiplication, and division operators are usually left-associative", which resolves the issue in your example, although I understand your point and the "usually" here. – Damir Tenishev Jun 14 '24 at 20:36
  • 2
    It is ambiguous: if we write function application in postfix script form, then $(x)^f_g$ can denote either $((x)f)g$ or $((x)g)f$.

    There are no conventions (widespread operator (sub/sup)script precedence rules) that serve to disambiguate such expressions.

    – Bill Dubuque Jun 21 '24 at 20:15
  • @BillDubuque, thanks. So, if the meaning can be learned from the context (other operands of row/column type), is it still a mistake by the author, or could this be considered a "possible relaxation of the rules" and be allowed? – Damir Tenishev Jun 21 '24 at 21:31
  • 2
    Generally such disambiguation requires further source context (unless it is clear that only one possibility makes sense). – Bill Dubuque Jun 21 '24 at 22:09
  • 1
    @BillDubuque, of course, I understand this. My question is slightly different. (1) Is a mathematician not allowed to use such notation at all (even with such context in the text), (2) is it an allowed bad practice, or (3) is it totally fine as long as the context allows the reader to find the right reading? In a nutshell: must, should, or could the author use parentheses or another way of writing here? – Damir Tenishev Jun 22 '24 at 13:51

1 Answer


To quote different commenters from the deleted duplicate, as the comments are not merged:

In general it is context dependent. If, as you say, context here suggests the first then it is the first for this specific case. If a different context suggests it should be the other then it is the other there. If context is unclear then it is ambiguous. Ideally nothing should have been written in an ambiguous way and it should have been $(W_{(k,\cdot)})^T$ or similar to make it clear.

The author of that article could have clarified the order of the operations with little effort. I tend to stop reading low-quality literature that gives me such unnecessary headaches.

Math notation, no matter how basic, is not equivalent to programming. We are humans and not compilers. That paper is a piece of crap.

Everyone can have their own opinion about a given paper. Let's not get sidetracked by secondary questions. Reading this paragraph before the author's equation $$\mathbf{h}=\mathbf{W}^\top\mathbf{x}=\mathbf{W}^\top_{(k,\,.\,)}$$ again: $\mathbf{h}$ is an $N$-column vector, $\mathbf{W}^\top$ is an $N\times V$-matrix, $\mathbf{x}$ is the $V$-column vector of zeroes and ones that has a one only at its $k$-th element. What does this mean about the order of operations in $\mathbf{W}^\top_{(k,\,.\,)}\,?$
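The dimension argument in the last comment can be checked numerically. The following is a minimal sketch in plain Python, assuming a made-up 3×2 matrix `W` (so $V = 3$, $N = 2$), hypothetical helpers `transpose` and `matvec`, and a 0-based index `k`; none of these names come from the article.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (flat list)."""
    return [sum(a * b for a, b in zip(r, v)) for r in m]


V, N = 3, 2            # assumed sizes: W is V x N
W = [[1, 2],
     [3, 4],
     [5, 6]]           # made-up values
k = 1                  # 0-based position of the single one in x

# x is the V-vector of zeroes with a one only at position k.
x = [1 if i == k else 0 for i in range(V)]

# h = W^T x is an N-vector.
h = matvec(transpose(W), x)

print(h)     # [3, 4]
print(W[k])  # [3, 4]
```

In this sketch `h` has $N$ entries and coincides with row `k` of `W` written as a column, which matches the first reading, $(W_{(k,\cdot)})^T$. Row `k` of the transpose would instead have $V$ entries and could not equal `h` when $V \neq N$.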

(Posting comment answers as community wiki)

peterwhy