3

I know that in order to differentiate $f(x) = x^x$, you have to take the log of both sides first and then differentiate to get $f'(x) = x^x(\ln(x)+1)$. I know that if you take the derivative directly using the chain rule, you get the wrong answer. Why is this? I assume it has something to do with the fact that the definition of the derivative has $h\rightarrow 0$ and we would potentially have $(x+h)^{x+h} \rightarrow 0^0$ somehow (which is an indeterminate form), but I'm not immediately seeing this.

EDIT: So as has been pointed out, this question has been answered elsewhere. Additionally, I did not provide enough information. Specifically: how am I applying the chain rule to get the wrong answer? (And I suppose 'What wrong answer am I getting?')

As it turns out, I couldn't decide which function is the "outside" function and which function is the "inside" function. Initially, I applied the power rule first to get $$f'(x) = x\cdot x^{x-1} \cdot (x^x\cdot \ln(x))= x^{2x}\ln(x),$$ which we've seen is wrong.

Bark Jr.
  • 619
  • 10
    The Chain Rule tells you that $(g \circ h)'(x) = g'(h(x)) h'(x)$. What do you propose to take for $g$ and $h$ here? – Travis Willse May 27 '16 at 15:36
  • 5
    You can use the multivariable chain rule with $f(x,y)=x^y$, $x(t)=t$, $y(t)=t$ to differentiate $f(x(t),y(t))=t^t$. –  May 27 '16 at 15:42
  • 10
    "if you take the derivative directly using the chain rule, you get the wrong answer. Why is this?" The problem is that you're applying the chain rule incorrectly. Exactly what the error is we can't say without seeing your work - none of us has any idea what you mean by using the chain rule directly here... (Travis already said this, thought it wouldn't hurt to be a little more explicit) – David C. Ullrich May 27 '16 at 15:46
  • @Travis: You have identified the exact problem of OP here. Unfortunately the answers to duplicated question do not focus on this problem and instead rely more on the multivariable chain rule. – Paramanand Singh May 28 '16 at 03:22
  • 1
    (to differentiate) – reuns May 28 '16 at 03:28
  • It is still not clear how you found your $f'(x)$. The $x \cdot x^{x-1}$ part is clearly trying to apply the power rule. This is not fair because the power rule applies to $x^y$ for fixed $y$, but that is the heart of your question. Why do you have the rest of the factors there? We need to understand your thinking to explain where the problem is. – Ross Millikan May 28 '16 at 04:15
  • I agree with @ParamanandSingh. I am surprised how complex many have made finding the derivative of $x^x$ to be. There is no need to use the multivariable chain rule. Just write down the meaning of $x^x$ and use the usual single variable chain rule. –  May 28 '16 at 07:22

2 Answers

7

'Why can we not just use the chain rule to derive $f(x)=x^x$?' You can.

'I know that if you take the derivative directly using the chain rule, you get the wrong answer.' This is not true.

You do not need to apply any 'tricks' by taking $\log$ of both sides. I suspect you are trying to take the derivative of $x^x$ without knowing the meaning of the function. $f(x) = x^x$ is just shorthand notation for $f(x) = e^{x\log x}$ (and note this function is only defined for $x > 0$). To emphasize, $e^{x\log (x)}$ is the definition of the function that is usually written as $x^x$. Once you write down the correct definition of $x^x$, finding the derivative with the chain rule is straightforward:

$$ f'(x) = e^{x\ln x}(1+\ln (x)) = x^x(1 + \ln (x)). $$
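As a quick numerical sanity check (a sketch added for illustration, not part of the original argument), one can compare this formula against a central finite difference, alongside the expression $x^{2x}\ln(x)$ from the question. The function names, sample points, and step size `h` below are illustrative assumptions.

```python
import math

def f(x):
    # x^x written via its definition e^{x log x}; valid for x > 0
    return math.exp(x * math.log(x))

def f_prime(x):
    # Derivative from the chain rule applied to e^{x log x}: x^x (1 + ln x)
    return f(x) * (1.0 + math.log(x))

def wrong_guess(x):
    # The expression reported in the question: x^{2x} ln x
    return x ** (2 * x) * math.log(x)

def central_difference(func, x, h=1e-6):
    # Symmetric finite-difference approximation of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        numeric = central_difference(f, x)
        print(f"x={x}: numeric={numeric:.6f}, "
              f"formula={f_prime(x):.6f}, wrong guess={wrong_guess(x):.6f}")
```

At each sample point the finite difference should agree closely with $x^x(1+\ln(x))$ and not with $x^{2x}\ln(x)$.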

AsukaMinato
  • 1,007
  • 3
    Perhaps I misunderstand OP's intent, but this doesn't seem to address the question as written: Writing $x^x = e^{x \ln x}$ is tantamount to taking "the log of both sides first", so this does not answer why one must apply taking-logarithms-first or some other technique (like Rahul's suggestion in the comments), or why taking "the derivative directly using the chain rule [gives] the wrong answer". – Travis Willse May 27 '16 at 15:46
  • 4
    Writing $x^x = e^{x \ln x}$ is just writing down the definition of the function $x^x$. –  May 27 '16 at 15:53
  • 3
    Regardless, this simply does not address the question, which was why can't we apply the chain rule "directly". Showing how to differentiate the function doesn't say what's wrong with something else. (Not that it's possible to answer the question without seeing the wrong calculation the OP has in mind...) – David C. Ullrich May 27 '16 at 16:28
  • It actually does address the question. It shows there are no tricks of any kind needed to differentiate and no problem with using the chain rule. –  May 27 '16 at 16:30
  • I have edited my answer to make it more clear how it addresses the OP. –  May 27 '16 at 16:34
  • And to add a little more color: The notation $x^x$ is simply shorthand for $e^{x \ln x}$. It's not that you "need to apply logs first", as though doing so is a kind of algebraic technique to help you solve the problem; you should treat $x^x$ and $e^{x \ln x}$ as identical. – PeterJL May 27 '16 at 16:44
  • @ZacharySelk The OP's question was 'Why can we not just use the chain rule to derive $f(x)=x^x$?' I answered clearly, showing that in fact you can use the chain rule, once you know the meaning of $x^x$. Since the OP did not know the definition of the function, of course the chain rule did not work! –  May 28 '16 at 06:41
  • I have edited the answer again to spell it out more clearly for those who do not understand why my answer is answering the OP. What you do not seem to understand is that the OP does not even know the meaning of $x^x$. By explaining the definition of the function, highlighting the fact that he did not know the definition in the first place, is answering his question why the method did not work. –  May 28 '16 at 06:58
  • @ZacharySelk 'The answer shouldn't just be a correct way, but to explain why OPs way is incorrect.' I thought the OP was bright enough to understand that the missing piece was what does $x^x$ even mean in the first place. By giving the definition, I thought it was clear enough for the OP to understand why his method did not work. I have spelled it out more explicitly now. –  May 28 '16 at 07:17
  • 1
    Reading various comments here I think there is only one main objection to your answer. It is that you don't really know what OP means by "directly using chain rule". Unless you know how OP is trying to apply chain rule it is difficult to say where he went wrong. The duplicated question has OP's procedure given clearly and it is very easy to say that he is applying chain rule incorrectly. Maybe you can just add that the only way to think of $x^{x}$ as a composite function $g(h(x))$ is to build this function from existing functions $\exp$ and $\log$. Contd.. – Paramanand Singh May 28 '16 at 07:38
  • 1
    and then we easily see that $x^{x}$ can be written as $\exp(x\log x)$ and our job is done. Also I agree that the simplest way to define symbol $x^{x}$ is to write it as $\exp(x\log x)$, but there are other ways to define it also (which I think are hard enough compared to $\exp(x\log x)$). I believe that is also another reason for the disagreement here with your answer. – Paramanand Singh May 28 '16 at 07:41
  • I reversed my downvote when you explained your post more. I think it is helpful now. –  May 28 '16 at 08:06
3

Think of the function $y=x^2$. We know that the derivative is $2x$.

However, $y=x^2$ can also be written as $y=x\cdot x$. If you choose a constant $a$ to be equal to $x$, then $y=ax$. The derivative would then be $a$, that is, $x$.

That is clearly incorrect. We know it to be $2x$. The error you are making is somewhat similar to the one I just demonstrated. You are treating one of the $x$s as a constant, but it is a variable.
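To make the analogy concrete (a sketch added for illustration, assuming $x > 0$ and using the product rule and the multivariable chain rule mentioned in the comments): letting each $x$ vary in turn while the other is held fixed, and then adding the two contributions, recovers the correct derivatives.

$$\frac{d}{dx}(x\cdot x) = \underbrace{1\cdot x}_{\text{first } x \text{ varies}} + \underbrace{x\cdot 1}_{\text{second } x \text{ varies}} = 2x$$

$$\frac{d}{dx}\,x^x = \underbrace{x\cdot x^{x-1}}_{\text{base varies}} + \underbrace{x^x\ln x}_{\text{exponent varies}} = x^x\left(1+\ln x\right)$$

Holding one $x$ constant and stopping there captures only one of the two terms in each line.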

EDIT: I'm really sorry if that confused you more. I just reread it and couldn't tell what the heck I was getting at when I wrote that.

Basically, as people said in the comments, you cannot use the chain rule while the function is still in the form $x^x$. You must rewrite it as $e^{x\ln{x}}$.

To rewrite what Travis said in my own words, the chain rule states that if $h(x)=f(g(x))$, then $h'(x)=f'(g(x))\cdot g'(x)$. The problem is that with $x^x$, you can't really define anything as $f$ or $g$. (The best I could come up with is $f(x)=x^x$ and $g(x)=x$, but that doesn't get me anywhere.)

However, if you rewrite it as $e^{x\ln{x}}$, you can say that $f(x)=e^x$ and $g(x)=x\ln{x}$. Then you can use the chain rule.

$$h(x)=f(g(x))=e^{x\ln{x}}$$

$$h'(x)=f'(g(x))\cdot g'(x) = e^{x\ln{x}}\cdot\left(x\cdot\frac1x+1\cdot\ln{x}\right)$$

$$=x^x\left(1+\ln{x}\right)$$




I really feel like I'm just copying what other people said, but I feel like that is the only way to make my answer actually helpful. If I'm breaking rules or etiquette, please tell me.

Polygon
  • 2,343