I can show you a solution that I suspect is $O(n \log n)$, but I haven't proven the runtime. Edit: Following a linked solution, I amended this solution to be simpler and run in $O(n \log n)$.
So the issue is that if you take a greedy approach, you might choose a 'bad' segment that blocks a favorable choice later. That is, choices you make at the beginning affect what choices you have available later on. This suggests a dynamic programming approach.
The first step is to sort the segments by their start times. Then, you need to define exactly what you're going to calculate. We'll define $numsegs(i)$ as the maximum number of segments possible when segment $i$ is included and has the earliest start time of all included segments.
Then your final answer is going to be the maximum value of $numsegs(i)$ over all possible values of $i$. Now you need a formula for $numsegs(i)$.
Think about how to calculate $numsegs(0)$, the maximum segment count when you include the first segment. If we include the first segment, then we can only include other segments that begin after the first segment ends. So we need to find the index of the first segment that is compatible with segment 0. Because the segments are sorted by start time, we can do this in $O(\log n)$ time with a simple binary search - let that index be $k$. Now, we can say that $numsegs(0)$ is 1 (for segment 0) plus the maximum of $numsegs(k), numsegs(k+1), \dots, numsegs(n-1)$. (There's an edge case to consider as well. If segment $i$ doesn't have any compatible segments after it, then $numsegs(i)$ is just 1.)
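As a minimal sketch of that binary search, here is one way to find $k$ in Python using the standard `bisect` module. The helper name `first_compatible` and the sample segments are made up for illustration, and I'm assuming "compatible" means the next segment starts strictly after the current one ends (use `bisect_left` instead if touching endpoints are allowed).

```python
from bisect import bisect_right

def first_compatible(starts, end):
    """Return the index of the first segment whose start is strictly
    greater than `end`, or len(starts) if there is no such segment.
    `starts` must be sorted in ascending order."""
    return bisect_right(starts, end)

# Hypothetical example data: (start, end) pairs, sorted by start time.
segments = sorted([(1, 3), (2, 5), (4, 6), (6, 8), (7, 9)])
starts = [s for s, _ in segments]

# First segment compatible with segment 0, which ends at 3.
k = first_compatible(starts, segments[0][1])
```

Returning `len(starts)` when nothing is compatible makes the edge case easy to detect: if `k == n`, then $numsegs(i)$ is just 1.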
You can compute $numsegs(i)$ in a very similar way. Note that $numsegs(i)$ depends only on $numsegs$ values at indices greater than $i$. Therefore we can compute $numsegs(n-1)$ first, then $numsegs(n-2)$, and so on, caching the values as we go, and we'll never have to recalculate any $numsegs$ values.
As described so far, this solution is actually $O(n^2)$. To compute $numsegs(i)$, you have to find the maximum among potentially $n-i-1$ following $numsegs$ values. How then to get the total runtime down to $O(n \log n)$?
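The back-to-front computation can be sketched as follows. This is only an illustration under my own assumptions: segments are `(start, end)` tuples, compatibility means starting strictly after the previous end, and the function name `max_segments_quadratic` is made up. It takes the maximum over the remaining values directly, so this version is quadratic in the worst case.

```python
from bisect import bisect_right

def max_segments_quadratic(segments):
    """Maximum number of pairwise compatible segments, computing
    numsegs(i) from i = n-1 down to 0."""
    segs = sorted(segments)            # sort by start time
    n = len(segs)
    starts = [s for s, _ in segs]
    numsegs = [1] * n                  # edge case: no compatible successor -> 1
    for i in range(n - 1, -1, -1):
        # Index of the first segment starting strictly after segment i ends.
        k = bisect_right(starts, segs[i][1])
        if k < n:
            # Scanning numsegs[k:] is the O(n) step per segment.
            numsegs[i] = 1 + max(numsegs[k:])
    return max(numsegs, default=0)
```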
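Edit: As a linked solution pointed out, we can compute the maximum quickly using caching. Suppose we want to know the maximum $numsegs$ value starting from index $k$. We can just compare $numsegs(k)$ to the maximum starting from $numsegs(k+1)$, which we cached in a previous step. Each $numsegs(i)$ then costs one $O(\log n)$ binary search plus $O(1)$ work for the maximum, so together with the initial sort the total is $O(n \log n)$.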
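Here is a hedged sketch of the cached-maximum idea. The array name `best_from` and the function name `max_segments` are my own, and I'm again assuming `(start, end)` tuples where a segment is compatible only if it starts strictly after the previous one ends.

```python
from bisect import bisect_right

def max_segments(segments):
    """Maximum number of pairwise compatible segments in O(n log n)."""
    segs = sorted(segments)            # O(n log n): sort by start time
    n = len(segs)
    starts = [s for s, _ in segs]
    # best_from[i] caches max(numsegs(i), ..., numsegs(n-1)).
    # best_from[n] = 0 is a sentinel for "no compatible segment left".
    best_from = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        # First segment starting strictly after segment i ends: O(log n).
        k = bisect_right(starts, segs[i][1])
        numsegs_i = 1 + best_from[k]
        # Compare numsegs(i) with the cached maximum from i+1 onward: O(1).
        best_from[i] = max(numsegs_i, best_from[i + 1])
    return best_from[0]

print(max_segments([(1, 3), (2, 5), (4, 6), (6, 8), (7, 9)]))  # prints 3
```

The sentinel entry folds the edge case into the general formula: when no segment is compatible, `k == n` and `1 + best_from[n]` is just 1.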