Let me try to give you a different perspective on this. I'm a Site Reliability Engineer. My day job is managing Google's production systems. One of my responsibilities is provisioning and capacity management - that is, my colleagues and I decide how much hardware we need to run our services, and where we need it.
In order to do this, we need to know, calculate, or empirically evaluate all of the following pieces of information:
- How many requests (per second) will we need to process?
- How big will each request be?
- How much does it cost to process a request, as a function of its size?
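To show how those three inputs combine into a provisioning number, here's a toy back-of-the-envelope sketch in Python. It is purely illustrative - every number and name in it is made up for the example, not taken from any real service:

```python
# Toy capacity estimate from the three inputs above.
# Every number here is hypothetical, purely for illustration.

peak_qps = 50_000        # how many requests per second, at peak
avg_request_kb = 8       # how big each request is, on average
cpu_ms_per_kb = 0.5      # measured processing cost per KB of request

# Total CPU-seconds of work arriving per second at peak:
cpu_seconds_per_second = peak_qps * avg_request_kb * cpu_ms_per_kb / 1000  # = 200

# Suppose one machine gives us ~16 usable CPU-seconds per second (16 cores),
# and we keep a safety margin for redundancy and traffic spikes:
usable_cores_per_machine = 16
safety_factor = 1.5

machines_needed = cpu_seconds_per_second / usable_cores_per_machine * safety_factor
print(f"~{machines_needed:.0f} machines at peak")  # -> ~19 with these made-up numbers
```

A real provisioning exercise involves far more resources and constraints than CPU, but this is the shape of the arithmetic.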
For the most part, we evaluate all three of these questions through measurement and load testing - not with Big O. However, sometimes we also need to consider:
- Are all requests the same size, and if not, what happens to really big requests?
If our service is $O(1)$, then the answer is "We process them exactly the same as the small requests." If our service is $O(N)$, then the answer is "Each large request is equivalent* to some number of small requests based on its size, so we just need to increase the overall number of requests that we can process, based on the distribution of request sizes that we observe."

But if our service is slower than $O(N)$, then we start getting concerned about this question. For example, an $O(N^2)$ service might slow to a crawl if you throw a small number of really big requests at it, even though it could easily handle a large number of small requests. So now we need to test the service with large requests, which might not have been strictly necessary for an $O(N)$ service, and explicitly account for the overall distribution of request sizes in a much more careful and rigorous fashion than we would for an $O(N)$ service. Alternatively, we might need to redesign the service, stick a caching layer in front of it, or apply some other mitigation so that we can "pretend" the service is $O(N)$ or faster.
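To put rough numbers on that, here's a small sketch comparing how one big request weighs against a typical small request under linear versus quadratic cost models. The sizes and cost functions are made up purely for illustration:

```python
# Hypothetical comparison: how many "small requests' worth" of work does one
# big request represent, under different complexity classes?

small_n = 1_000      # size of a typical small request
big_n = 1_000_000    # size of an occasional really big request

def equivalent_small_requests(big, small, cost):
    """How many small requests one big request is worth under a given cost model."""
    return cost(big) / cost(small)

def linear(n):       # O(N): cost proportional to size
    return n

def quadratic(n):    # O(N^2): cost grows with the square of size
    return n ** 2

print(equivalent_small_requests(big_n, small_n, linear))     # 1000.0
print(equivalent_small_requests(big_n, small_n, quadratic))  # 1000000.0
```

Under the linear model, the big request just looks like a thousand extra small requests, which the normal QPS math absorbs. Under the quadratic model, a single such request is worth a million small ones - which is exactly when the size distribution stops being a detail and starts driving the capacity plan.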
The coefficients are, frankly, not very useful in this context. We're already measuring that information empirically anyway. You might wonder if we could somehow theoretically predict the results of our load tests and benchmarks, and thereby avoid running them, but in practice this turns out to be very difficult due to the sheer complexity of distributed systems: you would need to accurately model and characterize the performance of a lot of moving parts, most of which were not explicitly designed to provide real-time performance guarantees. Given that empirical load testing is a straightforward and standardized process, it's simply not worth trying to do this theoretically.
* A linear function may have a non-zero y-intercept, which makes this equivalence technically false. But in most cases, the y-intercept is not large enough for this to make a practical difference. If it is large enough to matter, then we account for it as a separate component of the overall cost. Similarly, very few services are actually $O(1)$, because at an absolute minimum you need to unmarshal the request, which is itself an $O(N)$ operation - but in practice, this is very cheap and may not be worth accounting for.
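Spelling out that caveat: if the measured cost of a request of size $N$ fits a linear model with some hypothetical fitted constants $a$ and $b$,

$$\text{cost}(N) \approx aN + b,$$

then a request of size $N$ is equivalent to $\frac{aN + b}{an + b}$ requests of size $n$, which is close to the intuitive $N/n$ only when $b \ll an$, i.e. when the fixed per-request overhead is small even compared to the work for a small request. When it isn't, that overhead is the "separate component of the overall cost" mentioned above.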