
I have an SQL database, and there are two ways I can connect to it:

  1. Using Azure Data Studio and running SQL commands there (either in a .sql script file or a .ipynb notebook). The pro is that you get SQL syntax highlighting, error checking, auto-complete, etc. Also, the work seems to happen in the cloud, so heavy operations don't run on my local computer. The con is that it feels a bit slower and more cumbersome, probably due to network latency.

  2. Using my local Python, connecting to the database, doing the manipulations in Python (e.g. in a .ipynb notebook in JupyterLab), and saving the results back to the server (see the sketch after this list). The pro is that it feels more responsive since it runs locally. The cons are that there is no syntax highlighting or auto-complete for column names, etc., and since everything runs locally, heavy operations make my CPU work hard. If the data is very big this could also create a memory problem, though maybe there's a way to push more of the work to the server.

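To make option 2 concrete, here is a rough sketch of what I mean, assuming SQL Server accessed via SQLAlchemy/pyodbc and pandas; the connection string, `some_table`, and the `value` column are just placeholders. Reading in chunks is one way I imagine dealing with the memory concern:

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; the driver name depends on what is installed locally.
engine = create_engine(
    "mssql+pyodbc://user:password@myserver/mydb?driver=ODBC+Driver+17+for+SQL+Server"
)

# Pull the data in chunks so a large table does not have to fit in memory at once,
# transform each chunk locally, and write the results back to the server.
for i, chunk in enumerate(
    pd.read_sql("SELECT * FROM some_table", engine, chunksize=50_000)
):
    chunk["flag"] = chunk["value"] > 0  # example transformation done in pandas
    chunk.to_sql(
        "some_table_processed",
        engine,
        if_exists="replace" if i == 0 else "append",
        index=False,
    )
```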
I was wondering which one I should use. Are there other considerations I might have overlooked? Right now, to keep things more orderly, I thought it might be better to do all the feature engineering in SQL and then use Python for analysis and maybe some ad-hoc manipulations, along the lines of the sketch below. Is this reasonable?
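The split I have in mind looks roughly like this: the heavy aggregation runs on the server in SQL, and only the reduced result comes into pandas for analysis. The `orders` table and its columns are made up for illustration:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "mssql+pyodbc://user:password@myserver/mydb?driver=ODBC+Driver+17+for+SQL+Server"
)

# Feature engineering expressed in SQL; the GROUP BY is executed by the database engine.
feature_query = """
SELECT customer_id,
       COUNT(*)    AS n_orders,
       SUM(amount) AS total_spent,
       AVG(amount) AS avg_order_value
FROM orders
GROUP BY customer_id
"""

# Pandas only receives the already-aggregated rows, which are then analysed locally.
features = pd.read_sql(feature_query, engine)
print(features.describe())
```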
