Week 3 - BALT 4363 - Handling and Cleaning Data with Python Libraries

As I continue to learn how Python works, I am again struck by the functional similarities between this programming language and Microsoft Excel. Depending on the situation, either tool can be leveraged to maximize both efficiency and meaningful outcomes. When it comes to automation, I personally don’t believe that Python produces quicker results than Excel for data cleaning or other basic data tasks. In these cases, writing working Python code most likely takes about as long as building a formula or table in Excel. In the example shown in the picture above, the code needed to compute the mean global sales for each video game genre took about as much time to write, if not more, as producing the same results with a PivotTable in Excel.
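For illustration, a minimal sketch of that kind of calculation in pandas might look like the following. The sample rows and the column names Genre and Global_Sales are assumptions made up for this example, not the actual dataset from the picture.

```python
import pandas as pd

# Hypothetical sample data; the real dataset and its column names may differ.
games = pd.DataFrame({
    "Genre": ["Action", "Sports", "Action", "Puzzle", "Sports"],
    "Global_Sales": [2.0, 4.0, 6.0, 1.5, 1.0],
})

# Group the rows by genre and average the sales within each group,
# roughly what a PivotTable does with Genre as rows and Average of Global_Sales as values.
mean_sales_by_genre = (
    games.groupby("Genre")["Global_Sales"]
    .mean()
    .sort_values(ascending=False)
)

print(mean_sales_by_genre)
```

The grouping itself is a single line, so most of the time goes into loading the data and getting the column names right, much like setting up the PivotTable fields in Excel.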



Despite the advantage Excel offers for simple functions and smaller datasets, Python maintains the upper hand, in my opinion, when large amounts of data have to be handled. This is especially true when complex analyses are needed. Python’s scalability and advanced analytics are better suited to large, complex datasets and help keep the work efficient when deriving valuable insights from that data. Excel, by contrast, is limited in how much data it can hold (a single worksheet tops out at 1,048,576 rows) and in how well it handles complex data structures.

It was also interesting to learn more about Python’s use of libraries such as pandas and NumPy. These libraries make pre-written code available for a wide range of coding tasks. Consequently, using them reduces the time spent retyping the same lines of code to perform common tasks. That, in turn, frees up more time for the complex work needed to produce a desired outcome.
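As a rough sketch of what that pre-written code saves, here is how a few common cleaning steps might look with pandas and NumPy. The table, its column names, and the choice to fill a missing value with the column mean are all assumptions for the sake of the example.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data: stray spaces, a repeated row, and a missing value.
raw = pd.DataFrame({
    "Name": [" Game A", "Game B ", "Game B", "Game C"],
    "Global_Sales": [1.2, 2.0, 2.0, np.nan],
})

cleaned = raw.copy()

# Trim leading and trailing whitespace from the text column.
cleaned["Name"] = cleaned["Name"].str.strip()

# Drop rows that are exact duplicates once the names are trimmed.
cleaned = cleaned.drop_duplicates()

# Fill the missing sales figure with the mean of the remaining values
# (one possible choice; dropping the row would be another).
cleaned["Global_Sales"] = cleaned["Global_Sales"].fillna(cleaned["Global_Sales"].mean())

print(cleaned)
```

Each step is one library call, where writing the same logic by hand would mean looping over every row and checking each value.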
